Key Takeaways
Powered by lumidawealth.com
- While inference cost per token continues to fall sharply (≈10x/year for mid‑tier models, up to 900x/year for the most capable ones), overall AI usage costs are rising because newer models perform complex “reasoning” that consumes orders of magnitude more tokens.
- Basic chatbot Q&As may use a few hundred tokens, but advanced tasks like legal analysis, multi‑step agents, or complex coding can consume 100,000 to 1 million+ tokens, pushing application costs significantly higher.
- AI‑driven startups (e.g., Notion, Replit, Cursor) report margin compression; some have adopted “effort‑based pricing,” creating pushback among users but preserving enterprise‑level margins (≈80–90%).
- Big Tech (Google, Microsoft, Meta, OpenAI, Anthropic) can subsidize AI offerings with >$100B annual infra spend, outcompeting smaller firms and even offering tools for free (e.g., Google’s code‑assistant), creating a squeeze for startups.
- Strategic tension: will end‑users embrace “cheaper, dumber” AI for mass adoption, or will the premium for “smarter” agents force consolidation and tilt advantage toward giants with proprietary compute and capital?
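The token arithmetic above can be sketched numerically. The prices and token counts below are purely hypothetical (not actual vendor rates); the point is that a 10x per‑token price decline is swamped when tokens per task grow 1,000x:

```python
# Illustrative sketch with hypothetical numbers: per-token prices fall,
# but task-level cost rises because token consumption per task explodes.

def task_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of a task consuming `tokens` at a given $/1M-token rate."""
    return tokens / 1_000_000 * price_per_million

# Year 1: simple chatbot Q&A at a hypothetical $10 per 1M tokens
cost_chat = task_cost(500, 10.0)        # a few hundred tokens -> $0.005

# Year 2: price falls 10x to $1 per 1M tokens, but a multi-step
# agent or coding task now burns 500,000 tokens
cost_agent = task_cost(500_000, 1.0)    # $0.50 -- 100x the Year-1 task

print(f"Year-1 chat Q&A:   ${cost_chat:.4f}")
print(f"Year-2 agent task: ${cost_agent:.4f}")
```

Even though the per‑token price dropped tenfold, the application‑level cost of the harder task is two orders of magnitude higher, which is the margin squeeze the bullets describe.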
What Happened?
Recent WSJ analysis highlights that, despite steep cost declines on a per‑token basis, application‑level AI costs are rising due to the explosion of tokens consumed per complex task. Startups reliant on embedding cutting‑edge AI into SaaS products face shrinking margins, leading to pricing changes and user backlash. At the same time, well‑capitalized incumbents continue investing heavily in infrastructure, betting that mass adoption will ultimately validate these costs.
Why It Matters
- Unit‑Economics Risk: Small and mid‑cap software providers integrating AI risk eroding cloud‑like margins as token usage outpaces cost declines.
- Competitive Dynamics: Giants can afford near‑zero margin consumer products while monetizing enterprise and infra, squeezing startups out of the middle.
- Consolidation Outlook: Price wars, subsidization, and infrastructure spending point to eventual shake‑outs, with well‑funded incumbents holding structural advantage.
- Investor Lens: Critical to separate AI beneficiaries (infra providers, hyperscalers, chipmakers, leading model owners) from exposed SaaS layers seeing margin pressure. Long‑term winners will be those with control over compute, distribution, and proprietary ecosystems.
What’s Next?
Investors should watch for pricing model adjustments at AI SaaS startups, user churn metrics, and evolving adoption of “lightweight” vs. “full‑capability” models as enterprises and consumers balance cost against accuracy. Track infrastructure spending disclosures from hyperscalers and margin guidance revisions at AI‑first startups. Also monitor regulatory and competitive risks as giants potentially cross‑subsidize consumer AI products, raising antitrust scrutiny.