Key takeaways
- Nvidia is pivoting hard toward inference, the stage of AI where models generate responses and perform tasks in real time.
- Jensen Huang projected $1 trillion in Blackwell and Rubin chip sales by the end of 2027, doubling prior expectations.
- The new Groq 3 LPX rack is designed to dramatically improve inference speed and memory capacity, addressing a major bottleneck in AI deployment.
- Nvidia is expanding beyond chips into AI factories, digital twins, robotics, and autonomous vehicles, broadening the company’s role across the AI stack.
What Happened?
At Nvidia’s annual GTC conference, CEO Jensen Huang introduced a new flagship inference-focused server system, the Nvidia Groq 3 LPX rack, and framed it as the next major phase of AI computing. The system combines 72 Vera Rubin servers with 256 language processing units from Groq, a startup whose leadership team Nvidia effectively absorbed in a major licensing deal last year. Huang said the new system can generate 700 million tokens per second, far beyond what older Hopper-era infrastructure can deliver.
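For a rough sense of scale, here is a back-of-envelope breakdown of the reported figures. It is a sketch only: it assumes the rack-level rate divides evenly across units, which real workloads (model size, batching, precision) would not, and the per-unit numbers are implied, not Nvidia-published specs.

```python
# Illustrative arithmetic on the figures reported above; the per-unit results
# are implied by simple division, not published specifications.

TOTAL_TOKENS_PER_SEC = 700_000_000  # reported rack-level throughput
NUM_LPUS = 256                      # Groq language processing units per rack
NUM_SERVERS = 72                    # Vera Rubin servers per rack

per_lpu = TOTAL_TOKENS_PER_SEC / NUM_LPUS        # ~2.73 million tokens/sec
per_server = TOTAL_TOKENS_PER_SEC / NUM_SERVERS  # ~9.72 million tokens/sec

print(f"Implied throughput per LPU:    {per_lpu:,.0f} tokens/sec")
print(f"Implied throughput per server: {per_server:,.0f} tokens/sec")
```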
The announcement marked a clear strategic shift. Nvidia built its AI dominance by selling GPUs used to train large models, but customers now increasingly want better infrastructure for running models at scale, not just building them. Huang used the event to declare that the “inference inflection” has arrived and positioned Nvidia’s next generation of products as the foundation for this new phase.
Why It Matters
The pivot matters because it reshapes the AI infrastructure investment cycle. Training remains critical, but inference is where AI becomes commercially useful — powering search, copilots, agents, enterprise software, robotaxis, and physical AI systems. That means the next wave of spending may be driven less by frontier model labs and more by companies trying to deploy AI into real products and workflows.
For investors, Huang’s $1 trillion sales target is the headline, but the deeper message is that Nvidia is trying to own the transition from training chips to full-stack AI deployment infrastructure. If inference becomes the dominant workload, Nvidia’s growth story can extend well beyond the first phase of the AI boom. The company is also using this moment to reinforce adjacent businesses in robotics, simulation, autonomous driving, and enterprise AI software tooling, making the platform more diversified and harder to displace.
What’s Next?
The next major question is whether customers adopt Nvidia’s inference systems at the scale Huang is projecting. Investors should watch for large cloud and enterprise deployments of the new Rubin-era systems, how quickly inference spending accelerates relative to training, and whether Nvidia can preserve its lead as rivals push more specialized chips into the market.
Also important is execution across Nvidia’s broader ecosystem bets — from AI factories and digital twins to robotaxis and physical AI. If inference demand ramps the way Huang expects, Nvidia may not just remain the leading AI chip company. It could become the core infrastructure provider for the next phase of AI commercialization.