Research & Development World


NVIDIA aims to deliver 15-exaflop AI compute in one rack by 2027

By Brian Buntz | March 18, 2025


The term “exascale computing” used to conjure images of sprawling supercomputer facilities like Frontier or El Capitan. Now, NVIDIA is attaching that same term to a single rack of its upcoming Rubin Ultra AI chips. Scheduled for 2027, Rubin Ultra is projected to deliver 15 exaflops of AI compute. That computational horsepower, in a sense, would rival the throughput of today’s largest supercomputers, at least in lower-precision AI tasks.

But the comparison between future-gen GPUs and current-gen supercomputers is decidedly apples to oranges. Before picturing an entire national lab supercomputer replaced by one server cabinet, there’s a crucial asterisk: NVIDIA’s exaflops claims refer to AI-oriented precision, not the 64-bit performance that defines top supercomputers. Still, for large-scale inference and training, a Rubin Ultra rack represents a seismic shift.

Blackwell to Feynman: A multi-year GPU architecture roadmap

During his GTC 2025 keynote, CEO Jensen Huang unveiled a plan for powering what he calls the world’s “AI factories.” It begins with Blackwell Ultra in the second half of 2025, offering one and a half times more FLOPS and memory, and “two times more networking bandwidth,” compared to the current Blackwell generation. The roadmap then continues as follows:

  • Vera Rubin (H2 2026): New CPU, new GPU, new CX9 networking, and HBM4 memory. “Basically everything is brand new except for the chassis,” Huang said.
  • Vera Rubin Ultra (H2 2027): A jump to 15 exaFLOPS of AI compute, with 4,600 TB/s of bandwidth. “Everything is X-factored more: 14 times more FLOPS, 15 exaFLOPS instead of one exaFLOP,” Huang said.
  • Feynman (2028): The next-generation platform with details yet to come.
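
As a quick sanity check on the Rubin Ultra figures Huang quotes, the short Python sketch below compares the 15-exaFLOPS target with the “one exaFLOP” baseline and with the “14 times” multiplier. The only inputs are the numbers from the quote. The constant and function names are illustrative, and treating the baseline as exactly one exaFLOP is an assumption, since the spoken figure was probably rounded.

```python
# Figures taken from Huang's quote; all names here are illustrative only.
RUBIN_ULTRA_EXAFLOPS = 15.0  # quoted AI compute for Rubin Ultra
BASELINE_EXAFLOPS = 1.0      # "one exaFLOP" (assumed to be a rounded figure)
QUOTED_MULTIPLIER = 14.0     # "14 times more FLOPS"


def speedup(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


def implied_baseline(new: float, multiplier: float) -> float:
    """Return the baseline value that would make `multiplier` exact."""
    return new / multiplier


if __name__ == "__main__":
    ratio = speedup(RUBIN_ULTRA_EXAFLOPS, BASELINE_EXAFLOPS)
    baseline = implied_baseline(RUBIN_ULTRA_EXAFLOPS, QUOTED_MULTIPLIER)
    print(f"Ratio from the rounded figures: {ratio:.1f}x")
    print(f"Baseline implied by a 14x jump: {baseline:.2f} exaFLOPS")
```

The two quoted numbers agree if the baseline sits closer to 1.07 exaFLOPS than to exactly one, which is consistent with rounding in a live keynote.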

From data centers to ‘AI factories’

Huang explained the long planning horizons for such undertakings: “Look, we’re building AI factories and AI infrastructure. It’s going to take years of planning… which is the reason why I show you our roadmap a couple two, three years in advance so that we don’t surprise you.”

NVIDIA sees a paradigm shift from conventional data centers to “AI factories,” where the primary output is “tokens” for generative AI services. Huang described this transition: “The computer has become a generator of tokens, not a retrieval of files…”

“I call them AI factories because they have one job and one job only: generating these incredible tokens that we then reconstitute into music, into words, into videos, into research, into chemicals or proteins,” Huang said.

Supporting these AI factories is NVIDIA Dynamo, an “operating system” tailored for generative AI infrastructure that NVIDIA positions as the successor to older enterprise software stacks like VMware. Dynamo goes beyond managing hardware resources: it orchestrates complex AI workloads, dynamically partitions GPUs between the two phases of inference (prefill and decode), and manages the key-value caches that large language models depend on. As Huang described it, Dynamo is “the operating system of the AI factory,” a name that nods to the original dynamo that powered the industrial revolution.
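
To make the prefill/decode split more concrete, here is a toy Python sketch that divides a GPU pool between the two inference phases in proportion to pending work. It is not Dynamo’s API or its scheduling algorithm, neither of which the article describes. The function name `partition_gpus`, its parameters, the proportional rule, and the pool size of 72 are all hypothetical, chosen only to illustrate what dynamic partitioning could look like.

```python
def partition_gpus(total_gpus: int, prefill_load: float, decode_load: float) -> tuple[int, int]:
    """Split a GPU pool between prefill and decode in proportion to load.

    Hypothetical illustration only: loads are abstract, non-negative work
    estimates, and each phase always keeps at least one GPU.
    """
    if total_gpus < 2:
        raise ValueError("need at least two GPUs to serve both phases")
    if prefill_load < 0 or decode_load < 0:
        raise ValueError("loads must be non-negative")

    total_load = prefill_load + decode_load
    if total_load == 0:
        prefill = total_gpus // 2  # no load signal: split evenly
    else:
        prefill = round(total_gpus * prefill_load / total_load)

    # Clamp so neither phase is starved of hardware.
    prefill = max(1, min(total_gpus - 1, prefill))
    return prefill, total_gpus - prefill


if __name__ == "__main__":
    # Long prompts, short answers: a prefill-heavy mix.
    print(partition_gpus(72, prefill_load=3.0, decode_load=1.0))
    # Short prompts, long "thinking" outputs: a decode-heavy mix.
    print(partition_gpus(72, prefill_load=1.0, decode_load=8.0))
```

A real scheduler would also have to account for key-value cache placement and the cost of moving work between GPUs; this sketch ignores both.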

Networking for scale could help tame soaring inference demands

Another shift in the new architectures is the move to disaggregated NVLink and co-packaged silicon photonics. These technologies overcome the limitations of integrated NVLink and enable dramatic power efficiency gains compared to traditional transceivers. This architectural shift is what allows NVIDIA to scale systems to potentially connect millions of GPUs while maintaining the bandwidth necessary for massive AI models.

As agentic AI systems grow more complex, inference is emerging as a major driver of computation. “The amount of computation we need at this point… is easily 100 times more than we thought we needed this time last year,” Huang said.

While DeepSeek’s rise was a momentary headache for NVIDIA, Huang uses its R1 model to make a point. Asking a traditional large language model to help with a simple task, such as a seating arrangement, might use just over 400 tokens. R1 could use more than 8,000 tokens of “thinking” on the same task while coming up with a more satisfactory answer. That amounts to roughly 20 times more tokens and well over 100 times more compute in this example.
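
The token arithmetic in the seating-arrangement example can be checked in a few lines of Python. The sketch below uses only the figures given above, treated as round lower bounds. The names are illustrative, and the “per-token factor” is simply what is left over when the compute multiple is divided by the token multiple, not a number NVIDIA has published.

```python
# Round figures from the example above; all names are illustrative only.
TRADITIONAL_TOKENS = 400  # "just over 400 tokens"
REASONING_TOKENS = 8_000  # "more than 8,000 tokens"
COMPUTE_MULTIPLIER = 100  # "well more than 100x", used as a lower bound


def token_ratio(reasoning: int, traditional: int) -> float:
    """Return how many times more tokens the reasoning model generates."""
    return reasoning / traditional


def implied_per_token_factor(compute_multiplier: float, ratio: float) -> float:
    """Return the extra compute per token needed to reach the multiplier."""
    return compute_multiplier / ratio


if __name__ == "__main__":
    ratio = token_ratio(REASONING_TOKENS, TRADITIONAL_TOKENS)
    factor = implied_per_token_factor(COMPUTE_MULTIPLIER, ratio)
    print(f"Token ratio: {ratio:.0f}x")
    print(f"Compute not explained by token count: at least {factor:.0f}x")
```

Token count alone accounts for a factor of 20, so the remaining factor of at least five must come from other costs of serving a reasoning model, which the example does not break down.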

The amount of compute is steadily going up, necessitating new architectures to handle the burgeoning demand. “Once a year, once a year like clock ticks, once a year,” Huang said. “We try to take silicon risk or networking risk or system chassis risk in pieces so that we can move the industry forward.”

Copyright © 2026 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media