NVIDIA Rubin R100 GPU

The next era of AI compute. Built on TSMC 3nm with 336 billion transistors, Rubin delivers 50 petaflops of FP4 inference—5X faster than Blackwell. With 288GB HBM4 memory at 22TB/s bandwidth and NVLink 6 connectivity, Rubin redefines what's possible for AI factories and agentic AI at scale.

R100
Rubin Architecture
50 PFLOPS
H2 2026
288GB HBM4 Memory
22TB/s Memory Bandwidth
50 PFLOPS FP4 Inference
3nm TSMC N3P Process

The Rubin Era

Generational leap over Blackwell for next-generation AI workloads

5X
Faster Inference
50 PFLOPS vs 10 PFLOPS FP4
3.5X
Faster Training
35 PFLOPS vs 10 PFLOPS
10X
Lower Cost/Token
Agentic AI economics
4X
Fewer GPUs Needed
Same workload, less hardware
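
To sanity-check the headline multipliers, the short sketch below reproduces the 5X and 3.5X figures from the per-GPU numbers quoted on this page. The cost-per-token and GPU-count claims are NVIDIA's and fold in efficiency gains beyond raw FLOPS, so they are not derivable from these specs alone.

```python
# Derive the headline speedups from the per-GPU figures quoted on this page.
BLACKWELL_B200 = {"fp4_inference_pflops": 10, "fp4_training_pflops": 10}
RUBIN_R100 = {"fp4_inference_pflops": 50, "fp4_training_pflops": 35}

for metric, label in [("fp4_inference_pflops", "Inference"),
                      ("fp4_training_pflops", "Training")]:
    speedup = RUBIN_R100[metric] / BLACKWELL_B200[metric]
    print(f"{label}: {speedup:.1f}X faster")  # 5.0X and 3.5X
```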

Complete Platform Stack

Six specialized components designed for extreme co-design

Rubin GPU

288GB HBM4, 50 PFLOPS FP4, 336B transistors on TSMC 3nm with 6th-gen Transformer Engine

Vera CPU

88 Olympus cores with spatial multi-threading (176 threads), 2X performance vs Grace

NVLink 6 Switch

3.6TB/s all-to-all bandwidth per GPU for scale-up training and inference

ConnectX-9 SuperNIC

1.6Tb/s per-GPU networking for scale-out connectivity to thousands of GPUs

BlueField-4 DPU

Data acceleration and security offload for enterprise AI workloads

Spectrum-X Ethernet

Integrated silicon photonics for lossless AI networking at scale

Rubin Configurations

From individual GPUs to rack-scale systems

R100 Rubin GPU
Memory: 288GB HBM4
Bandwidth: 22TB/s
FP4 Inference: 50 PFLOPS
FP4 Training: 35 PFLOPS
NVLink 6: 3.6TB/s
Process: TSMC 3nm N3P
Availability: H2 2026
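
As a rough illustration of what 22TB/s of memory bandwidth means in practice, the sketch below estimates an upper bound on single-sequence decode throughput, where each generated token streams the full set of weights from HBM once. The model size is a hypothetical assumption, not an NVIDIA figure; real throughput also depends on KV-cache traffic, kernel efficiency, and batching.

```python
# Bandwidth-bound decode estimate for a single R100 (illustrative only).
HBM4_BANDWIDTH_BYTES_PER_S = 22e12   # 22TB/s, per this page
MODEL_PARAMS = 400e9                 # hypothetical 400B-parameter model
BYTES_PER_PARAM = 0.5                # FP4 = 4 bits per weight

weight_bytes = MODEL_PARAMS * BYTES_PER_PARAM
tokens_per_second = HBM4_BANDWIDTH_BYTES_PER_S / weight_bytes
print(f"Upper bound: ~{tokens_per_second:.0f} tokens/s per sequence")  # ~110
```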

Vera Rubin NVL72

The most powerful AI system ever built

72 Rubin GPUs
36 Vera CPUs
20.7TB HBM4 Memory
3.6 EFLOPS FP4 Inference
260TB/s NVLink 6 aggregate bandwidth
1,580TB/s HBM4 memory bandwidth
3,168 Arm-compatible CPU cores
3rd-gen MGX modular design
Cable-free modular trays
80+ MGX ecosystem partners
65X more AI compute than Hopper systems for next-generation reasoning and agentic AI
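
The rack-level figures above follow directly from the per-GPU and per-CPU specs; the sketch below rederives them (small differences against the published values come from rounding).

```python
# Rederive the NVL72 aggregates from the per-device specs on this page.
GPUS, CPUS = 72, 36

hbm4_tb = GPUS * 0.288            # 288GB per GPU      -> 20.7TB
fp4_eflops = GPUS * 50 / 1000     # 50 PFLOPS per GPU  -> 3.6 EFLOPS
hbm_bw_tbs = GPUS * 22            # 22TB/s per GPU     -> 1,584TB/s (~1,580)
cpu_cores = CPUS * 88             # 88 Olympus cores per Vera -> 3,168

print(f"{hbm4_tb:.1f}TB HBM4 | {fp4_eflops:.1f} EFLOPS FP4 | "
      f"{hbm_bw_tbs}TB/s HBM bandwidth | {cpu_cores} CPU cores")
```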

Technical Specifications

Complete Rubin architecture specifications

Specification | Rubin R100 | Vera Rubin NVL72

Architecture
GPU Architecture | Rubin | 72x Rubin GPUs
CPU Architecture | N/A (discrete GPU) | 36x Vera (88 Olympus cores each)
Process Node | TSMC 3nm N3P | TSMC 3nm N3P
Transistors | 336 billion | 24+ trillion (system)

Memory
GPU Memory | 288GB HBM4 | 20.7TB HBM4
Memory Bandwidth | 22TB/s | 1,580TB/s
Memory Type | HBM4 | HBM4

Performance
FP4 Inference | 50 PFLOPS | 3,600 PFLOPS (3.6 EFLOPS)
FP4 Training | 35 PFLOPS | 2,500 PFLOPS (2.5 EFLOPS)
Inference vs Prior Gen | 5X vs Blackwell | 65X vs Hopper
Training vs Prior Gen | 3.5X vs Blackwell | 4X efficiency vs Blackwell

Connectivity
NVLink 6 | 3.6TB/s per GPU | 260TB/s aggregate
Network | ConnectX-9 (1.6Tb/s) | Integrated Spectrum-X

Availability
Production | H2 2026 | H2 2026
Cloud Partners | AWS, Google Cloud, Microsoft Azure, Oracle Cloud

Built for Next-Gen AI

Rubin enables workloads impossible on previous architectures

Agentic AI

10X lower cost per token enables economically viable autonomous AI agents with complex multi-step reasoning at scale.

Reasoning Models

Train mixture-of-experts models with 4X fewer GPUs than Blackwell. Massive memory enables trillion-parameter architectures.

AI Factories

NVL72 delivers 3.6 exaflops in a single rack—the compute foundation for next-generation AI data centers.

Real-Time Translation

Sub-10ms inference latency enables real-time multilingual AI assistants with human-quality comprehension.

Video Generation

HBM4 bandwidth enables high-resolution video generation at interactive framerates for media production.

Scientific Discovery

Drug discovery, protein engineering, materials science—compute at the scale of biological complexity.

Rubin vs Previous Generations

Generational leap in AI compute efficiency

Rubin vs Blackwell (B200)

  • Memory: 288GB HBM4 vs 192GB HBM3e (+50%)
  • Bandwidth: 22TB/s vs 8TB/s (+175%)
  • FP4 Inference: 50 PFLOPS vs 10 PFLOPS (5X)
  • Training: 35 PFLOPS vs 10 PFLOPS (3.5X)
  • Transistors: 336B vs 208B (1.6X)
  • Process: 3nm N3P vs 4NP
Verdict: Rubin delivers a true generational leap for AI inference and training

Vera Rubin NVL72 vs DGX GB200

  • AI Compute: 3.6 EFLOPS vs 720 PFLOPS (5X)
  • Memory: 20.7TB vs 13.5TB (+53%)
  • NVLink BW: 260TB/s vs 130TB/s (2X)
  • vs Hopper: 65X more AI compute
  • Cost/Token: 10X lower than Blackwell
Verdict: NVL72 is the foundation for exascale AI factories

Frequently Asked Questions

What is the NVIDIA Rubin R100 GPU?

Rubin R100 is NVIDIA's next-generation AI accelerator announced at CES 2026. Built on TSMC 3nm with 336 billion transistors, it features 288GB HBM4 memory with 22TB/s bandwidth. R100 delivers 50 petaflops of FP4 inference—5X faster than Blackwell—and 35 petaflops of training performance.

What is the Vera Rubin Superchip?

The Vera Rubin Superchip combines one Vera CPU (88 Olympus cores with spatial multi-threading for 176 threads) with two Rubin GPUs via NVLink-C2C. This unified architecture delivers 100 PFLOPS of FP4 inference and 2X the CPU performance of Grace.

When is NVIDIA Rubin available?

NVIDIA announced at CES 2026 that Rubin has entered full-scale production with systems available in H2 2026. AWS, Google Cloud, Microsoft Azure, and Oracle Cloud will be among the first to offer Vera Rubin instances. Contact SLYD to reserve allocation.

How does Rubin compare to Blackwell?

Rubin delivers 5X faster FP4 inference, 3.5X faster training, 50% more memory (288GB vs 192GB), and 175% more memory bandwidth (22TB/s vs 8TB/s). It achieves 10X lower cost per token for inference and requires 4X fewer GPUs to train equivalent models.

What is the Vera Rubin NVL72?

NVL72 is a rack-scale system containing 72 Rubin GPUs and 36 Vera CPUs (36 Vera Rubin Superchips). It delivers 3.6 exaflops of FP4 inference, 20.7TB of HBM4 memory, and 260TB/s of NVLink 6 bandwidth. NVIDIA claims it provides 65X more AI compute than Hopper systems.

Should I wait for Rubin or buy Blackwell now?

For production workloads deploying in 2025, Blackwell (B200/B300/GB200) offers excellent performance and is available now. If your deployment timeline extends to H2 2026 or you're planning next-generation AI factory infrastructure, Rubin's 5X inference improvement may warrant waiting. Contact SLYD for capacity planning.
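
As a rough capacity-planning illustration of that trade-off, the sketch below compares the GPU counts needed to sustain a target training throughput. The target and utilization are hypothetical assumptions; real sizing depends on the model, parallelism strategy, and interconnect, which is also why the page's "4X fewer GPUs" figure exceeds the raw 3.5X FLOPS ratio.

```python
import math

# Hypothetical sizing exercise, not an official NVIDIA or SLYD tool.
TARGET_EFLOPS = 2.0     # assumed sustained FP4 training target
UTILIZATION = 0.40      # assumed achieved fraction of peak FLOPS

def gpus_needed(peak_pflops: float) -> int:
    """GPUs required to sustain TARGET_EFLOPS at the assumed utilization."""
    return math.ceil(TARGET_EFLOPS * 1000 / (peak_pflops * UTILIZATION))

print(f"Blackwell B200 (10 PFLOPS peak): {gpus_needed(10)} GPUs")  # 500
print(f"Rubin R100 (35 PFLOPS peak):     {gpus_needed(35)} GPUs")  # 143
```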

Reserve NVIDIA Rubin Allocation

Join the queue for next-generation AI compute. SLYD is working with OEM partners to secure Rubin allocation for H2 2026 delivery.

Cloud partners deploying Rubin:
AWS, Google Cloud, Microsoft Azure, Oracle Cloud