NVIDIA Rubin R100 GPU

The next era of AI compute. Built on TSMC 3nm with 336 billion transistors, Rubin delivers 50 petaflops of FP4 inference—5X faster than Blackwell. With 288GB HBM4 memory at 22TB/s bandwidth and NVLink 6 connectivity, Rubin redefines what's possible for AI factories and agentic AI at scale.

R100
Rubin Architecture
50 PFLOPS
H2 2026
288GB HBM4 Memory
22TB/s Memory Bandwidth
50 PFLOPS FP4 Inference
3nm TSMC N3P Process

The Rubin Era

Generational leap over Blackwell for next-generation AI workloads

5X
Faster Inference
50 PFLOPS vs 10 PFLOPS FP4
3.5X
Faster Training
35 PFLOPS vs 10 PFLOPS
10X
Lower Cost/Token
Agentic AI economics
4X
Fewer GPUs Needed
Same workload, less hardware
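
To sanity-check the headline multipliers, the short sketch below reproduces the 5X and 3.5X figures from the per-GPU numbers quoted on this page. The cost-per-token and GPU-count claims are NVIDIA's and fold in efficiency gains beyond raw FLOPS, so they are not derivable from these specs alone.

```python
# Derive the headline speedups from the per-GPU figures quoted on this page.
BLACKWELL_B200 = {"fp4_inference_pflops": 10, "fp4_training_pflops": 10}
RUBIN_R100 = {"fp4_inference_pflops": 50, "fp4_training_pflops": 35}

for metric, label in [("fp4_inference_pflops", "Inference"),
                      ("fp4_training_pflops", "Training")]:
    speedup = RUBIN_R100[metric] / BLACKWELL_B200[metric]
    print(f"{label}: {speedup:.1f}X faster")  # 5.0X and 3.5X
```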

Complete Platform Stack

Six specialized components designed for extreme co-design

Rubin GPU

288GB HBM4, 50 PFLOPS FP4, 336B transistors on TSMC 3nm with 6th-gen Transformer Engine

Vera CPU

88 Olympus cores with spatial multi-threading (176 threads), 2X performance vs Grace

NVLink 6 Switch

3.6TB/s all-to-all bandwidth per GPU for scale-up training and inference

ConnectX-9 SuperNIC

1.6Tb/s per-GPU networking for scale-out connectivity to thousands of GPUs

BlueField-4 DPU

Data acceleration and security offload for enterprise AI workloads

Spectrum-X Ethernet

Integrated silicon photonics for lossless AI networking at scale

Rubin Configurations

From individual GPUs to rack-scale systems

R100 Rubin GPU
Memory: 288GB HBM4
Bandwidth: 22TB/s
FP4 Inference: 50 PFLOPS
FP4 Training: 35 PFLOPS
NVLink 6: 3.6TB/s
Process: TSMC 3nm N3P
Availability: H2 2026
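
As a rough illustration of what 22TB/s of memory bandwidth means in practice, the sketch below estimates an upper bound on single-sequence decode throughput, where each generated token streams the full set of weights from HBM once. The model size is a hypothetical assumption, not an NVIDIA figure; real throughput also depends on KV-cache traffic, kernel efficiency, and batching.

```python
# Bandwidth-bound decode estimate for a single R100 (illustrative only).
HBM4_BANDWIDTH_BYTES_PER_S = 22e12   # 22TB/s, per this page
MODEL_PARAMS = 400e9                 # hypothetical 400B-parameter model
BYTES_PER_PARAM = 0.5                # FP4 = 4 bits per weight

weight_bytes = MODEL_PARAMS * BYTES_PER_PARAM
tokens_per_second = HBM4_BANDWIDTH_BYTES_PER_S / weight_bytes
print(f"Upper bound: ~{tokens_per_second:.0f} tokens/s per sequence")  # ~110
```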

Vera Rubin NVL72

The most powerful AI system ever built

72 Rubin GPUs
36 Vera CPUs
20.7TB HBM4 Memory
3.6 EFLOPS FP4 Inference
260TB/s NVLink 6 aggregate bandwidth
1,580TB/s HBM4 memory bandwidth
3,168 Arm-compatible CPU cores
3rd-gen MGX modular design
Cable-free modular trays
80+ MGX ecosystem partners
65X more AI compute than Hopper systems for next-generation reasoning and agentic AI
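
The rack-level figures above follow directly from the per-GPU and per-CPU specs; the sketch below rederives them (small differences against the published values come from rounding).

```python
# Rederive the NVL72 aggregates from the per-device specs on this page.
GPUS, CPUS = 72, 36

hbm4_tb = GPUS * 0.288            # 288GB per GPU      -> 20.7TB
fp4_eflops = GPUS * 50 / 1000     # 50 PFLOPS per GPU  -> 3.6 EFLOPS
hbm_bw_tbs = GPUS * 22            # 22TB/s per GPU     -> 1,584TB/s (~1,580)
cpu_cores = CPUS * 88             # 88 Olympus cores per Vera -> 3,168

print(f"{hbm4_tb:.1f}TB HBM4 | {fp4_eflops:.1f} EFLOPS FP4 | "
      f"{hbm_bw_tbs}TB/s HBM bandwidth | {cpu_cores} CPU cores")
```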

Technical Specifications

Complete Rubin architecture specifications

Specification | Rubin R100 | Vera Rubin NVL72

Architecture
GPU Architecture | Rubin | 72x Rubin GPUs
CPU Architecture | N/A (discrete GPU) | 36x Vera (88 Olympus cores each)
Process Node | TSMC 3nm N3P | TSMC 3nm N3P
Transistors | 336 billion | 24+ trillion (system)

Memory
GPU Memory | 288GB HBM4 | 20.7TB HBM4
Memory Bandwidth | 22TB/s | 1,580TB/s
Memory Type | HBM4 | HBM4

Performance
FP4 Inference | 50 PFLOPS | 3,600 PFLOPS (3.6 EFLOPS)
FP4 Training | 35 PFLOPS | 2,500 PFLOPS (2.5 EFLOPS)
Inference vs Prior Gen | 5X vs Blackwell | 65X vs Hopper
Training vs Prior Gen | 3.5X vs Blackwell | 4X efficiency vs Blackwell

Connectivity
NVLink 6 | 3.6TB/s per GPU | 260TB/s aggregate
Network | ConnectX-9 (1.6Tb/s) | Integrated Spectrum-X

Availability
Production | H2 2026 | H2 2026
Cloud Partners | AWS, Google Cloud, Microsoft Azure, Oracle Cloud

Built for Next-Gen AI

Rubin enables workloads impossible on previous architectures

Agentic AI

10X lower cost per token enables economically viable autonomous AI agents with complex multi-step reasoning at scale.

Reasoning Models

Train mixture-of-experts models with 4X fewer GPUs than Blackwell. Massive memory enables trillion-parameter architectures.

AI Factories

NVL72 delivers 3.6 exaflops in a single rack—the compute foundation for next-generation AI data centers.

Real-Time Translation

Sub-10ms inference latency enables real-time multilingual AI assistants with human-quality comprehension.

Video Generation

HBM4 bandwidth enables high-resolution video generation at interactive framerates for media production.

Scientific Discovery

Drug discovery, protein engineering, materials science—compute at the scale of biological complexity.

Rubin vs Previous Generations

Generational leap in AI compute efficiency

Rubin vs Blackwell (B200)

  • Memory: 288GB HBM4 vs 192GB HBM3e (+50%)
  • Bandwidth: 22TB/s vs 8TB/s (+175%)
  • FP4 Inference: 50 PFLOPS vs 10 PFLOPS (5X)
  • Training: 35 PFLOPS vs 10 PFLOPS (3.5X)
  • Transistors: 336B vs 208B (1.6X)
  • Process: 3nm N3P vs 4NP
Verdict: Rubin delivers a true generational leap for AI inference and training

Vera Rubin NVL72 vs DGX GB200

  • AI Compute: 3.6 EFLOPS vs 720 PFLOPS (5X)
  • Memory: 20.7TB vs 13.5TB (+53%)
  • NVLink BW: 260TB/s vs 130TB/s (2X)
  • vs Hopper: 65X more AI compute
  • Cost/Token: 10X lower than Blackwell
Verdict: NVL72 is the foundation for exascale AI factories

Frequently Asked Questions

What is the NVIDIA Rubin R100 GPU?

Rubin R100 is NVIDIA's next-generation AI accelerator announced at CES 2026. Built on TSMC 3nm with 336 billion transistors, it features 288GB HBM4 memory with 22TB/s bandwidth. R100 delivers 50 petaflops of FP4 inference—5X faster than Blackwell—and 35 petaflops of training performance.

What is the Vera Rubin Superchip?

The Vera Rubin Superchip combines one Vera CPU (88 Olympus cores with spatial multi-threading for 176 threads) with two Rubin GPUs via NVLink-C2C. This unified architecture delivers 100 PFLOPS of FP4 inference and 2X the CPU performance of Grace.

When is NVIDIA Rubin available?

NVIDIA announced at CES 2026 that Rubin has entered full-scale production with systems available in H2 2026. AWS, Google Cloud, Microsoft Azure, and Oracle Cloud will be among the first to offer Vera Rubin instances. Contact SLYD to reserve allocation.

How does Rubin compare to Blackwell?

Rubin delivers 5X faster FP4 inference, 3.5X faster training, 50% more memory (288GB vs 192GB), and 175% more memory bandwidth (22TB/s vs 8TB/s). It achieves 10X lower cost per token for inference and requires 4X fewer GPUs to train equivalent models.

What is the Vera Rubin NVL72?

NVL72 is a rack-scale system containing 72 Rubin GPUs and 36 Vera CPUs (36 Vera Rubin Superchips). It delivers 3.6 exaflops of FP4 inference, 20.7TB of HBM4 memory, and 260TB/s of NVLink 6 bandwidth. NVIDIA claims it provides 65X more AI compute than Hopper systems.

Should I wait for Rubin or buy Blackwell now?

For production workloads deploying in 2025, Blackwell (B200/B300/GB200) offers excellent performance and is available now. If your deployment timeline extends to H2 2026 or you're planning next-generation AI factory infrastructure, Rubin's 5X inference improvement may warrant waiting. Contact SLYD for capacity planning.
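
As a rough capacity-planning illustration of that trade-off, the sketch below compares the GPU counts needed to sustain a target training throughput. The target and utilization are hypothetical assumptions; real sizing depends on the model, parallelism strategy, and interconnect, which is also why the page's "4X fewer GPUs" figure exceeds the raw 3.5X FLOPS ratio.

```python
import math

# Hypothetical sizing exercise, not an official NVIDIA or SLYD tool.
TARGET_EFLOPS = 2.0     # assumed sustained FP4 training target
UTILIZATION = 0.40      # assumed achieved fraction of peak FLOPS

def gpus_needed(peak_pflops: float) -> int:
    """GPUs required to sustain TARGET_EFLOPS at the assumed utilization."""
    return math.ceil(TARGET_EFLOPS * 1000 / (peak_pflops * UTILIZATION))

print(f"Blackwell B200 (10 PFLOPS peak): {gpus_needed(10)} GPUs")  # 500
print(f"Rubin R100 (35 PFLOPS peak):     {gpus_needed(35)} GPUs")  # 143
```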

Reserve NVIDIA Rubin Allocation

Join the queue for next-generation AI compute. SLYD is working with OEM partners to secure Rubin allocation for H2 2026 delivery.

Cloud partners deploying Rubin:
AWS, Google Cloud, Microsoft Azure, Oracle Cloud