Who needs this over an RTX 5090?
The RTX 5090 has 32 GB of VRAM. The RTX Pro 6000 has 96 GB — ECC-protected, dual-slot, NVLink-capable. For fine-tuning 70B+ models or running multi-model inference in parallel, the VRAM gap isn’t an inconvenience; it’s a hard constraint.
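A quick back-of-envelope calculation shows why 32 GB is a hard ceiling for a 70B model. This is a rough sketch, not a precise memory model: it assumes ~0.5 bytes per parameter for 4-bit weights and an illustrative 20% overhead for the KV cache, activations, and runtime buffers (actual overhead varies by context length and runtime).

```python
def q4_vram_gb(params_billion: float, overhead: float = 0.20) -> float:
    """Approximate VRAM (GB) to load a 4-bit quantized model.

    Assumes 0.5 bytes/parameter; `overhead` is a rough allowance for
    KV cache, activations, and runtime buffers (illustrative, not measured).
    """
    weight_bytes = params_billion * 1e9 * 0.5  # 4 bits = 0.5 bytes per param
    return weight_bytes * (1 + overhead) / 1e9

print(f"70B Q4: ~{q4_vram_gb(70):.0f} GB")  # well above the 5090's 32 GB
print(f"8B Q4:  ~{q4_vram_gb(8):.1f} GB")   # fits easily on either card
```

Even before any KV cache, 70B parameters at 4 bits is ~35 GB of weights alone, so the model simply cannot load on a 32 GB card without offloading.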
Inference benchmarks
LLM inference throughput, tokens/second (higher is better):

Model            RTX Pro 6000 Blackwell    RTX 5090
Llama 3 70B Q4   38.7                      21.3
Llama 3 8B Q4    118.4                     not reported
The cost reality
At approximately $6,000–$8,000 street price, the RTX Pro 6000 is for research labs, ML engineers, and studios — not enthusiast builds. If 32 GB is enough, the RTX 5090 at $1,599 remains the consumer inference recommendation.