Best AI models for the NVIDIA B200
The NVIDIA B200 has 192GB of VRAM. Below are the top 30 open-source AI models that fit, ranked by composite benchmark score. Each row shows the best quantization that fits your hardware.
VRAM: 192GB
Brand: NVIDIA
Models that fit: 30
Generation: Data Center
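The memory figures in this guide can be approximated with a simple rule: parameters times bits per weight, divided by 8, plus a fixed overhead. The sketch below is an assumption inferred from the listed numbers (for example, 37.0B at fp16 is listed at 75.0GB and 120B at Q8_0 at 128.5GB), not a documented method. The bits-per-weight values for the K-quants are approximate, the 1GB overhead is assumed, and `estimate_gb` and `best_fit` are hypothetical helper names. The estimate ignores KV cache and activation memory.

```python
# Approximate bits per weight for each format (assumed values; K-quants vary by model).
BITS_PER_WEIGHT = {
    "f32": 32.0,
    "fp16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.56,
    "Q5_K_M": 5.68,
    "Q4_K_M": 4.85,
}
OVERHEAD_GB = 1.0  # assumed fixed allowance on top of the weights


def estimate_gb(params_b, quant):
    """Estimate memory in GB for a model with params_b billion parameters."""
    return params_b * BITS_PER_WEIGHT[quant] / 8 + OVERHEAD_GB


def best_fit(params_b, vram_gb, available):
    """Return (quant, size_gb) for the highest-precision available format that fits, or None."""
    ordered = sorted(available, key=BITS_PER_WEIGHT.get, reverse=True)
    for quant in ordered:
        size = estimate_gb(params_b, quant)
        if size <= vram_gb:
            return quant, size
    return None


formats = ["fp16", "Q8_0", "Q6_K", "Q5_K_M", "Q4_K_M"]
print(estimate_gb(37.0, "fp16"))    # 75.0
print(best_fit(235, 192, formats))  # picks Q5_K_M under these assumptions
```

The `available` argument is there because not every model is published in every format, so the best fit depends on which formats exist for that model as well as on VRAM.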
Top 30 models for the NVIDIA B200
Rank · Score · Model · Parameters · Best fit · Size
01 · 82.9 · GLM-5 · 230B · Q6_K · 190.5GB
02 · 83.3 · Qwen 3.6 Plus · 235B · Q5_K_M · 167.8GB
03 · 77.5 · DeepSeek V4 Flash · 37.0B · fp16 · 75.0GB
04 · 76.4 · Qwen 3.6 27B · 27.0B · fp16 · 55.0GB
05 · 74.9 · MiMo V2 Omni · 120B · Q8_0 · 128.5GB
06 · 72.5 · Qwen 3.6 35B A3B · 35.0B · fp16 · 71.0GB
07 · 70.1 · Qwen3.5-27B · 27.8B · f32 · 112.2GB
08 · 69.3 · Qwen 3.5 122B A10B · 122B · Q8_0 · 130.6GB
09 · 65.4 · Mistral Medium 3.5 128B · 128B · Q6_K · 106.2GB
10 · 65.3 · Gemma 4 31B (free) · 31.0B · fp16 · 63.0GB
11 · 64.4 · Qwen 3.5 Omni Plus · 235B · Q5_K_M · 167.8GB
12 · 61.9 · Qwen 3.5 35B A3B · 35.0B · fp16 · 71.0GB
13 · 59.9 · NVIDIA Nemotron 3 Super 120B A12B BF16 · 124B · Q8_0 · 132.3GB
14 · 55.4 · GPT-OSS 120B · 120B · Q8_0 · 128.5GB
15 · 54.0 · Qwen 3.5 9B · 9.0B · fp16 · 19.0GB
16 · 53.8 · gemma 4 31B · 32.7B · fp16 · 66.4GB
17 · 52.0 · Gemma 4 26B A4B (free) · 26.5B · f32 · 107.0GB
18 · 49.2 · Qwen3 235B A22B Thinking 2507 · 235B · Q4_K_M · 143.5GB
19 · 45.1 · Qwen 3.5 4B · 4.0B · fp16 · 9.0GB
20 · 45.0 · Qwen 3 Next 80B A3B · 80.0B · fp16 · 161.0GB
21 · 44.5 · Qwen3 Next 80B A3B Thinking · 81.3B · Q8_0 · 87.4GB
22 · 41.6 · Qwen 3 235B A22B Instruct 2507 · 235B · Q5_K_M · 167.8GB
23 · 40.8 · GPT-OSS 20B · 20.0B · fp16 · 41.0GB
24 · 40.5 · Nemotron 3 Nano 30B A3B (free) · 31.6B · Q8_0 · 34.6GB
25 · 37.4 · Qwen3 30B A3B Thinking 2507 · 30.5B · Q8_0 · 33.4GB
26 · 36.7 · Devstral 2 · 24.0B · fp16 · 49.0GB
27 · 35.7 · Nemotron 3 Nano Omni 30B A3B Reasoning BF16 · 30.0B · fp16 · 61.0GB
28 · 33.3 · Qwen3 Coder 30B A3B Instruct · 30.5B · Q8_0 · 33.4GB
29 · 32.9 · QwQ 32B · 32.0B · fp16 · 65.0GB
30 · 32.8 · Qwen3 VL 30B A3B Thinking · 31.1B · f32 · 125.4GB
FAQ — running AI on the NVIDIA B200
How many AI models can the NVIDIA B200 run?
With 192GB of VRAM, the NVIDIA B200 can run 30+ open-source models from our database, including GLM-5, Qwen 3.6 Plus, and DeepSeek V4 Flash.
What's the largest LLM I can run on an NVIDIA B200?
The biggest model that fits has approximately 235B parameters (for example, Qwen 3.6 Plus at Q5_K_M, 167.8GB). Larger models would need more aggressive quantization, or will not fit at all.
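As a rough illustration of where a given format stops fitting, a simple size rule can be inverted to give the largest parameter count per format. This is a sketch under stated assumptions: size is taken as parameters times bits per weight divided by 8, plus an assumed 1GB overhead. The bits-per-weight values are approximate, and `max_params_b` is a hypothetical helper name. KV cache and activation memory are not counted.

```python
# Approximate bits per weight (assumed values; real files vary by model and tooling).
BITS_PER_WEIGHT = {"fp16": 16.0, "Q8_0": 8.5, "Q4_K_M": 4.85}
OVERHEAD_GB = 1.0  # assumed fixed allowance on top of the weights


def max_params_b(vram_gb, quant):
    """Largest model size, in billions of parameters, that fits in vram_gb at this format."""
    usable_gb = vram_gb - OVERHEAD_GB
    return usable_gb * 8 / BITS_PER_WEIGHT[quant]


for quant in BITS_PER_WEIGHT:
    print(quant, round(max_params_b(192, quant), 1))
```

Under these assumptions fp16 tops out near 95B parameters on 192GB, which is consistent with the 80.0B fp16 entry in the list and with the 120B-class entries dropping to Q8_0.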
Is 192GB of VRAM enough for local AI?
Yes. 192GB comfortably runs most popular open-source models, including 30B-class LLMs at fp16 and 235B-class models at Q4_K_M.