
Can I Run Qwen 3.6 Plus on an NVIDIA RTX 3090?

No

Won't fit: even the smallest quant (Q4_K_M) needs 143.5GB of VRAM.

Model size: 235B
GPU memory: 24.0GB
Smallest quant: Q4_K_M
Best fit: none

None of Qwen 3.6 Plus's quantizations fit

Even the most aggressive quantization needs more memory than the NVIDIA RTX 3090 provides. That leaves two options: rent a bigger GPU in the cloud, or upgrade your hardware.
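The fit check behind this verdict is a simple comparison. Here is a minimal sketch using only the numbers on this page; the function names are illustrative, and Q4_K_M is the only quant whose requirement is stated here.

```python
GPU_MEMORY_GB = 24.0  # NVIDIA RTX 3090, from this page

# Only the Q4_K_M requirement is stated on this page; other quants would go here.
QUANT_REQUIREMENTS_GB = {"Q4_K_M": 143.5}


def quants_that_fit(requirements_gb, gpu_memory_gb):
    """Return the quant names whose stated requirement fits, smallest first."""
    fitting = [(need, name) for name, need in requirements_gb.items()
               if need <= gpu_memory_gb]
    return [name for need, name in sorted(fitting)]


def shortfall_gb(required_gb, gpu_memory_gb):
    """How much memory is missing (0.0 if the model fits)."""
    return max(0.0, required_gb - gpu_memory_gb)


print(quants_that_fit(QUANT_REQUIREMENTS_GB, GPU_MEMORY_GB))  # []
print(shortfall_gb(143.5, GPU_MEMORY_GB))                     # 119.5
```

The empty list is the "No" above: the card is 119.5GB short of the smallest quant.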

Or upgrade your hardware

Hardware that would let you run this model locally:

Apple Mac Studio M3 Ultra (192GB), ~$7,499

Unified memory means ~190GB of usable model RAM in a single quiet box. Runs 405B at Q4.

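NVIDIA H100 80GB, ~$30,000

Datacenter-grade. A single 80GB card still falls short of the 143.5GB this model needs, so this route means at least two cards. Most users should rent rather than buy; see cloud options.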
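How many cards of a given size the 143.5GB requirement implies can be worked out with a ceiling division. This is a rough sketch that counts raw memory only; it ignores KV cache, per-card runtime overhead, and how the model is split across devices, so real setups may need more. The name `cards_needed` is illustrative.

```python
import math


def cards_needed(required_gb, per_card_gb):
    """Minimum number of identical cards whose combined memory covers the requirement."""
    return math.ceil(required_gb / per_card_gb)


print(cards_needed(143.5, 80.0))   # 2  (H100 80GB)
print(cards_needed(143.5, 24.0))   # 6  (RTX 3090)
print(cards_needed(143.5, 190.0))  # 1  (~190GB usable unified memory)
```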


FAQ

Can the NVIDIA RTX 3090 run Qwen 3.6 Plus?

No. Qwen 3.6 Plus (235B) needs at least 143.5GB even at its smallest quantization, more than the 24.0GB on the NVIDIA RTX 3090.
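For a sense of where a figure like 143.5GB comes from, weight memory is roughly parameter count times bits per weight, divided by 8. The sketch below back-solves the bits per weight implied by this page's numbers. It assumes 1GB = 1e9 bytes and that the 143.5GB figure is mostly weights; the page does not say how much overhead it includes, so treat the result as approximate.

```python
PARAMS_BILLION = 235.0       # model size, from this page
Q4_K_M_REQUIRED_GB = 143.5   # stated requirement, from this page


def implied_bits_per_weight(required_gb, params_billion):
    """GB per billion parameters is bytes per parameter; multiply by 8 for bits."""
    return required_gb / params_billion * 8


def estimated_gb(params_billion, bits_per_weight):
    """Rough weight memory in GB for a given precision."""
    return params_billion * bits_per_weight / 8


print(round(implied_bits_per_weight(Q4_K_M_REQUIRED_GB, PARAMS_BILLION), 2))  # 4.89
print(estimated_gb(PARAMS_BILLION, 16))  # 470.0 (16-bit weights, for comparison)
```

At just under 5 bits per weight, a 235B model is far beyond a 24.0GB card no matter how the quantization is tuned.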

What's the best quantization to use?

None of Qwen 3.6 Plus's available quantizations fit in 24.0GB. You'll need either a larger GPU, a smaller model, or to run it in the cloud.

What if I need more headroom for context length?

KV cache memory grows with context length. The numbers above assume a baseline 2K-4K context. For long-context use (32K+), add another 2-6GB depending on the model architecture.
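A common back-of-envelope formula for KV cache size is 2 (keys and values) x layers x KV heads x head dimension x bytes per element x context tokens. The sketch below applies it with placeholder architecture values; they are hypothetical and are NOT Qwen 3.6 Plus's actual configuration, so substitute the real values from the model's config before relying on the result.

```python
def kv_cache_bytes(context_tokens, num_layers, num_kv_heads, head_dim,
                   bytes_per_element=2):
    """Approximate KV cache size for one sequence.

    The factor of 2 covers one key tensor and one value tensor per layer.
    bytes_per_element=2 assumes a 16-bit cache.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * bytes_per_element * context_tokens)


# Hypothetical values for illustration only, not taken from any real model card.
HYPOTHETICAL_ARCH = dict(num_layers=48, num_kv_heads=4, head_dim=128)

for context in (4096, 32768):
    gib = kv_cache_bytes(context, **HYPOTHETICAL_ARCH) / 1024**3
    print(f"{context:>6} tokens: {gib:.3f} GiB")
```

With these placeholder values the cache grows from 0.375 GiB at 4K tokens to 3 GiB at 32K, which shows the linear scaling with context length; models with more layers or KV heads scale proportionally higher.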