Select a RAM tier to see which Qwen 3.5 models fit, how memory is allocated, and which multi-model combinations you can run.
Select Your RAM Tier
All sizes shown assume Q4_K_M quantization. The KV cache is calculated for a 16K context window.
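As a rough illustration of how a KV cache figure at 16K context can be estimated, the sketch below uses the usual formula for a transformer in which every layer stores full-attention keys and values. This is an assumption, not the page's confirmed method: the function name and the architecture values (32 layers, 8 KV heads, head dimension 128, 16-bit cache entries) are hypothetical placeholders rather than real Qwen 3.5 parameters, and models with hybrid or linear attention layers will need less.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a standard attention stack.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes a 16-bit (fp16/bf16) cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical architecture values, chosen only for illustration.
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=16 * 1024)
print(f"{size / 1024**3:.2f} GiB")  # 2.00 GiB for these placeholder values
```

The cache grows linearly with the context window, so halving the context to 8K would halve this estimate.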
See how memory is distributed when running a model on this tier.
Ollama can load multiple models simultaneously. Here are practical combos for your selected tier.
All Qwen 3.5 models across all RAM tiers. Free headroom is shown in GB and is what remains after model weights, KV cache (16K context), OS, and agents.
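The headroom figure described above is a simple subtraction, which can be sketched as follows. The function names and every number in the example are hypothetical and are not taken from the table; the sketch only shows how the four budget items combine, and it extends naturally to multi-model combinations by summing each model's weights and KV cache.

```python
def free_headroom_gb(ram_gb, weights_gb, kv_cache_gb, os_gb, agents_gb):
    """Return the RAM left over (in GB) after all four budget items."""
    return ram_gb - (weights_gb + kv_cache_gb + os_gb + agents_gb)


def fits(ram_gb, weights_gb, kv_cache_gb, os_gb, agents_gb):
    """A configuration fits when its headroom is not negative."""
    return free_headroom_gb(ram_gb, weights_gb, kv_cache_gb, os_gb, agents_gb) >= 0


# Hypothetical example: a 16 GB tier with placeholder sizes for each item.
print(free_headroom_gb(16, weights_gb=5.0, kv_cache_gb=2.0, os_gb=4.0, agents_gb=1.0))  # 4.0
print(fits(8, weights_gb=5.0, kv_cache_gb=2.0, os_gb=4.0, agents_gb=1.0))  # False
```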