Comparison: Ollama Vulkan vs. Ollama ROCm

Maeddes_G January 28, 2026, 4:15pm 2

Update: Added context length scaling (2k-32k) and a CPU baseline (Ryzen 9 7945HX, 64 GB)

num_ctx	Vulkan	ROCm	Native (CPU)
2048	56.8 t/s	49.7 t/s	24.6 t/s
8192	52.6 t/s	49.1 t/s	24.8 t/s
16384	46.1 t/s	46.5 t/s	24.7 t/s
32768	40.3 t/s	43.4 t/s	24.5 t/s
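
For anyone who wants to reproduce numbers like these, here is a rough sketch of one way to do it through Ollama's REST API. It is not necessarily how the table above was measured. It assumes a local Ollama server on the default port; the model name and the prompt are placeholders you have to supply, since the post names neither. The speed is derived from the `eval_count` and `eval_duration` fields of the response.

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Decode speed from Ollama's eval_count and eval_duration (nanoseconds)."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def measure(model: str, num_ctx: int, prompt: str = "Explain how a heat pump works.") -> float:
    """Run one non-streaming generation at the given num_ctx and return tokens/s.

    The model name and the prompt are placeholders, not values from the post.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure(<your model>, n)` for n in 2048, 8192, 16384, 32768 gives one row per context length. A short prompt like the placeholder does not fill the context window, so results may differ from runs with long prompts.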

Context scaling (2k → 32k performance loss):

Vulkan: -29%
ROCm: -13%
Native: -0.4%
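
The percentages can be recomputed from the table as (speed at 32k - speed at 2k) / speed at 2k. A small sketch using the rounded table values (so the last digit may differ from figures computed on unrounded measurements):

```python
# Decode speed in tokens/s, copied from the table above.
results = {
    "Vulkan": {2048: 56.8, 8192: 52.6, 16384: 46.1, 32768: 40.3},
    "ROCm":   {2048: 49.7, 8192: 49.1, 16384: 46.5, 32768: 43.4},
    "Native": {2048: 24.6, 8192: 24.8, 16384: 24.7, 32768: 24.5},
}


def scaling_loss(speeds, small=2048, large=32768):
    """Relative change in tokens/s from the small to the large context, in percent."""
    return (speeds[large] - speeds[small]) / speeds[small] * 100


losses = {backend: scaling_loss(speeds) for backend, speeds in results.items()}
for backend, loss in losses.items():
    print(f"{backend}: {loss:+.1f}%")
```

With the table values this prints -29.0% for Vulkan, -12.7% for ROCm and -0.4% for Native.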

Power consumption:

Vulkan: ~65 W (0.7-0.9 t/s per W)
ROCm: ~150 W (0.3 t/s per W)
Native: ~11 W (2.2 t/s per W)
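
The efficiency figures are tokens per second divided by watts. A quick sketch of that arithmetic, assuming the approximate power draw stays constant across all context lengths (the post only gives one power figure per backend, so this is a simplification; if power varies with context, the ranges shift):

```python
# Approximate power draw in watts, from the list above.
power_w = {"Vulkan": 65, "ROCm": 150, "Native": 11}

# Slowest and fastest decode speed per backend (tokens/s), from the table.
speed_range = {
    "Vulkan": (40.3, 56.8),
    "ROCm": (43.4, 49.7),
    "Native": (24.5, 24.8),
}


def efficiency(tokens_per_s, watts):
    """Tokens per second per watt (equivalently, tokens per joule)."""
    return tokens_per_s / watts


eff = {
    backend: (efficiency(low, power_w[backend]), efficiency(high, power_w[backend]))
    for backend, (low, high) in speed_range.items()
}
for backend, (low, high) in eff.items():
    print(f"{backend}: {low:.2f}-{high:.2f} t/s per W")
```

Under the constant-power assumption this gives about 0.62-0.87 for Vulkan, 0.29-0.33 for ROCm and 2.23-2.25 for the CPU.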

Takeaways:

Small contexts (≤8k): Vulkan wins on both speed and efficiency
Large contexts (≥16k): ROCm matches Vulkan at 16k and is faster at 32k, but draws about 2.3x the power
CPU speed stays flat across context lengths, but it is roughly 2x slower than the GPU
If power/heat matters more than speed: the CPU is surprisingly viable at ~25 t/s