Benchmark Results - 20260310T102149
Model Selection (6-slot / 2-socket)
| Slot | Socket | Role | Model | Composite Score |
|------|--------|------|-------|-----------------|
| 1 | Node 1 (port 11434) | General (locked) | llama3.2:3b | 0.819 |
| 2 | Node 1 (port 11434) | General (locked) | llama3.1:8b | 0.621 |
| 5 | Node 1 (port 11434) | General (rotate) | gemma3:12b-it-q4_K_M | 0.484 |
| 3 | Node 0 (port 11435) | Coding (locked) | deepseek-coder-v2:16b | 0.707 |
| 4 | Node 0 (port 11435) | Coding (locked) | deepseek-coder-v2:latest | 0.681 |
| 6 | Node 0 (port 11435) | Coding (rotate) | qwen2.5-coder:latest | 0.644 |
Detailed Metrics
codellama:34b
- Category: coding
- Coding Quality: 0.783
- General Quality: 0.586
- Avg Tokens/sec: 3.2
- Latency (ms): 4350.0
- Coding Composite: 0.409
- General Composite: 0.32
deepseek-coder-v2:16b
- Category: coding
- Coding Quality: 0.783
- General Quality: 0.885
- Avg Tokens/sec: 24.6
- Latency (ms): 1586.8
- Coding Composite: 0.707
- General Composite: 0.753
qwen2.5-coder:14B
- Category: coding
- Coding Quality: 0.8
- General Quality: 0.931
- Avg Tokens/sec: 6.6
- Latency (ms): 2223.7
- Coding Composite: 0.549
- General Composite: 0.608
deepseek-coder-v2:latest
- Category: coding
- Coding Quality: 0.783
- General Quality: 0.885
- Avg Tokens/sec: 22.2
- Latency (ms): 1759.1
- Coding Composite: 0.681
- General Composite: 0.727
qwen2.5-coder:latest
- Category: coding
- Coding Quality: 0.8
- General Quality: 0.91
- Avg Tokens/sec: 12.8
- Latency (ms): 1239.2
- Coding Composite: 0.644
- General Composite: 0.694
llama3.1:8b
- Category: general
- Coding Quality: 0.8
- General Quality: 0.877
- Avg Tokens/sec: 11.8
- Latency (ms): 2251.2
- Coding Composite: 0.586
- General Composite: 0.621
qwen2.5-coder:7b
- Category: coding
- Coding Quality: 0.8
- General Quality: 0.91
- Avg Tokens/sec: 12.3
- Latency (ms): 1258.3
- Coding Composite: 0.639
- General Composite: 0.689
gemma3:12b-it-q4_K_M
- Category: general
- Coding Quality: 0.85
- General Quality: 0.966
- Avg Tokens/sec: 6.6
- Latency (ms): 5701.3
- Coding Composite: 0.432
- General Composite: 0.484
llama3.2:3b
- Category: general
- Coding Quality: 0.85
- General Quality: 0.954
- Avg Tokens/sec: 22.7
- Latency (ms): 613.5
- Coding Composite: 0.772
- General Composite: 0.819
Scoring Formula
- Composite = quality * 0.45 + token_speed_normalized * 0.30 + latency_score * 0.25
- token_speed_normalized = Avg Tokens/sec ÷ 40, where 40 tok/sec is the hardware-observed max (e.g. 22.7 tok/sec → 0.5675)
- Coding quality (per-prompt):
  - code_gen: has_def×0.20 + has_return×0.20 + has_docstring×0.15 + has_type_hint×0.15 + has_code_block×0.10 + has_assert×0.08 + has_test_def×0.07 + has_import×0.05
  - debug: has_def×0.30 + has_return×0.30 + has_code_block×0.25 + has_assert×0.15
  - refactor: has_def×0.25 + has_return×0.25 + has_code_block×0.20 + has_type_hint×0.15 + has_import×0.15
- Category (first match wins): override dict → quality delta (coding if coding_avg - general_avg >= 0.1) → name pattern (coding if the name contains coder/codestral/codellama/starcoder) → otherwise general
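
The composite formula above can be sketched in a few lines of Python. This report does not define latency_score, so the linear falloff to zero at 5000 ms used here is an assumption. It was chosen because it reproduces the composites listed under Detailed Metrics to within about 0.001. The function name, parameter names, and the clamp of speed at 1.0 are likewise illustrative, not taken from the benchmark code.

```python
def composite(quality: float, tokens_per_sec: float, latency_ms: float,
              speed_ceiling: float = 40.0,
              latency_cutoff_ms: float = 5000.0) -> float:
    """Weighted composite: 45% quality, 30% normalized speed, 25% latency score."""
    # Speed is normalized against the 40 tok/sec ceiling; clamping at 1.0 is an assumption.
    speed = min(tokens_per_sec / speed_ceiling, 1.0)
    # ASSUMPTION: latency_score falls linearly from 1 at 0 ms to 0 at 5000 ms.
    # The report does not state this; it is inferred from the published numbers.
    latency_score = max(0.0, 1.0 - latency_ms / latency_cutoff_ms)
    return quality * 0.45 + speed * 0.30 + latency_score * 0.25


# General composite for llama3.2:3b, using its Detailed Metrics row.
print(round(composite(0.954, 22.7, 613.5), 3))
```

Passing a model's Coding Quality instead of its General Quality gives the Coding Composite under the same assumptions.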
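
The per-prompt coding quality weights above amount to a weighted feature checklist. The sketch below copies the weights from this report. The regex detectors behind each has_* flag are hypothetical stand-ins, since the report does not say how the features are detected.

```python
import re

# Weights copied from the Scoring Formula list; each row sums to 1.0.
WEIGHTS = {
    "code_gen": {"has_def": 0.20, "has_return": 0.20, "has_docstring": 0.15,
                 "has_type_hint": 0.15, "has_code_block": 0.10,
                 "has_assert": 0.08, "has_test_def": 0.07, "has_import": 0.05},
    "debug": {"has_def": 0.30, "has_return": 0.30,
              "has_code_block": 0.25, "has_assert": 0.15},
    "refactor": {"has_def": 0.25, "has_return": 0.25, "has_code_block": 0.20,
                 "has_type_hint": 0.15, "has_import": 0.15},
}

# ASSUMPTION: hypothetical detectors; the real benchmark may check these differently.
DETECTORS = {
    "has_def": re.compile(r"^\s*def \w+\(", re.M),
    "has_return": re.compile(r"\breturn\b"),
    "has_docstring": re.compile(r"\"{3}|'{3}"),
    "has_type_hint": re.compile(r"\)\s*->|def \w+\([^)]*:\s*\w"),
    "has_code_block": re.compile(r"`{3}"),
    "has_assert": re.compile(r"\bassert\b"),
    "has_test_def": re.compile(r"def test_\w*\("),
    "has_import": re.compile(r"^\s*(import|from)\s+\w+", re.M),
}


def coding_quality(prompt_type: str, response: str) -> float:
    """Sum the weights of every feature detected in a model response."""
    weights = WEIGHTS[prompt_type]
    return sum(w for feature, w in weights.items()
               if DETECTORS[feature].search(response))


fence = "`" * 3
sample = f"{fence}python\ndef add(a, b):\n    return a + b\n{fence}"
print(round(coding_quality("debug", sample), 2))
```

A response with a fenced block, a function definition, and a return statement, but no assert, scores 0.30 + 0.30 + 0.25 on the debug rubric.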