@@ -0,0 +1,117 @@
+# Benchmark Results - 20260310T102149
+
+## Model Selection (6-slot / 2-socket)
+
+
+| Slot | Socket              | Role             | Model                    | Composite Score |
+| ---- | ------------------- | ---------------- | ------------------------ | --------------- |
+| 1    | Node 1 (port 11434) | General (locked) | llama3.2:3b              | 0.819           |
+| 2    | Node 1 (port 11434) | General (locked) | llama3.1:8b              | 0.621           |
+| 5    | Node 1 (port 11434) | General (rotate) | gemma3:12b-it-q4_K_M     | 0.484           |
+| 3    | Node 0 (port 11435) | Coding (locked)  | deepseek-coder-v2:16b    | 0.707           |
+| 4    | Node 0 (port 11435) | Coding (locked)  | deepseek-coder-v2:latest | 0.681           |
+| 6    | Node 0 (port 11435) | Coding (rotate)  | qwen2.5-coder:latest     | 0.644           |
+
+
+## Detailed Metrics
+
+### codellama:34b
+
+- **Category**: coding
+- **Coding Quality**: 0.783
+- **General Quality**: 0.586
+- **Avg Tokens/sec**: 3.2
+- **Latency (ms)**: 4350.0
+- **Coding Composite**: 0.409
+- **General Composite**: 0.32
+
+### deepseek-coder-v2:16b
+
+- **Category**: coding
+- **Coding Quality**: 0.783
+- **General Quality**: 0.885
+- **Avg Tokens/sec**: 24.6
+- **Latency (ms)**: 1586.8
+- **Coding Composite**: 0.707
+- **General Composite**: 0.753
+
+### qwen2.5-coder:14B
+
+- **Category**: coding
+- **Coding Quality**: 0.8
+- **General Quality**: 0.931
+- **Avg Tokens/sec**: 6.6
+- **Latency (ms)**: 2223.7
+- **Coding Composite**: 0.549
+- **General Composite**: 0.608
+
+### deepseek-coder-v2:latest
+
+- **Category**: coding
+- **Coding Quality**: 0.783
+- **General Quality**: 0.885
+- **Avg Tokens/sec**: 22.2
+- **Latency (ms)**: 1759.1
+- **Coding Composite**: 0.681
+- **General Composite**: 0.727
+
+### qwen2.5-coder:latest
+
+- **Category**: coding
+- **Coding Quality**: 0.8
+- **General Quality**: 0.91
+- **Avg Tokens/sec**: 12.8
+- **Latency (ms)**: 1239.2
+- **Coding Composite**: 0.644
+- **General Composite**: 0.694
+
+### llama3.1:8b
+
+- **Category**: general
+- **Coding Quality**: 0.8
+- **General Quality**: 0.877
+- **Avg Tokens/sec**: 11.8
+- **Latency (ms)**: 2251.2
+- **Coding Composite**: 0.586
+- **General Composite**: 0.621
+
+### qwen2.5-coder:7b
+
+- **Category**: coding
+- **Coding Quality**: 0.8
+- **General Quality**: 0.91
+- **Avg Tokens/sec**: 12.3
+- **Latency (ms)**: 1258.3
+- **Coding Composite**: 0.639
+- **General Composite**: 0.689
+
+### gemma3:12b-it-q4_K_M
+
+- **Category**: general
+- **Coding Quality**: 0.85
+- **General Quality**: 0.966
+- **Avg Tokens/sec**: 6.6
+- **Latency (ms)**: 5701.3
+- **Coding Composite**: 0.432
+- **General Composite**: 0.484
+
+### llama3.2:3b
+
+- **Category**: general
+- **Coding Quality**: 0.85
+- **General Quality**: 0.954
+- **Avg Tokens/sec**: 22.7
+- **Latency (ms)**: 613.5
+- **Coding Composite**: 0.772
+- **General Composite**: 0.819
+
+## Scoring Formula
+
+- Composite = quality * 0.45 + token_speed_normalized * 0.30 + latency_score * 0.25
+- Speed normalized against 40 tok/sec ceiling (hardware-observed max)
+- Coding quality (per-prompt):
+  - code_gen: has_def×0.20 + has_return×0.20 + has_docstring×0.15 + has_type_hint×0.15 + has_code_block×0.10 + has_assert×0.08 + has_test_def×0.07 + has_import×0.05
+  - debug: has_def×0.30 + has_return×0.30 + has_code_block×0.25 + has_assert×0.15
+  - refactor: has_def×0.25 + has_return×0.25 + has_code_block×0.20 + has_type_hint×0.15 + has_import×0.15
+- Category: override dict → quality delta (coding_avg - general_avg >= 0.1) → name pattern (coder/codestral/codellama/starcoder) → general
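
The composite formula in this section can be sketched in Python as follows. This is a minimal sketch, not the benchmark's actual code: the function names are hypothetical, and the latency scoring (`1 - latency_ms / 5000`, clamped to the range 0 to 1) is an assumption, because the report does not define `latency_score`. The 5000 ms ceiling is inferred only from the fact that it reproduces the composites listed under Detailed Metrics to within about 0.001 (the published inputs are themselves rounded).

```python
SPEED_CEILING_TOK_S = 40.0     # stated in the report: hardware-observed max
LATENCY_CEILING_MS = 5000.0    # ASSUMPTION: inferred from the reported numbers


def speed_score(tokens_per_sec: float) -> float:
    """Normalize throughput against the 40 tok/sec ceiling, clamped to [0, 1]."""
    return max(0.0, min(1.0, tokens_per_sec / SPEED_CEILING_TOK_S))


def latency_score(latency_ms: float) -> float:
    """ASSUMED linear falloff: 1.0 at 0 ms, 0.0 at or beyond the ceiling."""
    return max(0.0, min(1.0, 1.0 - latency_ms / LATENCY_CEILING_MS))


def composite(quality: float, tokens_per_sec: float, latency_ms: float) -> float:
    """Weighted sum using the 0.45 / 0.30 / 0.25 weights from the report."""
    raw = (
        quality * 0.45
        + speed_score(tokens_per_sec) * 0.30
        + latency_score(latency_ms) * 0.25
    )
    return round(raw, 3)


if __name__ == "__main__":
    # llama3.2:3b, general quality 0.954, 22.7 tok/sec, 613.5 ms
    print(composite(0.954, 22.7, 613.5))   # 0.819
    # gemma3:12b-it-q4_K_M, general quality 0.966, 6.6 tok/sec, 5701.3 ms
    print(composite(0.966, 6.6, 5701.3))   # 0.484
```

Under this assumption, any latency at or above 5000 ms contributes nothing to the composite, which would explain why gemma3:12b-it-q4_K_M ranks low despite having the highest general quality.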
+
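
The per-prompt coding quality weights can be read as a weighted checklist: each `has_*` feature that is present adds its weight to the score. The sketch below illustrates that reading. The weights are copied from the Scoring Formula list, but the `DETECTORS` table is purely hypothetical: the report does not show how each feature is detected, so the regular expressions here are illustrative guesses only.

```python
import re

# Weights copied from the Scoring Formula list; each set sums to 1.0.
WEIGHTS = {
    "code_gen": {
        "has_def": 0.20, "has_return": 0.20, "has_docstring": 0.15,
        "has_type_hint": 0.15, "has_code_block": 0.10, "has_assert": 0.08,
        "has_test_def": 0.07, "has_import": 0.05,
    },
    "debug": {
        "has_def": 0.30, "has_return": 0.30,
        "has_code_block": 0.25, "has_assert": 0.15,
    },
    "refactor": {
        "has_def": 0.25, "has_return": 0.25, "has_code_block": 0.20,
        "has_type_hint": 0.15, "has_import": 0.15,
    },
}

FENCE = "`" * 3  # Markdown code fence, built at runtime

# HYPOTHETICAL detectors: the real feature checks are not in the report.
DETECTORS = {
    "has_def": lambda t: re.search(r"^\s*def \w+\(", t, re.M) is not None,
    "has_return": lambda t: re.search(r"\breturn\b", t) is not None,
    "has_docstring": lambda t: '"""' in t or "'''" in t,
    "has_type_hint": lambda t: "->" in t,
    "has_code_block": lambda t: FENCE in t,
    "has_assert": lambda t: re.search(r"\bassert\b", t) is not None,
    "has_test_def": lambda t: re.search(r"^\s*def test_\w*\(", t, re.M) is not None,
    "has_import": lambda t: re.search(r"^\s*(import|from) \w+", t, re.M) is not None,
}


def coding_quality(prompt_type: str, response: str) -> float:
    """Sum the weights of every feature detected in the model response."""
    weights = WEIGHTS[prompt_type]
    total = sum(w for name, w in weights.items() if DETECTORS[name](response))
    return round(total, 3)


if __name__ == "__main__":
    sample = FENCE + "python\ndef add(a: int, b: int) -> int:\n    return a + b\n" + FENCE
    print(coding_quality("debug", sample))     # 0.85
    print(coding_quality("code_gen", sample))  # 0.65
```

The same response scores differently per prompt type because each rubric weights the features differently: the sample lacks an assert, docstring, test, and import, which costs more under `code_gen` than under `debug`.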
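
The category rule reads as a fallback chain, tried in order until one step decides. A minimal sketch under stated assumptions: the function and parameter names are hypothetical, the override dict is assumed to map a model name directly to a category, the quality-delta step is assumed to yield `"coding"` when it fires, and name matching is assumed to be a case-insensitive substring test.

```python
from typing import Dict, Optional

# Name fragments listed in the Scoring Formula section.
CODING_NAME_PATTERNS = ("coder", "codestral", "codellama", "starcoder")
QUALITY_DELTA_THRESHOLD = 0.1


def categorize(
    model: str,
    coding_avg: float,
    general_avg: float,
    overrides: Optional[Dict[str, str]] = None,
) -> str:
    """Resolve a model's category: override, quality delta, name pattern, general."""
    # 1. An explicit override wins outright (ASSUMED shape: model name -> category).
    if overrides and model in overrides:
        return overrides[model]
    # 2. Clearly stronger coding quality (ASSUMED to mean "coding").
    if coding_avg - general_avg >= QUALITY_DELTA_THRESHOLD:
        return "coding"
    # 3. Fall back to the model name (ASSUMED case-insensitive substring match).
    if any(pattern in model.lower() for pattern in CODING_NAME_PATTERNS):
        return "coding"
    # 4. Default.
    return "general"


if __name__ == "__main__":
    print(categorize("deepseek-coder-v2:16b", 0.783, 0.885))  # coding
    print(categorize("llama3.2:3b", 0.85, 0.954))             # general
```

With the numbers in this report, no model clears the 0.1 quality-delta threshold, so under these assumptions every "coding" label above would come from the name-pattern step (or from an override).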