Benchmark Results - 20260307T125148

Model Selection

| Slot | Role | Model | Composite Score |
|------|------|-------|-----------------|
| 1 | General (Primary) | deepseek-coder-v2:16b-lite-instruct-q4_K_M | 0.683 |
| 2 | General (Secondary) | qwen2.5-coder:7b-instruct-q4_K_M | 0.619 |
| 3 | Coding (Primary) | deepseek-coder-v2:16b-lite-instruct-q4_K_M | 0.618 |
| 4 | Coding (Secondary) | none | N/A |

Detailed Metrics

gpt-oss:20b

  • Category: general
  • Coding Quality: 0.978
  • General Quality: 0.925
  • Avg Tokens/sec: 10.3
  • Latency (ms): 8158.0
  • Coding Composite: 0.471
  • General Composite: 0.447

deepseek-r1:14b

  • Category: general
  • Coding Quality: 0.853
  • General Quality: 0.948
  • Avg Tokens/sec: 6.4
  • Latency (ms): 2677.7
  • Coding Composite: 0.519
  • General Composite: 0.562

phi4:14b

  • Category: general
  • Coding Quality: 0.904
  • General Quality: 0.931
  • Avg Tokens/sec: 6.6
  • Latency (ms): 4394.9
  • Coding Composite: 0.457
  • General Composite: 0.469

qwen3-coder-next:latest

  • Category: general
  • Coding Quality: 0.785
  • General Quality: 0.892
  • Avg Tokens/sec: 4.6
  • Latency (ms): 3462.7
  • Coding Composite: 0.444
  • General Composite: 0.492

qwen3.5:35b

  • Category: general
  • Coding Quality: 0.879
  • General Quality: 1.0
  • Avg Tokens/sec: 5.3
  • Latency (ms): 133176.0
  • Coding Composite: 0.411
  • General Composite: 0.466

qwen3-coder:30b

  • Category: general
  • Coding Quality: 0.885
  • General Quality: 0.872
  • Avg Tokens/sec: 7.9
  • Latency (ms): 1769.0
  • Coding Composite: 0.584
  • General Composite: 0.578

qwen2.5-coder:7b-instruct-q4_K_M

  • Category: general
  • Coding Quality: 0.83
  • General Quality: 0.887
  • Avg Tokens/sec: 11.5
  • Latency (ms): 1301.7
  • Coding Composite: 0.593
  • General Composite: 0.619

qwen2.5-coder:7b-instruct-q5_K_M

  • Category: general
  • Coding Quality: 0.81
  • General Quality: 0.925
  • Avg Tokens/sec: 9.0
  • Latency (ms): 2900.9
  • Coding Composite: 0.496
  • General Composite: 0.548

qwen2.5-coder:7b-instruct-q6_K

  • Category: general
  • Coding Quality: 0.832
  • General Quality: 0.919
  • Avg Tokens/sec: 5.9
  • Latency (ms): 2112.8
  • Coding Composite: 0.536
  • General Composite: 0.576

deepseek-coder-v2:16b-lite-instruct-q4_K_M

  • Category: general
  • Coding Quality: 0.855
  • General Quality: 1.0
  • Avg Tokens/sec: 21.3
  • Latency (ms): 1617.0
  • Coding Composite: 0.618
  • General Composite: 0.683

qwen2.5-coder:14b-instruct-q4_K_M

  • Category: general
  • Coding Quality: 0.84
  • General Quality: 0.848
  • Avg Tokens/sec: 4.9
  • Latency (ms): 6865.3
  • Coding Composite: 0.393
  • General Composite: 0.396

codellama:13b-instruct-q5_K_M

  • Category: general
  • Coding Quality: 0.804
  • General Quality: 0.671
  • Avg Tokens/sec: 4.1
  • Latency (ms): 1126.4
  • Coding Composite: 0.568
  • General Composite: 0.508

codestral:22b-v0.1-q4_K_M

  • Category: general
  • Coding Quality: 0.696
  • General Quality: 0.887
  • Avg Tokens/sec: 2.3
  • Latency (ms): 58429.3
  • Coding Composite: 0.32
  • General Composite: 0.406

dolphin-mixtral:8x7b

  • Category: general
  • Coding Quality: 0.755
  • General Quality: 0.725
  • Avg Tokens/sec: 4.8
  • Latency (ms): 3065.7
  • Coding Composite: 0.451
  • General Composite: 0.437

mistral:7b-instruct

  • Category: general
  • Coding Quality: 0.846
  • General Quality: 0.717
  • Avg Tokens/sec: 12.1
  • Latency (ms): 6696.2
  • Coding Composite: 0.417
  • General Composite: 0.359

Scoring Formula

  • Composite = quality * 0.45 + token_speed_normalized * 0.30 + latency_score * 0.25
  • Category: coding if (coding_composite - general_composite) >= 0.15, else general