German Artificial Analytics
a PeerBench project

Admin · Bradley-Terry ★ aggregation internal

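Scores are Bradley-Terry maximum-likelihood estimates (Hunter's MM algorithm, weak prior) fit over per-benchmark pairwise wins. The fit is a single batch fit, so it does not depend on model order or random seed. CI: 300 bootstrap resamples over benchmarks.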
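A minimal sketch of this kind of fit, assuming a wins matrix where `wins[i, j]` counts how often model i beat model j. The weak prior here is one common choice (each model gets `alpha` pseudo-wins and `alpha` pseudo-losses against a fixed reference of strength 1); the page does not say which prior it uses, how strengths map to the 0–100 Score, or what CI level it reports, so `alpha`, the 95% percentile interval, and the function names are all assumptions for illustration.

```python
import numpy as np

def fit_bradley_terry(wins, alpha=0.1, iters=500, tol=1e-10):
    """Hunter-style MM fit of Bradley-Terry strengths.

    wins[i, j] = number of times model i beat model j (diagonal is zero).
    alpha is an assumed weak prior: alpha pseudo-wins and alpha pseudo-losses
    per model against a virtual opponent whose strength is fixed at 1.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins + wins.T                      # comparisons per pair
    w = wins.sum(axis=1) + alpha           # real wins plus prior pseudo-win
    p = np.ones(len(wins))                 # deterministic start: no seed needed
    for _ in range(iters):
        pair_sum = p[:, None] + p[None, :]
        denom = (n / pair_sum).sum(axis=1) + 2.0 * alpha / (p + 1.0)
        p_new = w / denom                  # MM update for the penalized likelihood
        converged = np.max(np.abs(np.log(p_new) - np.log(p))) < tol
        p = p_new
        if converged:
            break
    return p

def bootstrap_ci(per_benchmark_wins, n_boot=300, seed=0, alpha=0.1):
    """Percentile CI on log-strengths by resampling whole benchmarks.

    per_benchmark_wins has shape (benchmarks, models, models).
    The 2.5/97.5 percentiles are an assumed interval level.
    """
    stack = np.asarray(per_benchmark_wins, dtype=float)
    rng = np.random.default_rng(seed)
    n_bench = stack.shape[0]
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_bench, size=n_bench)   # sample with replacement
        pooled = stack[idx].sum(axis=0)                # pool wins across the resample
        samples.append(np.log(fit_bradley_terry(pooled, alpha)))
    lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)
    return lo, hi
```

Because the update starts from a fixed point and treats every model symmetrically, relabeling or reordering the models only permutes the fitted strengths, which is the sense in which a batch fit is order- and seed-independent. Only the bootstrap step draws random numbers.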

| # | Model | Benchmarks | Vendor | License | Score | CI | Price ($/Mtok) | Price tier | Speed (tok/s) | English |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | 5/7 | Google | Closed | 80.3 | 78–89 | $6.75 | $$$ | 127 | 57 |
| 2 | Gemini 3.5 Flash | 7/7 | Google | Closed | 68.4 | 67–75 | $2.81 | $$ | 202.5 | 55 |
| 3 | Qwen3.7 Max | 4/7 | Alibaba | Closed | 57.7 | 51–67 | $1.28 | $$ | 170.3 | 57 |
| 4 | Gemini 3.1 Flash-Lite | 7/7 | Google | Closed | 56.4 | 52–65 | $0.44 | $ | 302.3 | 34 |
| 5 | DeepSeek V4 Pro | 4/7 | DeepSeek | Open weights | 54.1 | 50–59 | $1.62 | $$ | 57.8 | 52 |
| 6 | Gemma 4 31B | 7/7 | Google | Open weights | 52.4 | 49–57 | $0.17 | $ | 38.4 | 39 |
| 7 | DeepSeek V4 Flash | 7/7 | DeepSeek | Open weights | 52.1 | 48–58 | $0.14 | $ | 102.5 | 47 |
| 8 | MiMo V2.5 Pro | 5/7 | Xiaomi | Open weights | 51.7 | 46–61 | $0.85 | $$ | 49.2 | 54 |
| 9 | Gemini 2.5 Flash | 7/7 | Google | Closed | 51.0 | 47–57 | $1.00 | $$ | 216.2 | 31 |
| 10 | Qwen3.6 35B-A3B | 6/7 | Alibaba | Open weights | 48.7 | 46–53 | $0.30 | $ | 141.6 | 44 |
| 11 | Claude Haiku 4.5 | 3/7 | Anthropic | Closed | 46.4 | 44–49 | $3.36 | $$$ | 140.6 | 37 |
| 12 | Gemma 4 26B A4B | 7/7 | Google | Open weights | 45.0 | 40–51 | $0.23 | $ | 84.4 | 31 |
| 13 | Gemma 4 12B | 7/7 | Google | Open weights | 44.8 | 40–49 | n/a | n/a | 46.3 | 29 |
| 14 | GLM-5.1 | 5/7 | Z.ai | Open weights | 44.7 | 36–52 | $1.45 | $$ | 71.4 | 51 |
| 15 | Tencent HY3-Preview | 6/7 | Tencent | Open weights | 44.7 | 38–48 | $0.08 | $ | 94.8 | 42 |
| 16 | Qwen3.5 9B | 7/7 | Alibaba | Open weights | 39.1 | 34–42 | $0.12 | $ | 57.7 | 32 |
| 17 | Qwen3 14B | 7/7 | Alibaba | Open weights | 35.4 | 29–39 | $0.14 | $ | 64.7 | 16 |

Showing 17 models that ran at least 3 of 7 benchmarks (9 excluded for thin coverage). Price = median effective $ per 1M tokens; Speed = throughput + latency. The n/7 value next to each model is the number of benchmarks backing its Score. English = English-language intelligence, a background prior that anchors every Score; it is not a German benchmark and never appears in the German tables.