German Artificial Analytics
a PeerBench project

Admin · MICE (robust-z) aggregation internal

Multiple imputation by chained equations (MICE, m=40) on per-benchmark robust z-scores; the composite is pooled with Rubin's rules. Score: 50 = average model.
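As a rough illustration of the two building blocks named above, the sketch below computes robust z-scores and pools per-imputation estimates with Rubin's rules. It is not this project's pipeline: the function names `robust_z` and `pool_rubin` are hypothetical, the median/MAD definition of robust z (with the usual 1.4826 consistency constant) is an assumption, and the chained-equations imputation step and the mapping from z to the 50-centered Score are not modeled.

```python
import statistics


def robust_z(values):
    """Robust z-scores: center on the median, scale by 1.4826 * MAD.

    Assumption: this is the common median/MAD definition; the page does not
    state which robust-z variant it uses.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; robust z is undefined for this input")
    return [(v - med) / scale for v in values]


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching within-imputation variances
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

With m=40 imputations, `pool_rubin` would receive 40 composite estimates for a model and their variances; the total variance grows with the between-imputation spread, which is consistent with sparsely covered models showing wider ranges.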

| # | Model | Benchmarks | Vendor | License | Score | Range | Price ($/Mtok) | Price tier | Speed (tok/s) | English |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | 5/7 | Google | Closed | 69.3 | 60–79 | $6.75 | $$$ | 127 | 57 |
| 2 | Gemini 3.5 Flash | 7/7 | Google | Closed | 65.4 | 63–68 | $2.81 | $$ | 202.5 | 55 |
| 3 | Qwen3.7 Max | 4/7 | Alibaba | Closed | 60.7 | 51–71 | $1.28 | $$ | 170.3 | 57 |
| 4 | Gemini 3.1 Flash-Lite | 7/7 | Google | Closed | 59.4 | 57–62 | $0.44 | $ | 302.3 | 34 |
| 5 | Gemma 4 31B | 7/7 | Google | Open weights | 57.9 | 56–60 | $0.17 | $ | 38.4 | 39 |
| 6 | Gemini 2.5 Flash | 7/7 | Google | Closed | 57.6 | 55–60 | $1.00 | $$ | 216.2 | 31 |
| 7 | Qwen3.6 35B-A3B | 6/7 | Alibaba | Open weights | 57.1 | 51–64 | $0.30 | $ | 141.6 | 44 |
| 8 | DeepSeek V4 Pro | 4/7 | DeepSeek | Open weights | 56.7 | 47–66 | $1.62 | $$ | 57.8 | 52 |
| 9 | MiMo V2.5 Pro | 5/7 | Xiaomi | Open weights | 55.6 | 47–65 | $0.85 | $$ | 49.2 | 54 |
| 10 | Claude Haiku 4.5 | 3/7 | Anthropic | Closed | 53.5 | 38–69 | $3.36 | $$$ | 140.6 | 37 |
| 11 | Gemma 4 12B | 7/7 | Google | Open weights | 51.5 | 49–54 | | | 46.3 | 29 |
| 12 | DeepSeek V4 Flash | 7/7 | DeepSeek | Open weights | 51.2 | 39–64 | $0.14 | $ | 102.5 | 47 |
| 13 | Tencent HY3-Preview | 6/7 | Tencent | Open weights | 49.1 | 34–64 | $0.08 | $ | 94.8 | 42 |
| 14 | Qwen3.5 9B | 7/7 | Alibaba | Open weights | 45.3 | 40–51 | $0.12 | $ | 57.7 | 32 |
| 15 | Gemma 4 26B A4B | 7/7 | Google | Open weights | 41.1 | 16–67 | $0.23 | $ | 84.4 | 31 |
| 16 | Qwen3 14B | 7/7 | Alibaba | Open weights | 36.2 | 26–47 | $0.14 | $ | 64.7 | 16 |
| 17 | GLM-5.1 | 5/7 | Z.ai | Open weights | 17.0 | 0–52 | $1.45 | $$ | 71.4 | 51 |

Showing 17 models that ran at least 3 of 7 benchmarks (9 excluded for thin coverage). Price = median effective price in $ per 1M tokens; Speed = throughput + latency. The n/7 figure next to each model shows how many of the 7 benchmarks back its score. English = English-language intelligence, a background prior that anchors every Score; it is not a German benchmark and never appears in the German tables.