Code Arena 🏆 Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Apr 9, 2026
231,158 votes
60 models
| Rank | Rank Spread | Organization · License | Score | Votes | Price | Context |
|---|---|---|---|---|---|---|
| 1 | 1–3 | Anthropic · Proprietary | 1548 +11/-11 | 4,015 | $5 / $25 | 1M |
| 2 | 1–3 | Anthropic · Proprietary | 1542 +10/-10 | 4,841 | $5 / $25 | 1M |
| 3 | 1–4 | Z.ai · MIT | 1530 +20/-20 | 1,046 | $0.95 / $3.15 | 202.8K |
| 4 | 3–4 | Anthropic · Proprietary | 1521 +9/-9 | 6,979 | $3 / $15 | 1M |
| 5 | 5–5 | Anthropic | 1490 +7/-7 | 13,065 | $5 / $25 | 200K |
| 6 | 6–9 | Anthropic · Proprietary | 1466 +7/-7 | 14,517 | $5 / $25 | 200K |
| 7 | 6–15 | OpenAI · Proprietary | 1457 +17/-17 | 1,485 | $2.50 / $15 | 1.1M |
| 8 | 6–12 | Google · Proprietary | 1456 +9/-9 | 5,819 | $2 / $12 | 1M |
| 9 | 6–15 | Alibaba · Proprietary | 1453 +14/-14 | 2,112 | $0.33 / $1.95 | 1M |
| 10 | 7–17 | Z.ai · MIT | 1439 +10/-10 | 4,878 | $0.39 / $1.75 | 202.8K |
| 11 | 7–17 | Z.ai · MIT | 1439 +10/-10 | 4,731 | $1 / $3.20 | 202.8K |
| 12 | 8–17 | Google · Proprietary | 1438 +7/-7 | 17,157 | $2 / $12 | 1M |
| 13 | 7–17 | OpenAI · Proprietary | 1437 +16/-16 | 1,449 | $2.50 / $15 | 1.1M |
| 14 | 8–17 | Google · Proprietary | 1436 +7/-7 | 13,265 | $0.50 / $3 | 1M |
| 15 | 8–17 | Xiaomi · Proprietary | 1433 +12/-12 | 3,049 | $1 / $3 | 1M |
| 16 | 10–17 | Moonshot · Modified MIT | 1429 +8/-8 | 6,480 | $0.60 / $3 | N/A |
| 17 | 10–20 | MiniMax · Proprietary | 1425 +12/-12 | 2,884 | $0.30 / $1.20 | 204.8K |
| 18 | 17–26 | Moonshot · Modified MIT | 1408 +11/-11 | 3,610 | $0.38 / $1.72 | 262.1K |
| 19 | 17–28 | OpenAI · Proprietary | 1407 +12/-12 | 2,971 | $1.75 / $14 | 400K |
| 20 | 17–31 | OpenAI · Proprietary | 1403 +17/-17 | 1,461 | $1.75 / $14 | 400K |
| 21 | 18–31 | | 1393 +11/-11 | 3,156 | $2 / $6 | 2M |
| 22 | 18–31 | OpenAI · Proprietary | 1393 +13/-13 | 3,755 | $1.25 / $10 | 400K |
| 23 | 18–31 | MiniMax · Modified MIT | 1392 +8/-8 | 7,024 | $0.12 / $0.99 | 196.6K |
| 24 | 18–31 | MiniMax · MIT | 1391 +8/-8 | 9,271 | $0.29 / $0.95 | 196.6K |
| 25 | 18–31 | OpenAI · Proprietary | 1390 +9/-9 | 6,124 | $1.25 / $10 | 400K |
| 26 | 19–31 | | 1390 +7/-7 | 12,511 | $0.50 / $3 | 1M |
| 27 | 20–31 | Anthropic | 1388 +6/-6 | 15,742 | $3 / $15 | 200K |
| 28 | 18–33 | OpenAI · Proprietary | 1388 +15/-15 | 1,651 | $2.50 / $15 | 1.1M |
| 29 | 19–31 | Alibaba · Apache 2.0 | 1386 +9/-9 | 5,824 | $0.39 / $2.34 | 262.1K |
| 30 | 20–31 | Anthropic · Proprietary | 1386 +6/-6 | 18,527 | $3 / $15 | 200K |
| 31 | 20–31 | Anthropic · Proprietary | 1385 +9/-9 | 8,573 | $15 / $75 | 200K |
| 32 | 31–34 | DeepSeek · MIT | 1368 +8/-8 | 7,992 | $0.26 / $0.38 | 163.8K |
| 33 | 31–34 | Alibaba · Apache 2.0 | 1365 +10/-10 | 4,562 | $0.26 / $2.08 | 262.1K |
| 34 | 32–36 | Z.ai · MIT | 1354 +9/-9 | 8,350 | $0.39 / $1.90 | 204.8K |
| 35 | 34–41 | Alibaba · Apache 2.0 | 1344 +10/-10 | 4,206 | $0.20 / $1.56 | 262.1K |
| 36 | 34–41 | OpenAI · Proprietary | 1339 +7/-7 | 12,870 | $1.25 / $10 | 400K |
| 37 | 35–41 | | 1337 +8/-8 | 6,731 | $0.09 / $0.29 | 262.1K |
| 38 | 35–41 | OpenAI · Proprietary | 1335 +8/-8 | 7,763 | $1.75 / $14 | 400K |
| 39 | 35–41 | DeepSeek · MIT | 1330 +7/-7 | 9,859 | $0.26 / $0.38 | 163.8K |
| 40 | 35–41 | Moonshot · Modified MIT | 1329 +6/-6 | 15,484 | $1.15 / $8 | 262.1K |
| 41 | 35–42 | OpenAI · Proprietary | 1329 +9/-9 | 6,227 | $1.25 / $10 | 400K |
| 42 | 41–44 | Anthropic · Proprietary | 1315 +6/-6 | 16,929 | $1 / $5 | 200K |
| 43 | 42–45 | MiniMax · Apache 2.0 | 1304 +9/-9 | 8,401 | $0.26 / $1 | 196.6K |
| 44 | 42–46 | | 1300 +14/-14 | 2,092 | $0.09 / $0.29 | 262.1K |
| 45 | 43–46 | DeepSeek · MIT | 1286 +11/-11 | 4,870 | $0.27 / $0.41 | 163.8K |
| 46 | 44–46 | Alibaba · Apache 2.0 | 1281 +7/-7 | 15,206 | $0.40 / $1.60 | 262.1K |
| 47 | 47–52 | KwaiKAT · Proprietary | 1257 +15/-15 | 1,883 | $0.21 / $0.83 | 256K |
| 48 | 47–53 | Alibaba · Apache 2.0 | 1247 +16/-16 | 1,818 | $0.16 / $1.30 | 262.1K |
| 49 | 47–54 | OpenAI · Proprietary | 1239 +17/-17 | 1,444 | $0.25 / $2 | 400K |
| 50 | 47–53 | Google · Proprietary | 1237 +10/-10 | 5,394 | $0.25 / $1.50 | 1M |
| 51 | 47–54 | Alibaba · Proprietary | 1236 +17/-17 | 1,562 | N/A | N/A |
| 52 | 47–54 | xAI · Proprietary | 1233 +9/-9 | 6,916 | $0.20 / $0.50 | 2M |
| 53 | 48–56 | Mistral · Apache 2.0 | 1222 +20/-20 | 1,032 | $0.50 / $1.50 | N/A |
| 54 | 50–57 | xAI · Proprietary | 1207 +20/-20 | 1,209 | N/A | N/A |
| 55 | 53–56 | Google · Proprietary | 1202 +13/-13 | 3,300 | $1.25 / $10 | 1M |
| 56 | 53–57 | Mistral · Modified MIT | 1197 +17/-17 | 1,577 | N/A | N/A |
| 57 | 55–59 | Inception AI · Proprietary | 1166 +23/-23 | 951 | $0.25 / $0.75 | 128K |
| 58 | 57–59 | xAI · Proprietary | 1148 +23/-23 | 936 | $0.20 / $0.50 | 2M |
| 59 | 57–59 | xAI · Proprietary | 1139 +22/-22 | 984 | $0.20 / $1.50 | 256K |
| 60 | 60–60 | Mistral · Proprietary | 1091 +23/-23 | 993 | $0.40 / $2 | 128K |

Leaderboard Plots

Confidence Intervals on Model Strength (via Bootstrapping)
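
As a rough illustration of what a bootstrapped confidence interval involves, the sketch below resamples battle records with replacement and reads percentiles off the resampled statistic. It is not the leaderboard's actual procedure: the page does not describe its rating model, so this example substitutes a plain win rate as the statistic. The record layout `(model_a, model_b, winner)` with `"tie"` as a possible winner, and the names `win_rate` and `bootstrap_ci`, are assumptions made for the example.

```python
import math
import random

def win_rate(battles, model):
    """Fraction of non-tied battles involving `model` that it won.

    Each battle is an assumed (model_a, model_b, winner) tuple, where
    winner is one of the two model names or the string "tie".
    """
    games = [b for b in battles if model in (b[0], b[1]) and b[2] != "tie"]
    if not games:
        return float("nan")
    wins = sum(1 for _, _, winner in games if winner == model)
    return wins / len(games)

def bootstrap_ci(battles, model, rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `model`'s win rate."""
    if not battles:
        raise ValueError("need at least one battle")
    rng = random.Random(seed)
    n = len(battles)
    stats = []
    for _ in range(rounds):
        # Resample the whole battle log with replacement.
        sample = [battles[rng.randrange(n)] for _ in range(n)]
        value = win_rate(sample, model)
        if not math.isnan(value):
            stats.append(value)
    if not stats:
        raise ValueError("model has no non-tied battles")
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi

# Hypothetical data: model "a" wins 30 of 40 non-tied battles against "b".
example = [("a", "b", "a")] * 30 + [("a", "b", "b")] * 10
print(bootstrap_ci(example, "a"))
```

Models with fewer votes produce wider intervals under this kind of resampling, which is consistent with the larger +/- values shown for low-vote entries in the table.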

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
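
The two win-rate plots named above can be approximated from a raw battle log. The sketch below is one plausible reading, not the site's confirmed method: it computes the fraction of non-tied A vs. B battles won by A, then averages each model's fractions over its opponents with equal weight, which is how "uniform sampling" is interpreted here. The `(model_a, model_b, winner)` record layout and both function names are assumptions.

```python
from collections import defaultdict

def pairwise_win_fraction(battles):
    """Map (a, b) -> fraction of non-tied a-vs-b battles won by a."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for a, b, winner in battles:
        if winner not in (a, b):
            continue  # skip ties (or any unrecognized outcome)
        loser = b if winner == a else a
        wins[(winner, loser)] += 1
        # Count the battle for both orderings of the pair.
        totals[(a, b)] += 1
        totals[(b, a)] += 1
    return {pair: wins[pair] / n for pair, n in totals.items()}

def average_win_rate(fractions):
    """Mean pairwise win fraction per model, each opponent weighted equally."""
    per_model = defaultdict(list)
    for (a, _), fraction in fractions.items():
        per_model[a].append(fraction)
    return {m: sum(fs) / len(fs) for m, fs in per_model.items()}

# Hypothetical battle log with three models and one tie.
log = [
    ("x", "y", "x"),
    ("x", "y", "x"),
    ("x", "y", "y"),
    ("x", "z", "tie"),
    ("y", "z", "z"),
]
fractions = pairwise_win_fraction(log)
print(fractions)
print(average_win_rate(fractions))
```

Weighting opponents equally keeps a model's average from being dominated by whichever pairing happened to be sampled most often in the battle log.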