Coding

Best models for code generation and debugging.

Updated July 4, 2026

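| # | Model | Provider | Badges | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|-------|----------|--------|-------|----------|---------|------------|-------------|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 93.3 | 59.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Sonnet 5 | anthropic | New, In-House Pick | 93 | 53.4 | 1M | $2.00 | $10.00 |
| 3 | Anthropic: Claude Opus 4.8 | anthropic | Top Pick, In-House Pick | 92.3 | 55.7 | 1M | $5.00 | $25.00 |
| 4 | Google: Gemini 3.1 Pro Preview | google | | 89.8 | 46.5 | 1M | $2.00 | $12.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 89.4 | 53.5 | 1M | $5.00 | $25.00 |
| 6 | Google: Gemini 3.5 Flash | google | | 88.9 | 50.2 | 1M | $1.50 | $9.00 |
| 7 | OpenAI: GPT-5.5 | openai | Top Pick | 88.3 | 54.8 | 1.1M | $5.00 | $30.00 |
| 8 | OpenAI: GPT-5.4 | openai | | 87 | 51.4 | 1.1M | $2.50 | $15.00 |
| 9 | Z.ai: GLM 5.2 | z-ai | New | 86.1 | 51.1 | 1M | $0.7700 | $2.42 |
| 10 | Anthropic: Claude Sonnet 4.6 | anthropic | | 85.1 | 47.2 | 1M | $3.00 | $15.00 |
| 11 | Qwen: Qwen3.7 Max | qwen | | 83.5 | 46 | 1M | $1.25 | $3.75 |
| 12 | Anthropic: Claude Opus 4.6 | anthropic | | 83 | 37.8 | 1M | $5.00 | $25.00 |
| 13 | DeepSeek: DeepSeek V4 Pro | deepseek | | 82.1 | 44.3 | 1M | $0.4350 | $0.8700 |
| 14 | xAI: Grok 4 | x-ai | | 82 | 33.3 | 256K | $3.00 | $15.00 |
| 15 | Inception: Mercury 2 | inception | | 82 | 25.3 | 128K | $0.2500 | $0.7500 |
| 16 | DeepSeek R1 Distill Qwen 14B | DeepSeek | | 82 | 9.8 | | Free | Free |
| 17 | MoonshotAI: Kimi K2 0905 | moonshotai | | 82 | 23.5 | 262K | $0.6000 | $2.50 |
| 18 | Qwen: Qwen3 32B | qwen | | 82 | 11.5 | 131K | $0.0800 | $0.2800 |
| 19 | Xiaomi: MiMo-V2.5-Pro | xiaomi | | 82 | 42.2 | 1M | $0.4350 | $0.8700 |
| 20 | Mistral: Mistral Medium 3.5 | mistralai | | 82 | 29.9 | 262K | $1.50 | $7.50 |
| 21 | Z.ai: GLM 4.5 | z-ai | | 82 | 19.5 | 131K | $0.6000 | $2.20 |
| 22 | OpenAI: o3 Mini High | openai | | 82 | 15.6 | 200K | $1.10 | $4.40 |
| 23 | Hermes 4 – Llama-3.1 405B (Reasoning) | Nous Research | | 82 | 9 | | $1.00 | $3.00 |
| 24 | MoonshotAI: Kimi K2.7 Code | moonshotai | New | 82 | 41.9 | 262K | $0.7400 | $3.50 |
| 25 | Qwen: Qwen3 Max | qwen | | 82 | 24 | 262K | $0.7800 | $3.90 |
| 26 | TNG: DeepSeek R1T2 Chimera | tngtech | | 82 | 27 | 164K | $0.3000 | $1.10 |
| 27 | DeepSeek-Coder-V2 | DeepSeek | | 82 | 5.1 | | Free | Free |
| 28 | OpenAI: GPT-5.2 | openai | | 82 | 26 | 400K | $1.75 | $14.00 |
| 29 | Qwen: Qwen3 30B A3B Thinking 2507 | qwen | | 82 | 14.4 | 131K | $0.1300 | $1.56 |
| 30 | Qwen: Qwen3 235B A22B | qwen | | 82 | 13.4 | 131K | $0.4550 | $1.82 |
| 31 | OpenAI: GPT-5.1-Codex | openai | | 82 | 34.7 | 400K | $1.25 | $10.00 |
| 32 | Z.ai: GLM 4.5 Air | z-ai | | 82 | 16.5 | 131K | $0.1300 | $0.8500 |
| 33 | OpenAI: o3 Mini | openai | | 82 | 19 | 200K | $1.10 | $4.40 |
| 34 | Hermes 4 – Llama-3.1 70B (Reasoning) | Nous Research | | 82 | 10 | | $0.1300 | $0.4000 |
| 35 | Qwen: Qwen3.5 397B A17B | qwen | | 82 | 33.7 | 256K | $0.3850 | $2.45 |
| 36 | Google: Gemini 2.5 Pro | google | | 82 | 25.8 | 1M | $1.25 | $10.00 |
| 37 | OpenAI: GPT-5.4 Nano | openai | | 82 | 38.2 | 400K | $0.2000 | $1.25 |
| 38 | DeepSeek R1 Distill Llama 8B | DeepSeek | | 82 | 6.4 | | Free | Free |
| 39 | Mistral: Devstral 2 2512 | mistralai | | 82 | 19.2 | 262K | $0.4000 | $2.00 |
| 40 | xAI: Grok Code Fast 1 | x-ai | | 82 | 21.6 | 256K | $0.2000 | $1.50 |
| 41 | OpenAI: o3 | openai | | 82 | 30.4 | 200K | $2.00 | $8.00 |
| 42 | DeepSeek: DeepSeek V4 Flash | deepseek | Best Value | 82 | 40.3 | 1M | $0.0900 | $0.1800 |
| 43 | inclusionAI: Ring-2.6-1T | inclusionai | | 82 | 30.6 | 262K | $0.0750 | $0.6250 |
| 44 | OpenAI: GPT-5.1-Codex-Mini | openai | | 82 | 30.6 | 400K | $0.2500 | $2.00 |
| 45 | DeepSeek: R1 | deepseek | | 82 | 18.5 | 164K | $0.7000 | $2.50 |
| 46 | ERNIE 5.0 Thinking Preview | Baidu | | 82 | 21.9 | | Free | Free |
| 47 | Nex AGI: Nex-N2-Pro | nex-agi | New, Best for Agents | 82 | 41 | 262K | $0.2500 | $1.00 |
| 48 | Z.ai: GLM 5 | z-ai | | 82 | 39.5 | 203K | $0.6000 | $1.92 |
| 49 | OpenAI: GPT-5 Codex | openai | | 82 | 36.1 | 400K | $1.25 | $10.00 |
| 50 | xAI: Grok 3 Mini | x-ai | | 82 | 22.5 | 131K | $0.3000 | $0.5000 |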

How we rank AI models

The Design for Online AI Model Leaderboard scores 592 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
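
As a rough illustration of that calculation, the sketch below combines four 0–10 grades into a 0–100 score using a weighted mean. The weights, the dimension keys and the function name `overall_score` are assumptions made for this example only; the actual weights are not published here, and the editorial review described above may adjust the final figure.

```python
# Hypothetical weights (percent). The real weighting is not stated in the source.
WEIGHTS = {
    "intelligence": 35,
    "technical": 35,
    "content": 15,
    "value": 15,
}


def overall_score(grades, weights=WEIGHTS):
    """Combine 0-10 dimension grades into a 0-100 score via a weighted mean."""
    if set(grades) != set(weights):
        raise ValueError("grades and weights must cover the same dimensions")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade {grade} is outside the 0-10 range")
    total_weight = sum(weights.values())
    weighted_sum = sum(grades[name] * weights[name] for name in weights)
    # Multiply by 10 to rescale the 0-10 weighted mean onto the 0-100 scale.
    return round(weighted_sum * 10 / total_weight, 1)


# Made-up grades for an imaginary model, not taken from the table.
example = {"intelligence": 9, "technical": 9, "content": 8, "value": 6}
print(overall_score(example))  # 84.0
```

Because the result is a weighted mean, a model that is strong on capability but expensive can still rank below a slightly weaker, much cheaper one if the value dimension carries enough weight.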

Where does the data come from?

From the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.