Coding

Best models for code generation and debugging.

Updated June 12, 2026

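| # | Model | Provider | Badges | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 94.7 | 64.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Opus 4.8 | anthropic | New, Top Pick, In-House Pick | 92.4 | 61.4 | 1M | $5.00 | $25.00 |
| 3 | Google: Gemini 3.1 Pro Preview | google | Best for Agents | 91.7 | 57.2 | 1M | $2.00 | $12.00 |
| 4 | OpenAI: GPT-5.5 | openai | Top Pick | 88.8 | 60.2 | 1.1M | $5.00 | $30.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 88.3 | 57.3 | 1M | $5.00 | $25.00 |
| 6 | Anthropic: Claude Sonnet 4.6 | anthropic | In-House Pick | 84.4 | 44.4 | 1M | $3.00 | $15.00 |
| 7 | Qwen: Qwen3.7 Max | qwen | New | 83.9 | 56.6 | 1M | $1.25 | $3.75 |
| 8 | OpenAI: GPT-5.3-Codex | openai | | 83.5 | 53.6 | 400K | $1.75 | $14.00 |
| 9 | Google: Gemini 3 Flash Preview | google | | 82.5 | 35 | 1M | $0.5000 | $3.00 |
| 10 | OpenAI: GPT-5 Codex | openai | | 82 | 44.6 | 400K | $1.25 | $10.00 |
| 11 | Anthropic: Claude 3.7 Sonnet (thinking) | anthropic | | 82 | 34.7 | 200K | $3.00 | $15.00 |
| 12 | ERNIE 5.0 Thinking Preview | Baidu | | 82 | 29.1 | | Free | Free |
| 13 | Qwen: Qwen3 235B A22B Instruct 2507 | qwen | | 82 | 25 | 262K | $0.0900 | $0.1000 |
| 14 | OpenAI: GPT-5.4 Mini | openai | | 82 | 48.9 | 400K | $0.7500 | $4.50 |
| 15 | Grok 4.20 0309 (Reasoning) | xAI | | 82 | 48.5 | | $2.00 | $6.00 |
| 16 | Google: Gemini 2.5 Pro Preview 05-06 | google | | 82 | 29.5 | 1M | $1.25 | $10.00 |
| 17 | GPT-5.5 (medium) | OpenAI | Best for Agents | 82 | 56.7 | | $5.00 | $30.00 |
| 18 | North Mini Code | Cohere | New | 82 | 27.6 | | Free | Free |
| 19 | DeepSeek: DeepSeek V3.1 Terminus | deepseek | | 82 | 33.9 | 164K | $0.2700 | $0.9500 |
| 20 | OpenAI: o3 Mini | openai | | 82 | 25.9 | 200K | $1.10 | $4.40 |
| 21 | Qwen: Qwen3.6 35B A3B | qwen | | 82 | 43.5 | 262K | $0.1500 | $1.00 |
| 22 | MoonshotAI: Kimi K2 0711 | moonshotai | | 82 | 26.3 | 131K | $0.5700 | $2.30 |
| 23 | NVIDIA: Nemotron 3 Super | nvidia | | 82 | 36 | 1M | $0.0900 | $0.4500 |
| 24 | Qwen3 Coder 480B A35B Instruct | Alibaba | | 82 | 24.8 | | $1.50 | $7.50 |
| 25 | OpenAI: GPT-5.1 | openai | | 82 | 47.7 | 400K | $1.25 | $10.00 |
| 26 | GPT-5.5 (low) | OpenAI | | 82 | 50.8 | | $5.00 | $30.00 |
| 27 | DeepSeek: R1 | deepseek | | 82 | 18.8 | 164K | $0.7000 | $2.50 |
| 28 | Qwen: Qwen3.6 Max Preview | qwen | | 82 | 51.8 | 262K | $1.04 | $6.24 |
| 29 | OpenAI: GPT-5.2 | openai | | 82 | 51.3 | 400K | $1.75 | $14.00 |
| 30 | Mistral: Devstral Medium | mistralai | | 82 | 18.7 | 131K | $0.4000 | $2.00 |
| 31 | Kwaipilot: KAT-Coder-Pro V2 | kwaipilot | | 82 | 43.8 | 256K | $0.3000 | $1.20 |
| 32 | Mistral: Mistral Medium 3.5 | mistralai | | 82 | 39.2 | 262K | $1.50 | $7.50 |
| 33 | OpenAI: GPT-5.1-Codex | openai | | 82 | 43.1 | 400K | $1.25 | $10.00 |
| 34 | DeepSeek: DeepSeek V3.1 | deepseek | | 82 | 27.7 | 164K | $0.2100 | $0.7900 |
| 35 | Microsoft: Phi 4 | microsoft | | 82 | 10.4 | 16K | $0.0650 | $0.1400 |
| 36 | o1-preview | OpenAI | | 82 | 23.7 | | $16.50 | $66.00 |
| 37 | Mistral: Devstral 2 2512 | mistralai | | 82 | 22 | 262K | $0.4000 | $2.00 |
| 38 | Mistral: Devstral Small 1.1 | mistralai | | 82 | 15.2 | 131K | $0.1000 | $0.3000 |
| 39 | Google: Gemma 4 31B | google | | 82 | 32.3 | 262K | $0.1200 | $0.3500 |
| 40 | OpenAI: GPT-5.1-Codex-Mini | openai | | 82 | 38.6 | 400K | $0.2500 | $2.00 |
| 41 | Qwen: Qwen3 30B A3B | qwen | | 82 | 15.3 | 131K | $0.1200 | $0.5000 |
| 42 | GPT-5.5 (high) | OpenAI | Best for Coding | 82 | 58.9 | | $5.00 | $30.00 |
| 43 | OpenAI: gpt-oss-120b | openai | | 82 | 33.3 | 131K | $0.0390 | $0.1800 |
| 44 | DeepSeek: DeepSeek V3 | deepseek | | 82 | 16.5 | 131K | $0.2002 | $0.8001 |
| 45 | o1-mini | OpenAI | | 82 | 20.4 | | Free | Free |
| 46 | Nex AGI: DeepSeek V3.1 Nex N1 | nex-agi | | 82 | 28.1 | 131K | $0.1350 | $0.5000 |
| 47 | xAI: Grok 4 | x-ai | | 82 | 41.5 | 256K | $3.00 | $15.00 |
| 48 | Qwen: Qwen3.6 Plus | qwen | | 82 | 50 | 1M | $0.3250 | $1.95 |
| 49 | Google: Gemini 3.5 Flash | google | New | 82 | 54.8 | 1M | $1.50 | $9.00 |
| 50 | Kwaipilot: KAT-Coder-Pro V1 | kwaipilot | | 82 | 36 | 256K | $0.2070 | $0.8280 |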

How we rank AI models

The Design for Online AI Model Leaderboard scores 578 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).
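The page does not say how the value dimension ("capability per dollar") is computed. As a rough illustration only, the sketch below divides a capability figure by a blended price per 1M tokens. The 3:1 input-to-output blend, the use of the AI Index as the capability figure, and the function names `blended_price` and `capability_per_dollar` are all assumptions made for this example, not the leaderboard's method.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_share: float = 0.75) -> float:
    """Blend input and output prices (USD per 1M tokens) into one figure.

    input_share=0.75 is an ASSUMED 3:1 input-to-output token mix.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_per_m + (1.0 - input_share) * output_per_m


def capability_per_dollar(capability: float, input_per_m: float,
                          output_per_m: float) -> float:
    """Hypothetical value metric: capability points per blended dollar."""
    price = blended_price(input_per_m, output_per_m)
    if price <= 0:
        # Free models have no meaningful ratio under this definition.
        raise ValueError("price must be positive to compute a ratio")
    return capability / price


# AI Index and per-1M prices taken from two rows of the table.
opus_4_8 = capability_per_dollar(61.4, 5.00, 25.00)
qwen_3_7_max = capability_per_dollar(56.6, 1.25, 3.75)
print(round(opus_4_8, 2))      # 6.14
print(round(qwen_3_7_max, 2))  # 30.19
```

Under these assumptions a cheaper model with a slightly lower index can score far better on value, which is one way a lower-priced entry could sit close to a frontier model in the overall ranking. Free models would need separate handling, since the ratio is undefined at a price of zero.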

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
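The calculation described above can be sketched as a weighted mean of the four 0–10 grades, rescaled to 0–100. The actual weights are not published on this page, so the equal weights below are a placeholder assumption, and `ASSUMED_WEIGHTS` and `overall_score` are hypothetical names used only for illustration.

```python
from typing import Mapping

# ASSUMPTION: the real weights are not stated; equal weights are placeholders.
ASSUMED_WEIGHTS = {
    "intelligence": 0.25,
    "technical_capability": 0.25,
    "content_quality": 0.25,
    "value": 0.25,
}


def overall_score(grades: Mapping[str, float],
                  weights: Mapping[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine 0-10 dimension grades into a 0-100 score (weighted mean x 10)."""
    if set(grades) != set(weights):
        raise ValueError("grades and weights must cover the same dimensions")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade must be between 0 and 10")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalise by the total so weights need not sum to exactly 1.
    weighted_mean = sum(grades[k] * weights[k] for k in weights) / total_weight
    return round(weighted_mean * 10, 1)


example = {
    "intelligence": 9,
    "technical_capability": 10,
    "content_quality": 8,
    "value": 7,
}
print(overall_score(example))  # 85.0
```

With unequal weights, for example a heavier weight on technical capability for the coding view, the same grades would produce a different ranking; the structure of the calculation stays the same.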

Where does the data come from?

From the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.