Coding

Best models for code generation and debugging.

Updated June 17, 2026

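| # | Model | Provider | Tags | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 92.4 | 59.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Opus 4.8 | anthropic | New, Top Pick, In-House Pick | 90.2 | 55.7 | 1M | $5.00 | $25.00 |
| 3 | Google: Gemini 3.1 Pro Preview | google | | 87.7 | 46.5 | 1M | $2.00 | $12.00 |
| 4 | OpenAI: GPT-5.5 | openai | Top Pick | 87.1 | 54.8 | 1.1M | $5.00 | $30.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 86.7 | 53.5 | 1M | $5.00 | $25.00 |
| 6 | OpenAI: GPT-5.4 | openai | | 84.4 | 51.4 | 1.1M | $2.50 | $15.00 |
| 7 | Google: Gemini 3.5 Flash | google | New | 83.5 | 50.2 | 1M | $1.50 | $9.00 |
| 8 | Anthropic: Claude Sonnet 4.6 | anthropic | In-House Pick | 82.4 | 47.2 | 1M | $3.00 | $15.00 |
| 9 | OpenAI: GPT-5.2 | openai | | 82 | 26 | 400K | $1.75 | $14.00 |
| 10 | Qwen: Qwen3 Coder 30B A3B Instruct | qwen | | 82 | 13.6 | 160K | $0.0700 | $0.2700 |
| 11 | OpenAI: o3 Mini | openai | | 82 | 19 | 200K | $1.10 | $4.40 |
| 12 | GPT-5.5 (low) | OpenAI | | 82 | 41.7 | | $5.00 | $30.00 |
| 13 | Qwen: Qwen3.7 Plus | qwen | New | 82 | 39 | 1M | $0.3200 | $1.28 |
| 14 | OpenAI: GPT-5.1-Codex | openai | | 82 | 34.7 | 400K | $1.25 | $10.00 |
| 15 | TNG: DeepSeek R1T2 Chimera | tngtech | | 82 | 27 | 164K | $0.3000 | $1.10 |
| 16 | OpenAI: GPT-4 Turbo | openai | | 82 | 7.9 | 128K | $10.00 | $30.00 |
| 17 | Qwen: Qwen3.6 Max Preview | qwen | | 82 | 40 | 262K | $1.04 | $6.24 |
| 18 | StepFun: Step 3.5 Flash | stepfun | | 82 | 26 | 262K | $0.0900 | $0.3000 |
| 19 | OpenAI: GPT-5 Codex | openai | | 82 | 36.1 | 400K | $1.25 | $10.00 |
| 20 | OpenAI: o3 | openai | | 82 | 30.4 | 200K | $2.00 | $8.00 |
| 21 | Qwen: Qwen3.6 Plus | qwen | | 82 | 39.6 | 1M | $0.3250 | $1.95 |
| 22 | Solar Pro 2 (Preview) (Reasoning) | Upstage | | 82 | 12.5 | | Free | Free |
| 23 | Mistral: Devstral 2 2512 | mistralai | | 82 | 15.5 | 262K | $0.4000 | $2.00 |
| 24 | Z.ai: GLM 4.5 | z-ai | | 82 | 19.5 | 131K | $0.6000 | $2.20 |
| 25 | DeepSeek: R1 | deepseek | | 82 | 20.1 | 164K | $0.7000 | $2.50 |
| 26 | OpenAI: GPT-5.1-Codex-Mini | openai | | 82 | 30.6 | 400K | $0.2500 | $2.00 |
| 27 | Google: Gemini 2.5 Flash | google | | 82 | 14.1 | 1M | $0.3000 | $2.50 |
| 28 | o1-preview | OpenAI | | 82 | 17 | | $16.50 | $66.00 |
| 29 | MoonshotAI: Kimi K2.5 | moonshotai | | 82 | 38.1 | 262K | $0.3750 | $2.03 |
| 30 | DeepSeek: DeepSeek V3.1 Terminus | deepseek | | 82 | 21.4 | 164K | $0.2700 | $0.9500 |
| 31 | OpenAI: o4 Mini | openai | | 82 | 25.6 | 200K | $1.10 | $4.40 |
| 32 | Qwen3 Coder 480B A35B Instruct | Alibaba | | 82 | 18 | | $1.50 | $7.50 |
| 33 | Nex AGI: DeepSeek V3.1 Nex N1 | nex-agi | | 82 | 21 | 131K | $0.1350 | $0.5000 |
| 34 | Z.ai: GLM 4.5 Air | z-ai | | 82 | 16.5 | 131K | $0.1300 | $0.8500 |
| 35 | Microsoft: Phi 4 | microsoft | | 82 | 4.9 | 16K | $0.0650 | $0.1400 |
| 36 | GPT-5.5 (high) | OpenAI | Best for Coding | 82 | 53.1 | | $5.00 | $30.00 |
| 37 | North Mini Code | Cohere | New | 82 | 20.6 | | Free | Free |
| 38 | Kwaipilot: KAT-Coder-Pro V1 | kwaipilot | | 82 | 28.3 | 256K | $0.2070 | $0.8280 |
| 39 | Google: Gemini 2.5 Pro | google | | 82 | 27 | 1M | $1.25 | $10.00 |
| 40 | Inception: Mercury 2 | inception | | 82 | 25.3 | 128K | $0.2500 | $0.7500 |
| 41 | o1-mini | OpenAI | | 82 | 14 | | Free | Free |
| 42 | OpenAI: GPT-5.2-Codex | openai | | 82 | 40.1 | 400K | $1.75 | $14.00 |
| 43 | Qwen: Qwen2.5 Coder 7B Instruct | qwen | | 82 | 4.5 | 33K | $0.0300 | $0.0900 |
| 44 | MoonshotAI: Kimi K2.6 | moonshotai | | 82 | 42.8 | 262K | $0.6800 | $3.41 |
| 45 | Qwen3 4B 2507 (Reasoning) | Alibaba | | 82 | 12 | | Free | Free |
| 46 | DeepSeek: DeepSeek V3.2 Speciale | deepseek | | 82 | 22.2 | 164K | $0.2870 | $0.4310 |
| 47 | DeepSeek: DeepSeek V3 | deepseek | | 82 | 10.4 | 131K | $0.2002 | $0.8001 |
| 48 | Muse Spark | Meta | | 82 | 43.1 | | Free | Free |
| 49 | MoonshotAI: Kimi K2.7 Code | moonshotai | New | 82 | 41.9 | 262K | $0.7400 | $3.50 |
| 50 | MoonshotAI: Kimi K2 Thinking | moonshotai | | 82 | 32.7 | 262K | $0.6000 | $2.50 |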
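
The Input / 1M and Output / 1M columns are prices in USD per million tokens, so the cost of a single request depends on how many tokens go in and come out. A minimal sketch of that arithmetic follows. The two prices are copied from the table; the `Pricing` class, the `request_cost` helper and the token counts are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD prices per 1M tokens (the Input / 1M and Output / 1M columns)."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given per-1M-token prices."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


# Prices copied from the leaderboard table.
PRICES = {
    "Claude Opus 4.8": Pricing(5.00, 25.00),
    "Gemini 3.5 Flash": Pricing(1.50, 9.00),
}

# Hypothetical workload: 20,000 prompt tokens and 2,000 completion tokens.
for name, pricing in PRICES.items():
    print(f"{name}: ${request_cost(pricing, 20_000, 2_000):.4f}")
```

For that assumed workload the request costs about $0.15 on Claude Opus 4.8 and about $0.048 on Gemini 3.5 Flash. Output tokens are priced several times higher than input tokens for both models, so completion-heavy workloads shift the comparison.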

How we rank AI models

The Design for Online AI Model Leaderboard scores 579 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
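
As a rough illustration of how four 0–10 grades can be combined into one 0–100 score, here is a minimal sketch of a weighted average. The leaderboard does not publish its weights or its exact formula, so the `WEIGHTS` values, the linear scaling by 10, the dimension keys and the sample grades are all assumptions, not the actual method.

```python
DIMENSIONS = ("intelligence", "technical", "content", "value")

# Hypothetical weights; the real leaderboard weights are not published.
WEIGHTS = {"intelligence": 0.35, "technical": 0.35, "content": 0.15, "value": 0.15}


def overall_score(grades: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Combine 0-10 dimension grades into a 0-100 score via a weighted average."""
    for dim in DIMENSIONS:
        if not 0 <= grades[dim] <= 10:
            raise ValueError(f"{dim} grade must be between 0 and 10, got {grades[dim]}")
    # Normalize by the weight total so the weights need not sum to exactly 1.
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    weighted_avg = sum(weights[dim] * grades[dim] for dim in DIMENSIONS) / total_weight
    # Scale the 0-10 average up to the 0-100 range used in the table.
    return round(weighted_avg * 10, 1)


# Hypothetical grades for an imaginary model.
sample = {"intelligence": 9.0, "technical": 9.0, "content": 8.0, "value": 6.0}
print(overall_score(sample))
```

With these assumed weights the sample model scores 84.0. Because value carries only 15% of the weight here, an expensive model can still rank highly if it is strong on intelligence and technical capability; different weights would change that trade-off.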

Where does the data come from?

From the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.