Coding

Best models for code generation and debugging.

Updated July 2, 2026

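| # | Model | Provider | Badges | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 93.3 | 59.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Sonnet 5 | anthropic | New, In-House Pick | 93 | 53.4 | 1M | $2.00 | $10.00 |
| 3 | Anthropic: Claude Opus 4.8 | anthropic | Top Pick, In-House Pick | 92.3 | 55.7 | 1M | $5.00 | $25.00 |
| 4 | Google: Gemini 3.1 Pro Preview | google | | 89.8 | 46.5 | 1M | $2.00 | $12.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 89.4 | 53.5 | 1M | $5.00 | $25.00 |
| 6 | Google: Gemini 3.5 Flash | google | | 88.9 | 50.2 | 1M | $1.50 | $9.00 |
| 7 | OpenAI: GPT-5.5 | openai | Top Pick | 88.3 | 54.8 | 1.1M | $5.00 | $30.00 |
| 8 | OpenAI: GPT-5.4 | openai | | 87 | 51.4 | 1.1M | $2.50 | $15.00 |
| 9 | Z.ai: GLM 5.2 | z-ai | New | 86.1 | 51.1 | 1M | $0.9300 | $3.00 |
| 10 | Anthropic: Claude Sonnet 4.6 | anthropic | | 85.1 | 47.2 | 1M | $3.00 | $15.00 |
| 11 | Qwen: Qwen3.7 Max | qwen | | 83.5 | 46 | 1M | $1.25 | $3.75 |
| 12 | Anthropic: Claude Opus 4.6 | anthropic | | 83 | 43.7 | 1M | $5.00 | $25.00 |
| 13 | DeepSeek: DeepSeek V4 Pro | deepseek | | 82.1 | 44.3 | 1M | $0.4350 | $0.8700 |
| 14 | OpenAI: GPT-5.1-Codex-Mini | openai | | 82 | 30.6 | 400K | $0.2500 | $2.00 |
| 15 | DeepSeek: R1 | deepseek | | 82 | 18.5 | 164K | $0.7000 | $2.50 |
| 16 | ERNIE 5.0 Thinking Preview | Baidu | | 82 | 21.9 | | Free | Free |
| 17 | Nex AGI: Nex-N2-Pro | nex-agi | New, Best for Agents | 82 | 41 | 262K | $0.2500 | $1.00 |
| 18 | Z.ai: GLM 5 | z-ai | | 82 | 32.4 | 203K | $0.6000 | $1.92 |
| 19 | OpenAI: GPT-5 Codex | openai | | 82 | 36.1 | 400K | $1.25 | $10.00 |
| 20 | xAI: Grok 3 Mini | x-ai | | 82 | 22.5 | 131K | $0.3000 | $0.5000 |
| 21 | OpenAI: GPT-5.4 Mini | openai | | 82 | 40 | 400K | $0.7500 | $4.50 |
| 22 | DeepSeek Coder V2 Lite Instruct | DeepSeek | | 82 | 3.1 | | Free | Free |
| 23 | Nex AGI: DeepSeek V3.1 Nex N1 | nex-agi | | 82 | 21 | 131K | $0.1350 | $0.5000 |
| 24 | OpenAI: gpt-oss-120b (free) | openai | | 82 | 33.3 | 131K | Free | Free |
| 25 | OpenAI: o4 Mini | openai | | 82 | 25.6 | 200K | $1.10 | $4.40 |
| 26 | GPT-5.5 (Non-reasoning) | OpenAI | | 82 | 35.4 | | $5.00 | $30.00 |
| 27 | Kwaipilot: KAT-Coder-Pro V1 | kwaipilot | | 82 | 34.6 | 256K | $0.2070 | $0.8280 |
| 28 | DeepSeek: DeepSeek V3 | deepseek | | 82 | 14.2 | 131K | $0.2002 | $0.8001 |
| 29 | Qwen3.6 35B A3B (Non-reasoning) | Alibaba | | 82 | 24.2 | | $0.3750 | $2.25 |
| 30 | Grok Build 0.1 0616 | xAI | New | 82 | 39.8 | | $1.00 | $2.00 |
| 31 | Qwen: Qwen3 Max Thinking | qwen | | 82 | 25 | 262K | $0.7800 | $3.90 |
| 32 | DeepSeek: DeepSeek V3.1 Terminus | deepseek | | 82 | 26.3 | 164K | $0.2700 | $0.9500 |
| 33 | Google: Gemini 2.5 Pro Preview 06-05 | google | | 82 | 23 | 1M | $1.25 | $10.00 |
| 34 | MiniMax: MiniMax M2.7 | minimax | | 82 | 38.1 | 205K | $0.1800 | $0.7200 |
| 35 | Grok 4.20 0309 (Reasoning) | xAI | | 82 | 36.5 | | $2.00 | $6.00 |
| 36 | DeepSeek: DeepSeek V3.2 Speciale | deepseek | | 82 | 22.2 | 164K | $0.2870 | $0.4310 |
| 37 | OpenAI: gpt-oss-120b | openai | | 82 | 23.8 | 131K | $0.0300 | $0.1500 |
| 38 | Qwen: Qwen2.5 Coder 7B Instruct | qwen | | 82 | 4.5 | 33K | $0.0300 | $0.0900 |
| 39 | GPT-5.5 (medium) | OpenAI | | 82 | 50.4 | | $5.00 | $30.00 |
| 40 | MoonshotAI: Kimi K2 Thinking | moonshotai | | 82 | 32.7 | 262K | $0.6000 | $2.50 |
| 41 | OpenAI: o1 | openai | | 82 | 23.4 | 200K | $15.00 | $60.00 |
| 42 | Qwen: Qwen3.6 35B A3B | qwen | | 82 | 31.6 | 262K | $0.1400 | $1.00 |
| 43 | GPT-5.5 Instant (June 2026) | OpenAI | New | 82 | 28.9 | | $5.00 | $30.00 |
| 44 | xAI: Grok 4 Fast | x-ai | | 82 | 27.4 | 2M | $0.2000 | $0.5000 |
| 45 | Anthropic: Claude Opus 4 | anthropic | | 82 | 31 | 200K | $15.00 | $75.00 |
| 46 | NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | nvidia | | 82 | 9.1 | 131K | $0.6000 | $1.80 |
| 47 | Solar Pro 2 (Preview) (Reasoning) | Upstage | | 82 | 12.5 | | Free | Free |
| 48 | DeepSeek: DeepSeek V3.2 | deepseek | | 82 | 33.4 | 131K | $0.2288 | $0.3432 |
| 49 | OpenAI: gpt-oss-20b (free) | openai | | 82 | 24.5 | 131K | Free | Free |
| 50 | OpenAI: GPT-4.1 | openai | | 82 | 19.4 | 1M | $2.00 | $8.00 |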

How we rank AI models

The Design for Online AI Model Leaderboard scores 592 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).
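As a rough illustration of the "capability per dollar" idea, the sketch below divides a model's overall score by a blended token price. The blend (a plain average of input and output price per 1M tokens) and the function names are assumptions made for this example; the leaderboard does not publish how its value dimension is actually computed. The sample rows are taken from the table above.

```python
from typing import Optional


def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Average of input and output price in USD per 1M tokens (an assumed blend)."""
    return (input_per_m + output_per_m) / 2


def capability_per_dollar(
    score: float, input_per_m: float, output_per_m: float
) -> Optional[float]:
    """Score points per blended dollar, or None for free models."""
    price = blended_price(input_per_m, output_per_m)
    if price == 0:
        return None  # a free model has no finite score-to-price ratio
    return score / price


# (model, overall score, input price / 1M, output price / 1M) from the table.
rows = [
    ("Anthropic: Claude Fable 5", 93.3, 10.00, 50.00),
    ("Anthropic: Claude Sonnet 5", 93.0, 2.00, 10.00),
    ("DeepSeek: DeepSeek V4 Pro", 82.1, 0.4350, 0.8700),
    ("OpenAI: gpt-oss-120b (free)", 82.0, 0.0, 0.0),
]

for name, score, price_in, price_out in rows:
    ratio = capability_per_dollar(score, price_in, price_out)
    label = "free" if ratio is None else f"{ratio:.1f} points per blended dollar"
    print(f"{name}: {label}")
```

Under this assumed definition, a cheaper model with a slightly lower score can come out far ahead on value, which is why value is only one of the four dimensions rather than the ranking itself.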

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.
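One small step in that aggregation is turning upstream prices into the per-1M-token figures shown in the table. The sketch below assumes prices arrive as per-token decimal strings; that is an assumption about the upstream payload, not something this page states, and the record shape and function names are hypothetical. The formatting rule (four decimals below $1, two otherwise, "Free" at zero) is inferred from how prices appear in the table.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def price_per_million(per_token: str) -> Decimal:
    """Convert a per-token USD price string into USD per 1M tokens."""
    return Decimal(per_token) * TOKENS_PER_MILLION


def format_price(per_token: str) -> str:
    """Render a price the way the table appears to: Free, $0.9300 or $2.00."""
    per_million = price_per_million(per_token)
    if per_million == 0:
        return "Free"
    if per_million < 1:
        return f"${per_million:.4f}"
    return f"${per_million:.2f}"


# Hypothetical record; the id and field names are placeholders.
record = {
    "id": "example/model",
    "pricing": {"prompt": "0.000002", "completion": "0.00001"},
}

print(format_price(record["pricing"]["prompt"]))      # input price per 1M tokens
print(format_price(record["pricing"]["completion"]))  # output price per 1M tokens
```

`Decimal` is used instead of `float` so that very small per-token prices scale to exact dollar amounts.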

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
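The calculation described above can be sketched as a weighted mean of the four 0–10 grades, scaled to 0–100. The weights below are hypothetical placeholders chosen for illustration, since the real weights are not published here, and the function and key names are likewise assumptions.

```python
DIMENSIONS = ("intelligence", "technical", "content", "value")

# Hypothetical weights for illustration only; they are not the leaderboard's.
WEIGHTS = {"intelligence": 0.3, "technical": 0.4, "content": 0.2, "value": 0.1}


def overall_score(grades: dict, weights: dict = WEIGHTS) -> float:
    """Combine four 0-10 dimension grades into a 0-100 overall score."""
    for dim in DIMENSIONS:
        if not 0 <= grades[dim] <= 10:
            raise ValueError(f"{dim} grade must be between 0 and 10")
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    # Weighted mean on the 0-10 scale, normalised in case weights do not sum to 1.
    weighted_mean = sum(grades[dim] * weights[dim] for dim in DIMENSIONS) / total_weight
    # Scale 0-10 up to 0-100 and keep one decimal, as in the table.
    return round(weighted_mean * 10, 1)


example = {"intelligence": 9.5, "technical": 9.0, "content": 9.0, "value": 6.0}
print(overall_score(example))
```

With these placeholder weights, a model that is strong on capability but expensive still scores well, because value carries the smallest weight; different weights would reorder the table.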

Where does the data come from?

The data comes from the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.