AI Agents

Models optimised for autonomous agent workflows.

Updated June 11, 2026

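| # | Model | Provider | Badges | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|-------|----------|--------|-------|----------|---------|------------|-------------|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 94.7 | 64.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Opus 4.8 | anthropic | New, Top Pick, In-House Pick | 92.4 | 61.4 | 1M | $5.00 | $25.00 |
| 3 | Google: Gemini 3.1 Pro Preview | google | Best for Agents | 91.7 | 57.2 | 1M | $2.00 | $12.00 |
| 4 | OpenAI: GPT-5.5 | openai | Top Pick | 88.8 | 60.2 | 1.1M | $5.00 | $30.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 88.3 | 57.3 | 1M | $5.00 | $25.00 |
| 6 | Anthropic: Claude Sonnet 4.6 | anthropic | In-House Pick | 84.4 | 51.7 | 1M | $3.00 | $15.00 |
| 7 | Qwen: Qwen3.7 Max | qwen | New | 83.9 | 56.6 | 1M | $1.25 | $3.75 |
| 8 | OpenAI: GPT-5.3-Codex | openai | | 83.5 | 53.6 | 400K | $1.75 | $14.00 |
| 9 | Google: Gemini 3 Flash Preview | google | | 82.5 | 35 | 1M | $0.5000 | $3.00 |
| 10 | MoonshotAI: Kimi K2 Thinking | moonshotai | | 82 | 40.9 | 262K | $0.6000 | $2.50 |
| 11 | MiniMax: MiniMax M2.7 | minimax | | 82 | 49.6 | 205K | $0.2700 | $1.08 |
| 12 | OpenAI: gpt-oss-20b | openai | | 82 | 20.8 | 131K | $0.0290 | $0.1400 |
| 13 | MoonshotAI: Kimi K2.6 | moonshotai | Updated | 82 | 42.9 | 262K | $0.6700 | $3.39 |
| 14 | OpenAI: o4 Mini | openai | | 82 | 33.1 | 200K | $1.10 | $4.40 |
| 15 | Nova 2.0 Lite (high) | Amazon | | 82 | 34.5 | | $0.3000 | $2.50 |
| 16 | MiniMax: MiniMax M2 | minimax | | 82 | 36.1 | 205K | $0.2550 | $1.00 |
| 17 | Xiaomi: MiMo-V2-Omni | xiaomi | | 82 | 43.4 | 262K | $0.4000 | $2.00 |
| 18 | inclusionAI: Ring-2.6-1T | inclusionai | | 82 | 38.5 | 262K | $0.0750 | $0.6250 |
| 19 | Qwen: Qwen3.5 397B A17B | qwen | | 82 | 45 | 262K | $0.3900 | $2.34 |
| 20 | Anthropic: Claude Opus 4.1 | anthropic | | 82 | 42 | 200K | $15.00 | $75.00 |
| 21 | Xiaomi: MiMo-V2.5 | xiaomi | | 82 | 49 | 1M | $0.1400 | $0.2800 |
| 22 | Anthropic: Claude 3.7 Sonnet | anthropic | | 82 | 30.8 | 200K | $3.00 | $15.00 |
| 23 | Nova 2.0 Lite (medium) | Amazon | | 82 | 29.7 | | $0.3000 | $2.50 |
| 24 | OpenAI: o3 Deep Research | openai | | 82 | 38.3 | 200K | $10.00 | $40.00 |
| 25 | Xiaomi: MiMo-V2-Pro | xiaomi | | 82 | 49.2 | 1M | $1.00 | $3.00 |
| 26 | Google: Gemini 3.5 Flash | google | New | 82 | 43.3 | 1M | $1.50 | $9.00 |
| 27 | MiniMax: MiniMax M2.5 | minimax | | 82 | 41.9 | 205K | $0.1500 | $0.9000 |
| 28 | MoonshotAI: Kimi K2 0711 | moonshotai | | 82 | 26.3 | 131K | $0.5700 | $2.30 |
| 29 | Tencent: Hy3 preview (free) | tencent | | 82 | 41.9 | 262K | Free | Free |
| 30 | OpenAI: GPT-5.2 | openai | | 82 | 51.3 | 400K | $1.75 | $14.00 |
| 31 | Anthropic: Claude 3.7 Sonnet (thinking) | anthropic | | 82 | 34.7 | 200K | $3.00 | $15.00 |
| 32 | Qwen: Qwen3.6 35B A3B | qwen | Updated | 82 | 43.5 | 262K | $0.1500 | $1.00 |
| 33 | Z.ai: GLM 4.6 | z-ai | | 82 | 32.5 | 203K | $0.4300 | $1.74 |
| 34 | Kwaipilot: KAT-Coder-Pro V2 | kwaipilot | | 82 | 43.8 | 256K | $0.3000 | $1.20 |
| 35 | Z.ai: GLM 5 | z-ai | | 82 | 49.8 | 203K | $0.6000 | $1.92 |
| 36 | xAI: Grok 4 | x-ai | | 82 | 41.5 | 256K | $3.00 | $15.00 |
| 37 | inclusionAI: Ling-2.6-1T (free) | inclusionai | | 82 | 33.6 | 262K | Free | Free |
| 38 | DeepSeek: DeepSeek V3.2 | deepseek | | 82 | 41.7 | 131K | $0.2288 | $0.3432 |
| 39 | OpenAI: o1 | openai | | 82 | 30.7 | 200K | $15.00 | $60.00 |
| 40 | Qwen: Qwen3.6 27B | qwen | | 82 | 45.8 | 262K | $0.2890 | $2.40 |
| 41 | Anthropic: Claude Sonnet 4.5 | anthropic | | 82 | 37.1 | 1M | $3.00 | $15.00 |
| 42 | Anthropic: Claude Opus 4.6 | anthropic | | 82 | 52.9 | 1M | $5.00 | $25.00 |
| 43 | Google: Gemini 2.5 Pro | google | | 82 | 34.6 | 1M | $1.25 | $10.00 |
| 44 | DeepSeek: DeepSeek V4 Pro | deepseek | | 82 | 51.5 | 1M | $0.4350 | $0.8700 |
| 45 | Anthropic: Claude Opus 4.5 | anthropic | | 82 | 49.7 | 200K | $5.00 | $25.00 |
| 46 | Qwen: Qwen3.5-122B-A10B | qwen | | 82 | 41.6 | 262K | $0.2600 | $2.08 |
| 47 | Qwen3.5 Omni Plus | Alibaba | | 82 | 38.6 | | $0.4000 | $4.80 |
| 48 | OpenAI: GPT-5 Codex | openai | | 82 | 44.6 | 400K | $1.25 | $10.00 |
| 49 | Z.ai: GLM 5V Turbo | z-ai | | 82 | 42.9 | 203K | $1.20 | $4.00 |
| 50 | MoonshotAI: Kimi K2.5 | moonshotai | | 82 | 46.8 | 262K | $0.4000 | $1.90 |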
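
The Input / 1M and Output / 1M columns are prices in US dollars per million tokens, so the cost of an agent run depends on how many tokens it reads and writes. The following is a minimal sketch of that arithmetic. The function name `estimate_cost` and the workload of 2,000,000 input tokens and 300,000 output tokens are hypothetical and chosen only for illustration; the prices are copied from the table above.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_million: str, output_per_million: str) -> Decimal:
    """Return the estimated cost in USD for one workload.

    Prices are passed as strings (e.g. "5.00") so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_per_million)
    output_cost = Decimal(output_tokens) * Decimal(output_per_million)
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical workload: 2M tokens read, 300K tokens written.
opus_cost = estimate_cost(2_000_000, 300_000, "5.00", "25.00")    # Claude Opus 4.8 row
gemini_cost = estimate_cost(2_000_000, 300_000, "2.00", "12.00")  # Gemini 3.1 Pro Preview row

print(f"Claude Opus 4.8:        ${opus_cost:.2f}")
print(f"Gemini 3.1 Pro Preview: ${gemini_cost:.2f}")
```

Agent workflows tend to re-read long contexts on every step, so the input price often dominates the total even when the output price is several times higher per token.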

How we rank AI models

The Design for Online AI Model Leaderboard scores 577 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data comes from three sources: the OpenRouter API (pricing and availability), Artificial Analysis (intelligence, coding and agentic indices) and the Hugging Face Open LLM Leaderboard (open-model benchmarks). We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.
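
As an illustration of the pricing step, the sketch below turns per-token prices into the per-million figures shown in the table. It is not the leaderboard's actual pipeline. The payload shape (a `data` list whose entries carry `id`, `context_length` and a `pricing` object with `prompt` and `completion` prices as per-token strings) is an assumption about what a pricing feed such as the OpenRouter model list returns, and the sample entries, `per_million`, `format_price` and `normalise` are hypothetical names and values.

```python
from decimal import Decimal

# Assumed payload shape and invented sample values, for illustration only.
SAMPLE_PAYLOAD = {
    "data": [
        {"id": "example/model-a", "context_length": 200000,
         "pricing": {"prompt": "0.000003", "completion": "0.000015"}},
        {"id": "example/model-b:free", "context_length": 262144,
         "pricing": {"prompt": "0", "completion": "0"}},
    ]
}


def per_million(price_per_token: str) -> Decimal:
    """Convert a per-token USD price string into USD per million tokens."""
    return Decimal(price_per_token) * 1_000_000


def format_price(value: Decimal) -> str:
    """Format a price the way the table appears to: Free, 4 decimals under $1, else 2."""
    if value == 0:
        return "Free"
    return f"${value:.4f}" if value < 1 else f"${value:.2f}"


def normalise(payload: dict) -> list[dict]:
    """Build one row per model with per-million input and output prices."""
    rows = []
    for model in payload.get("data", []):
        pricing = model.get("pricing", {})
        rows.append({
            "id": model["id"],
            "context": model.get("context_length"),
            "input_per_million": per_million(pricing.get("prompt", "0")),
            "output_per_million": per_million(pricing.get("completion", "0")),
        })
    return rows


for row in normalise(SAMPLE_PAYLOAD):
    print(row["id"], format_price(row["input_per_million"]),
          format_price(row["output_per_million"]))
```

Using `Decimal` rather than floats avoids rounding drift when very small per-token prices are scaled up by a factor of one million.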

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
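
The calculation described above can be sketched as a weighted average rescaled from 0–10 to 0–100. The weights in `ASSUMED_WEIGHTS` are hypothetical, since the actual weights are not published here, and the editorial adjustments mentioned elsewhere on this page are not modelled. The function name `overall_score` and the example grades are also illustrative.

```python
# Hypothetical weights: the real values are not stated in the methodology.
ASSUMED_WEIGHTS = {
    "intelligence": 35,
    "technical_capability": 30,
    "content_quality": 20,
    "value": 15,
}


def overall_score(grades: dict[str, float],
                  weights: dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine four 0-10 dimension grades into a 0-100 overall score."""
    if set(grades) != set(weights):
        raise ValueError("grades must cover exactly the weighted dimensions")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade must be between 0 and 10")

    total_weight = sum(weights.values())
    weighted_sum = sum(grades[name] * weights[name] for name in weights)
    # Weighted mean on the 0-10 scale, multiplied by 10 to reach 0-100.
    return round(10 * weighted_sum / total_weight, 1)


# Invented grades for a single model, for illustration only.
example = {
    "intelligence": 9,
    "technical_capability": 8,
    "content_quality": 7,
    "value": 6,
}
print(overall_score(example))
```

Dividing by the sum of the weights means they do not need to add up to 100, so individual weights can be adjusted without rescaling the others.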

Where does the data come from?

From the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.