Tool Use

Models with strong tool-use and function-calling support.

Updated July 4, 2026

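| # | Model | Provider | Badges | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|-------|----------|--------|-------|----------|---------|------------|-------------|
| 1 | Anthropic: Claude Fable 5 | anthropic | New, Top Pick | 93.3 | 59.9 | 1M | $10.00 | $50.00 |
| 2 | Anthropic: Claude Sonnet 5 | anthropic | New, In-House Pick | 93 | 53.4 | 1M | $2.00 | $10.00 |
| 3 | Anthropic: Claude Opus 4.8 | anthropic | Top Pick, In-House Pick | 92.3 | 55.7 | 1M | $5.00 | $25.00 |
| 4 | Google: Gemini 3.1 Pro Preview | google | | 89.8 | 46.5 | 1M | $2.00 | $12.00 |
| 5 | Anthropic: Claude Opus 4.7 | anthropic | | 89.4 | 53.5 | 1M | $5.00 | $25.00 |
| 6 | Google: Gemini 3.5 Flash | google | | 88.9 | 50.2 | 1M | $1.50 | $9.00 |
| 7 | OpenAI: GPT-5.5 | openai | Top Pick | 88.3 | 54.8 | 1.1M | $5.00 | $30.00 |
| 8 | OpenAI: GPT-5.4 | openai | | 87 | 51.4 | 1.1M | $2.50 | $15.00 |
| 9 | Z.ai: GLM 5.2 | z-ai | New | 86.1 | 51.1 | 1M | $0.7700 | $2.42 |
| 10 | Anthropic: Claude Sonnet 4.6 | anthropic | | 85.1 | 47.2 | 1M | $3.00 | $15.00 |
| 11 | Qwen: Qwen3.7 Max | qwen | | 83.5 | 46 | 1M | $1.25 | $3.75 |
| 12 | Anthropic: Claude Opus 4.6 | anthropic | | 83 | 37.8 | 1M | $5.00 | $25.00 |
| 13 | DeepSeek: DeepSeek V4 Pro | deepseek | | 82.1 | 44.3 | 1M | $0.4350 | $0.8700 |
| 14 | DeepSeek: DeepSeek V3.2 | deepseek | | 82 | 24.7 | 131K | $0.2288 | $0.3432 |
| 15 | DeepSeek: DeepSeek V3.1 Terminus | deepseek | | 82 | 21.4 | 164K | $0.2700 | $0.9500 |
| 16 | Qwen: Qwen3 235B A22B Instruct 2507 | qwen | | 82 | 19.6 | 262K | $0.0900 | $0.1000 |
| 17 | Inception: Mercury 2 | inception | | 82 | 25.3 | 128K | $0.2500 | $0.7500 |
| 18 | inclusionAI: Ling-2.6-1T (free) | inclusionai | | 82 | 33.6 | 262K | Free | Free |
| 19 | Nex AGI: Nex-N2-Pro | nex-agi | New, Best for Agents | 82 | 41 | 262K | $0.2500 | $1.00 |
| 20 | MoonshotAI: Kimi K2 Thinking | moonshotai | | 82 | 32.7 | 262K | $0.6000 | $2.50 |
| 21 | DeepSeek: DeepSeek V3.1 | deepseek | | 82 | 20.7 | 164K | $0.2100 | $0.7900 |
| 22 | xAI: Grok 3 Mini | x-ai | | 82 | 22.5 | 131K | $0.3000 | $0.5000 |
| 23 | Anthropic: Claude 3.7 Sonnet (thinking) | anthropic | | 82 | 27.1 | 200K | $3.00 | $15.00 |
| 24 | Xiaomi: MiMo-V2-Omni | xiaomi | | 82 | 35 | 262K | $0.4000 | $2.00 |
| 25 | Nova 2.0 Lite (medium) | Amazon | | 82 | 19 | | $0.3000 | $2.50 |
| 26 | OpenAI: GPT-5.3-Codex | openai | | 82 | 44.3 | 400K | $1.75 | $14.00 |
| 27 | Qwen: Qwen3 VL 30B A3B Thinking | qwen | | 82 | 19.7 | 131K | $0.1300 | $1.56 |
| 28 | OpenAI: gpt-oss-20b | openai | | 82 | 14.9 | 131K | $0.0290 | $0.1400 |
| 29 | Qwen: Qwen3 8B | qwen | | 82 | 8.3 | 131K | $0.1170 | $0.4550 |
| 30 | OpenAI: o1 | openai | | 82 | 23.4 | 200K | $15.00 | $60.00 |
| 31 | Z.ai: GLM 5.1 | z-ai | | 82 | 40.2 | 203K | $0.9660 | $3.04 |
| 32 | Qwen3.6 27B (Non-reasoning) | Alibaba | | 82 | 29.3 | | $0.6000 | $3.60 |
| 33 | Qwen: Qwen3.7 Plus | qwen | | 82 | 39 | 1M | $0.3200 | $1.28 |
| 34 | Prime Intellect: INTELLECT-3 | prime-intellect | | 82 | 15.6 | 131K | $0.2000 | $1.10 |
| 35 | xAI: Grok 4 Fast | x-ai | | 82 | 16.5 | 2M | $0.2000 | $0.5000 |
| 36 | MoonshotAI: Kimi K2 0711 | moonshotai | | 82 | 19.4 | 131K | $0.5700 | $2.30 |
| 37 | Anthropic: Claude 3.5 Haiku | anthropic | | 82 | 12.3 | 200K | $0.8000 | $4.00 |
| 38 | Qwen: Qwen3.5-9B | qwen | | 82 | 21.4 | 262K | $0.1000 | $0.1500 |
| 39 | inclusionAI: Ling-2.6-flash | inclusionai | | 82 | 19.3 | 262K | $0.0100 | $0.0300 |
| 40 | Grok Build 0.1 0616 | xAI | New | 82 | 39.8 | | $1.00 | $2.00 |
| 41 | MiniMax: MiniMax M2.1 | minimax | | 82 | 31.4 | 205K | $0.3000 | $1.20 |
| 42 | Amazon: Nova Premier 1.0 | amazon | | 82 | 12.7 | 1M | $2.50 | $12.50 |
| 43 | OpenAI: GPT-4o Audio | openai | | 82 | 12.8 | 128K | $2.50 | $10.00 |
| 44 | xAI: Grok 3 | x-ai | | 82 | 18.4 | 131K | $3.00 | $15.00 |
| 45 | Mistral: Saba | mistralai | | 82 | 6.4 | 33K | $0.2000 | $0.6000 |
| 46 | Xiaomi: MiMo-V2-Pro | xiaomi | | 82 | 40.3 | 1M | $1.00 | $3.00 |
| 47 | Qwen: Qwen3.6 35B A3B | qwen | | 82 | 31.6 | 262K | $0.1400 | $1.00 |
| 48 | Z.ai: GLM 4.6V | z-ai | | 82 | 16.8 | 131K | $0.3000 | $0.9000 |
| 49 | Qwen: Qwen3 VL 30B A3B Instruct | qwen | | 82 | 10 | 262K | $0.1300 | $0.5200 |
| 50 | Anthropic: Claude Opus 4.1 | anthropic | | 82 | 33.7 | 200K | $15.00 | $75.00 |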

How we rank AI models

The Design for Online AI Model Leaderboard scores 592 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.
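As a rough illustration of the pricing step, the sketch below converts a per-token price into the per-1M figure shown in the table. It assumes the upstream pricing arrives as per-token decimal strings, which is how OpenRouter's model listing is understood to report it. The sample entry, the `per_million` helper and the display rule (four decimals under $1, two otherwise, "Free" at zero) are illustrative assumptions inferred from the table, not the site's actual pipeline.

```python
from decimal import Decimal

# Hypothetical sample entry; the id and prices are made up for illustration.
sample_entry = {
    "id": "example/model",
    "pricing": {"prompt": "0.0000003", "completion": "0.000015"},
}


def per_million(price_per_token: str) -> str:
    """Convert a per-token price string into a per-1M-token display string."""
    # Decimal avoids binary floating-point noise on very small prices.
    price = Decimal(price_per_token) * 1_000_000
    if price == 0:
        return "Free"
    # Assumed display rule: four decimals below $1, two decimals otherwise.
    if price < 1:
        return f"${price:.4f}"
    return f"${price:.2f}"


input_price = per_million(sample_entry["pricing"]["prompt"])
output_price = per_million(sample_entry["pricing"]["completion"])
print(input_price, output_price)
```

Using `Decimal` rather than `float` keeps tiny per-token values exact before they are scaled up for display.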

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
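The calculation described above can be sketched as a weighted mean of the four 0–10 grades, scaled to 0–100. The actual weights are not published here, so the `WEIGHTS` values below are placeholders, and the `overall_score` function name and dimension keys are likewise hypothetical.

```python
# Placeholder weights: the leaderboard does not state its real weighting.
WEIGHTS = {
    "intelligence": 0.35,
    "technical": 0.30,
    "content": 0.20,
    "value": 0.15,
}


def overall_score(grades: dict[str, float],
                  weights: dict[str, float] = WEIGHTS) -> float:
    """Combine 0-10 dimension grades into a 0-100 score via a weighted mean."""
    if set(grades) != set(weights):
        raise ValueError("grades and weights must cover the same dimensions")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade {grade} is outside 0-10")
    # Normalise by the weight total so the weights need not sum to exactly 1.
    total_weight = sum(weights.values())
    weighted_mean = sum(grades[k] * weights[k] for k in weights) / total_weight
    # Scale the 0-10 weighted mean up to the 0-100 range used by the table.
    return round(weighted_mean * 10, 1)


example = overall_score(
    {"intelligence": 9.0, "technical": 9.5, "content": 9.0, "value": 8.0}
)
print(example)
```

Dividing by the weight total means the placeholder weights can be adjusted freely without rescaling the result.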

Where does the data come from?

From the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.