SEO

Best models for search engine optimisation tasks.

Updated June 11, 2026

| # | Model | Provider | Score | AI Index | Context | Input / 1M | Output / 1M |
|---|-------|----------|-------|----------|---------|------------|-------------|
| 1 | Upstage: Solar Pro 3 | upstage | 82 | 25.9 | 128K | $0.1500 | $0.6000 |
| 2 | Qwen: Qwen3 30B A3B Thinking 2507 | qwen | 82 | 22.4 | 131K | $0.0800 | $0.4000 |
| 3 | Inception: Mercury 2 | inception | 82 | 32.8 | 128K | $0.2500 | $0.7500 |
| 4 | Qwen: Qwen3 30B A3B Instruct 2507 | qwen | 82 | 15 | 131K | $0.0482 | $0.1931 |
| 5 | Qwen: Qwen3.5-9B | qwen | 82 | 32.4 | 262K | $0.1000 | $0.1500 |
| 6 | Mistral: Mistral Small 3.2 24B | mistralai | 82 | 15.1 | 128K | $0.0750 | $0.2000 |
| 7 | Mistral: Mistral Small 4 | mistralai | 82 | 18.6 | 262K | $0.1500 | $0.6000 |
| 8 | DeepSeek: R1 0528 | deepseek | 82 | 16.4 | 164K | $0.5000 | $2.15 |
| 9 | inclusionAI: Ling-2.6-flash (free) | inclusionai | 82 | 26.2 | 262K | Free | Free |
| 10 | Mistral: Mistral Medium 3 | mistralai | 82 | 18.8 | 131K | $0.4000 | $2.00 |
| 11 | Qwen3.5 4B (Non-reasoning) | Alibaba | 82 | 22.6 | | $0.0300 | $0.1500 |
| 12 | OpenAI: GPT-4o-mini Search Preview | openai | 82 | 12.6 | 128K | $0.1500 | $0.6000 |
| 13 | Qwen2.5 Max | Alibaba | 82 | 16.3 | | $1.60 | $6.40 |
| 14 | Qwen: Qwen-Turbo | qwen | 82 | 12 | 131K | $0.0325 | $0.1300 |
| 15 | Qwen3 4B 2507 (Reasoning) | Alibaba | 82 | 18.2 | | Free | Free |
| 16 | Mistral: Mistral Small 3 | mistralai | 82 | 12.7 | 33K | $0.0500 | $0.0800 |
| 17 | Qwen3 4B (Reasoning) | Alibaba | 82 | 14.2 | | $0.1100 | $1.26 |
| 18 | Mistral: Mistral Small Creative | mistralai | 82 | 10.2 | 33K | $0.1000 | $0.3000 |
| 19 | Perplexity: Sonar | perplexity | 82 | 15.5 | 127K | $1.00 | $1.00 |
| 20 | inclusionAI: Ling-2.6-flash | inclusionai | 82 | 26.2 | 262K | $0.0100 | $0.0300 |
| 21 | Perplexity: Sonar Pro Search | perplexity | 82 | 15.5 | 200K | $3.00 | $15.00 |
| 22 | OpenAI: GPT-4o-mini | openai | 82 | 12.6 | 128K | $0.1500 | $0.6000 |
| 23 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google | 82 | 19.4 | 1M | $0.1000 | $0.4000 |

How we rank AI models

The Design for Online AI Model Leaderboard scores 577 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.
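To make the pricing columns concrete, here is a minimal Python sketch of how a per-token price could be turned into the "Input / 1M" and "Output / 1M" figures shown in the table. It is an illustration, not the leaderboard's actual pipeline. The record shape (an `id`, a `context_length`, and a `pricing` object with `prompt` and `completion` per-token prices as strings) is an assumption modelled on a typical OpenRouter model listing, and the sample values, `summarize` and `request_cost` are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_MILLION = 1_000_000


def per_million(price_per_token: str) -> Decimal:
    """Convert a per-token price string into a price per one million tokens."""
    return Decimal(price_per_token) * TOKENS_PER_MILLION


def summarize(record: dict) -> dict:
    """Reduce one model record (assumed shape) to the table's pricing columns."""
    pricing = record["pricing"]
    return {
        "id": record["id"],
        "context": record.get("context_length"),  # may be missing for some models
        "input_per_1m": per_million(pricing["prompt"]),
        "output_per_1m": per_million(pricing["completion"]),
    }


def request_cost(row: dict, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the dollar cost of one request from per-1M prices."""
    return (
        row["input_per_1m"] * input_tokens + row["output_per_1m"] * output_tokens
    ) / TOKENS_PER_MILLION


# Hypothetical sample record; the values are made up for illustration.
sample = {
    "id": "example/model",
    "context_length": 128000,
    "pricing": {"prompt": "0.00000015", "completion": "0.0000006"},
}

row = summarize(sample)
cost = request_cost(row, input_tokens=2000, output_tokens=500)
```

`Decimal` is used instead of `float` so that very small per-token prices do not pick up rounding noise when scaled. A model listed as Free would simply carry a price of `"0"` under this assumed shape.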

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
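As a rough illustration of that calculation, the Python sketch below combines four 0–10 grades into a 0–100 score using a weighted mean. The weights are hypothetical, since the actual weighting is not stated here, and `overall_score` is an illustrative name rather than part of any published tooling.

```python
# Hypothetical weights: the real weighting is not published in this document.
WEIGHTS = {
    "intelligence": 0.35,
    "technical": 0.25,
    "content": 0.25,
    "value": 0.15,
}


def overall_score(grades: dict, weights: dict = WEIGHTS) -> int:
    """Combine 0-10 dimension grades into a 0-100 overall score."""
    if set(grades) != set(weights):
        raise ValueError("grades must cover exactly the weighted dimensions")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade {grade} is outside 0-10")

    # Weighted mean on the 0-10 scale, normalised by the total weight
    # so the weights do not have to sum to exactly 1.
    total_weight = sum(weights.values())
    weighted_mean = sum(grades[k] * weights[k] for k in weights) / total_weight

    # Scale 0-10 up to 0-100 and round to a whole number for the table.
    return round(weighted_mean * 10)


example = overall_score(
    {"intelligence": 8, "technical": 8, "content": 9, "value": 7}
)
print(example)  # 81
```

Because the result is rounded to a whole number, models with different dimension grades can land on the same overall score, which is consistent with a long run of identical scores in a table.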

Where does the data come from?

The data comes from the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.