SEO

Best models for search engine optimisation tasks.

Updated June 9, 2026

| # | Model | Provider | Score | AI Index | Context | Input / 1M tokens | Output / 1M tokens |
|---|---|---|---|---|---|---|---|
| 1 | OpenAI: GPT-4o-mini Search Preview | openai | 82 | 12.6 | 128K | $0.1500 | $0.6000 |
| 2 | Qwen2.5 Max | Alibaba | 82 | 16.3 | | $1.60 | $6.40 |
| 3 | Qwen: Qwen-Turbo | qwen | 82 | 12 | 131K | $0.0325 | $0.1300 |
| 4 | Qwen3 4B 2507 (Reasoning) | Alibaba | 82 | 18.2 | | Free | Free |
| 5 | Mistral: Mistral Small 3 | mistralai | 82 | 12.7 | 33K | $0.0500 | $0.0800 |
| 6 | Qwen3 4B (Reasoning) | Alibaba | 82 | 14.2 | | $0.1100 | $1.26 |
| 7 | Mistral: Mistral Small Creative | mistralai | 82 | 10.2 | 33K | $0.1000 | $0.3000 |
| 8 | Perplexity: Sonar | perplexity | 82 | 15.5 | 127K | $1.00 | $1.00 |
| 9 | inclusionAI: Ling-2.6-flash | inclusionai | 82 | 26.2 | 262K | $0.0100 | $0.0300 |
| 10 | Perplexity: Sonar Pro Search | perplexity | 82 | 15.5 | 200K | $3.00 | $15.00 |
| 11 | OpenAI: GPT-4o-mini | openai | 82 | 12.6 | 128K | $0.1500 | $0.6000 |
| 12 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | google | 82 | 31.1 | 1M | $0.1000 | $0.4000 |
| 13 | Upstage: Solar Pro 3 | upstage | 82 | 25.9 | 128K | $0.1500 | $0.6000 |
| 14 | Qwen: Qwen3 30B A3B Thinking 2507 | qwen | 82 | 22.4 | 131K | $0.0800 | $0.4000 |
| 15 | Inception: Mercury 2 | inception | 82 | 32.8 | 128K | $0.2500 | $0.7500 |
| 16 | Qwen: Qwen3 30B A3B Instruct 2507 | qwen | 82 | 15 | 131K | $0.0482 | $0.1931 |
| 17 | Qwen: Qwen3.5-9B | qwen | 82 | 32.4 | 262K | $0.1000 | $0.1500 |
| 18 | Mistral: Mistral Small 3.2 24B | mistralai | 82 | 15.1 | 128K | $0.0750 | $0.2000 |
| 19 | Mistral: Mistral Small 4 | mistralai | 82 | 27.8 | 262K | $0.1500 | $0.6000 |
| 20 | DeepSeek: R1 0528 | deepseek | 82 | 16.4 | 164K | $0.5000 | $2.15 |
| 21 | inclusionAI: Ling-2.6-flash (free) | inclusionai | 82 | 26.2 | 262K | Free | Free |
| 22 | Mistral: Mistral Medium 3 | mistralai | 82 | 18.8 | 131K | $0.4000 | $2.00 |
| 23 | Qwen3.5 4B (Non-reasoning) | Alibaba | 82 | 22.6 | | $0.0300 | $0.1500 |
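
The Input / 1M and Output / 1M columns read as prices per one million tokens, so the cost of a single request is simple arithmetic. Here is a minimal Python sketch of that calculation. The function name `request_cost` and the token counts are hypothetical; the two prices are those listed for OpenAI: GPT-4o-mini in the table.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the dollar cost of one request from per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical workload: 2,000 prompt tokens and 500 completion tokens,
# priced at $0.15 input and $0.60 output per 1M tokens.
cost = request_cost(2_000, 500, 0.15, 0.60)
print(f"${cost:.6f}")
```

Multiplying the per-request figure by an expected monthly request volume gives a rough budget for comparing rows that share the same score.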

How we rank AI models

The Design for Online AI Model Leaderboard scores 576 models on a single 0–100 scale built from four weighted dimensions: intelligence (reasoning and knowledge benchmarks), technical capability (coding and tool use), content quality (writing and instruction-following) and value (capability per dollar).

Underlying data is aggregated from the OpenRouter API for pricing and availability, Artificial Analysis for intelligence, coding and agentic indices, and the Hugging Face Open LLM Leaderboard for open-model benchmarks. We refresh these sources daily and layer our own editorial review on top, so a model that benchmarks well but is impractical to deploy will not automatically top the table.

Models are grouped into tiers (Frontier, Professional, Specialist, Efficient, Emerging and Legacy) to make like-for-like comparison easier, and newly released models are flagged so you can see what has just landed.

Leaderboard FAQ

How often is the leaderboard updated?

Pricing, availability and benchmark data are synced daily from our sources, and editorial scores are reviewed whenever a significant new model is released.

How is the overall score calculated?

Each model is graded 0–10 on intelligence, technical capability, content quality and value; those dimensions are weighted and combined into the 0–100 overall score used to rank the table.
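
The calculation described above can be sketched in Python. The weights here are purely illustrative assumptions, since the actual values are not stated on this page, and the function name `overall_score` and the example grades are likewise hypothetical. Only the four dimensions, the 0–10 grades and the 0–100 result come from the description.

```python
from typing import Mapping

# Hypothetical weights: the real values are not given in the source.
WEIGHTS = {
    "intelligence": 0.30,
    "technical": 0.25,
    "content": 0.25,
    "value": 0.20,
}


def overall_score(grades: Mapping[str, float]) -> float:
    """Combine 0-10 dimension grades into a 0-100 weighted score."""
    if set(grades) != set(WEIGHTS):
        raise ValueError(f"expected grades for {sorted(WEIGHTS)}")
    for name, grade in grades.items():
        if not 0 <= grade <= 10:
            raise ValueError(f"{name} grade {grade} is outside 0-10")
    total_weight = sum(WEIGHTS.values())
    weighted_mean = sum(WEIGHTS[d] * grades[d] for d in WEIGHTS) / total_weight
    return round(weighted_mean * 10, 1)  # rescale the 0-10 mean to 0-100


# Hypothetical grades for a single model.
example = {"intelligence": 7.0, "technical": 8.0, "content": 9.0, "value": 9.0}
print(overall_score(example))
```

Dividing by the sum of the weights keeps the result on the 0–100 scale even if the assumed weights do not add up to exactly 1.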

Where does the data come from?

The data comes from the OpenRouter API, Artificial Analysis and the Hugging Face Open LLM Leaderboard, combined with hands-on editorial testing by the Design for Online team.