Meta: Llama 4 Scout

meta-llama · Released Apr 5, 2025
Our Score: 42

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta that activates 17 billion of its 109 billion total parameters per token. It accepts native multimodal input (text and image) and produces text and code output across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout is built from 16 experts (the "16E" in its name) and has a native context length of up to 10 million tokens, though hosted endpoints such as the one listed here typically expose a smaller window. Built for efficient local or commercial deployment, it was trained on a corpus of roughly 40 trillion tokens, uses early fusion to integrate modalities seamlessly, and is instruction-tuned for multilingual chat, captioning, and image-understanding tasks. Released under the Llama 4 Community License, it has a knowledge cutoff of August 2024 and launched publicly on April 5, 2025.
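The MoE design described above can be sketched as a toy router: a learned gate picks a small subset of experts for each token, so only a fraction of the total parameters is active per forward pass. The following is a minimal illustration only; the layer sizes, top-1 routing, and always-on shared expert are illustrative assumptions, not Meta's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS = 8, 16          # toy hidden size; 16 experts, as in Scout's "16E"

# One tiny "expert" per slot: a single weight matrix each, for illustration.
experts = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_EXPERTS)]
shared_expert = rng.standard_normal((D, D)) * 0.1   # always-on expert (assumption)
gate = rng.standard_normal((D, N_EXPERTS)) * 0.1    # learned router weights

def moe_layer(x):
    """Route a single token vector to its top-1 expert plus the shared expert."""
    logits = x @ gate
    chosen = int(np.argmax(logits))                          # top-1 routing
    weight = np.exp(logits[chosen]) / np.exp(logits).sum()   # softmax prob of winner
    return weight * (x @ experts[chosen]) + x @ shared_expert, chosen

token = rng.standard_normal(D)
out, chosen = moe_layer(token)
print(f"routed to expert {chosen}; output shape {out.shape}")
# Only 2 of the 17 weight matrices are touched, so far fewer parameters
# are active per token than the model contains in total.
```

The same principle scales up to Scout's real ratio: 17B active parameters out of 109B total.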

Input Price: $0.08 / 1M tokens
Output Price: $0.30 / 1M tokens
Context Window: 327,680 tokens
Max Output: 16,384 tokens

Capabilities

Tool Use Function Calling Vision

Architecture

Modality: Text + Image → Text
Tokenizer: Llama4

Performance Indices

Source: Artificial Analysis

Intelligence Index: 13.5
Coding Index: 6.7
Agentic Index: 8.5
Math Index: 14

Benchmark Scores

Evaluations

GPQA Diamond (graduate-level scientific reasoning): 58.7%
HLE (Humanity's Last Exam): 4.3%
MMLU Pro (multi-task language understanding): 75.2%
LiveCodeBench (live coding evaluation): 29.9%
SciCode (scientific computing): 17.0%
MATH 500 (mathematical problem-solving): 84.4%
AIME (competition mathematics): 28.3%
AIME 2025 (competition mathematics, 2025): 14.0%
IFBench (instruction following): 39.5%
LCR (long-context reasoning): 25.8%
TerminalBench Hard (agentic terminal tasks): 1.5%
τ²-Bench (conversational agent benchmark): 15.5%

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: meta-llama/llama-4-scout
Provider: meta-llama
Model Family: Llama 4
Release Date: April 5, 2025
Context Length: 327,680 tokens
Max Completion: 16,384 tokens
Status: Active

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $0.08              | $0.000080
Output     | $0.30              | $0.000300
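The per-token rates above translate into request costs as follows; a minimal sketch using the listed prices (the example token counts are made up):

```python
INPUT_PER_M = 0.08   # $ per 1M input tokens (from the pricing table)
OUTPUT_PER_M = 0.30  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed Llama 4 Scout rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 50k-token prompt with a 2k-token completion:
cost = request_cost(50_000, 2_000)
print(f"${cost:.4f}")  # 50_000*0.08/1e6 + 2_000*0.30/1e6 = $0.0046
```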

Live Performance

Live endpoint metrics, refreshed every 30 minutes.

Avg Uptime: 99.8%
Best Latency (TTFT): 93 ms
Best Throughput: 36 tok/s
Active Endpoints: 4/4
Available via: DeepInfra, Groq, Novita, Google
