Meta: Llama 3.1 8B Instruct

Meta: Llama 3.1 8B Instruct

meta-llama · Released Jul 23, 2024 Efficient
Intelligence #252 / 544
33.9 Our Score
Speed #40 / 252
180.0 tokens / sec
Input #120 / 544
$0.020 per 1M tokens
Output #120 / 544
$0.050 per 1M tokens
Context #396 / 544
16,384 tokens

Analysis Summary

Meta: Llama 3.1 8B Instruct sits in the Efficient tier on our leaderboard, ranked #252 of 544 published models on overall intelligence. At $0.020 input and $0.050 output per 1M tokens, it is among the most expensive on the market. It offers a compact context window and supports tool use and function calling.

Editorial notes

Llama 3.1 8B Instruct is an ultra-cheap open-weight model from Meta with tool use support, but limited reasoning and coding capability make it suitable only for lightweight tasks.

Assessed May 5, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence2.3Technical1.2Value7.5Content3
Intelligence 2.3/10
Technical 1.2/10
Content 3/10
Value 7.5/10

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast and efficient. It has demonstrated strong performance compared to..

8B Parameters

Capabilities

Tool Use Function Calling

Architecture Detail

Instruct Typellama3

Performance Indices

Source: Artificial Analysis

11.8 Intelligence Index
4.9 Coding Index
8.6 Agentic Index
4.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 25.9% Graduate-level scientific reasoning
HLE 5.1% Humanity's Last Exam
MMLU Pro 47.6% Multi-task language understanding
MATH 500 51.9% Mathematical problem-solving
AIME 7.7% Competition mathematics
AIME 2025 4.3% Competition mathematics (2025)
SciCode 13.2% Scientific computing

Technical

LiveCodeBench 11.6% Live coding evaluation
TerminalBench Hard 0.8% Agentic terminal tasks
τ²-Bench 16.4% Conversational agent benchmark

Content

IFBench 28.6% Instruction following
LCR 15.7% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does Meta: Llama 3.1 8B Instruct stack up?

Compare side-by-side with other efficient models.

Compare Models

Model Information

OpenRouter ID meta-llama/llama-3.1-8b-instruct
Providermeta-llama
Model FamilyLlama 3
Release Date July 23, 2024
Context Length16,384 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.02 $0.000020
Output $0.05 $0.000050

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

88.5%
Avg Uptime
97ms
Best Latency (TTFT)
323 tok/s
Best Throughput
7/8
Active Endpoints
Available via: Novita, DeepInfra, Nebius, Groq, Friendli, Cerebras, Cloudflare, WandB

Leaderboard Categories