Meta: Llama 3.3 70B Instruct (free)

meta-llama · Released Dec 6, 2024
Our Score: 36

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in / text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

Context Window: 128,000 tokens
Max Output: 128,000 tokens
Parameters: 70B

Capabilities

Tool Use · Function Calling
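Tool use on OpenRouter follows the OpenAI-compatible `tools` schema. A minimal sketch of a function-calling request body; the `get_weather` tool and its parameters are hypothetical, for illustration only:

```python
import json

# Hypothetical tool definition in the OpenAI-compatible function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not a real API
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Request body pairing the tool list with the model's OpenRouter ID.
payload = {
    "model": "meta-llama/llama-3.3-70b-instruct:free",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

print(json.dumps(payload, indent=2))
```

When the model decides to call the tool, the response contains a `tool_calls` entry whose arguments the caller executes before sending the result back in a follow-up message.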

Architecture

Modality: Text → Text
Tokenizer: Llama3
Instruct Type: llama3
Parameters: 70B

Performance Indices

Source: Artificial Analysis

Intelligence Index: 14.5
Coding Index: 10.7
Agentic Index: 14.8
Math Index: 7.7

Benchmark Scores

GPQA Diamond: 49.8% (graduate-level scientific reasoning)
HLE: 4% (Humanity's Last Exam)
MMLU Pro: 71.3% (multi-task language understanding)
LiveCodeBench: 28.8% (live coding evaluation)
SciCode: 26% (scientific computing)
MATH 500: 77.3% (mathematical problem-solving)
AIME: 30% (competition mathematics)
AIME 2025: 7.7% (competition mathematics, 2025)
IFBench: 47.1% (instruction following)
LCR: 15% (long-context reasoning)
TerminalBench Hard: 3% (agentic terminal tasks)
τ²-Bench: 26.6% (conversational agent benchmark)

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: meta-llama/llama-3.3-70b-instruct:free
Provider: meta-llama
Model Family: Llama 3
Release Date: December 6, 2024
Context Length: 128,000 tokens
Max Completion: 128,000 tokens
Status: Active
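With the OpenRouter ID above, the model can be queried through OpenRouter's OpenAI-compatible chat completions endpoint. A minimal stdlib-only sketch; it assumes an `OPENROUTER_API_KEY` environment variable and the documented endpoint URL, and only builds the request rather than sending it:

```python
import json
import os
import urllib.request

def build_request(prompt: str) -> urllib.request.Request:
    """Build an HTTP request for OpenRouter's chat completions endpoint."""
    body = json.dumps({
        "model": "meta-llama/llama-3.3-70b-instruct:free",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            # Key is assumed to be set in the environment; empty string otherwise.
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Say hello in Thai.")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` returns a JSON body whose `choices[0].message.content` holds the model's reply.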

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 99.4%
Best Latency (TTFT): 670ms
Best Throughput: 34 tok/s
Active Endpoints: 1/2
Available via: OpenInference, Venice