Meta: Llama 3.3 70B Instruct

meta-llama · Released Dec 6, 2024
Our Score: 40

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained, instruction-tuned generative model with 70B parameters (text in / text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Model Card

Input Price: $0.10 / 1M tokens
Output Price: $0.32 / 1M tokens
Context Window: 131,072 tokens
Max Output: 16,384 tokens
Parameters: 70B

Capabilities

Tool Use · Function Calling
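Since tool use is listed as a capability, a request can include an OpenAI-style tool definition, the format OpenRouter accepts for function calling. A minimal sketch; the function name and parameter schema here are illustrative, not from this page:

```python
# Hedged sketch of a tool (function-calling) definition in the OpenAI-style
# schema. `get_weather`, its description, and its parameters are hypothetical.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Such a definition would be passed in the `tools` list of a chat request.
print(weather_tool["function"]["name"])
```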

Architecture

Modality: Text → Text
Tokenizer: Llama3
Instruct Type: llama3
Parameters: 70B

Model Information

OpenRouter ID: meta-llama/llama-3.3-70b-instruct
Provider: meta-llama
Model Family: Llama 3
Release Date: December 6, 2024
Context Length: 131,072 tokens
Max Completion: 16,384 tokens
Status: Active
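The OpenRouter ID above is what a chat-completions request would name in its `model` field. A minimal payload sketch; the prompt text and `max_tokens` value are illustrative, and an API key would be required to actually send it:

```python
# Sketch of an OpenRouter chat-completions request body for this model.
# Only the model ID comes from the listing; everything else is an example.
payload = {
    "model": "meta-llama/llama-3.3-70b-instruct",  # OpenRouter ID from above
    "messages": [
        {"role": "user", "content": "Summarize the Llama 3 family in one sentence."}
    ],
    # Completions are capped at 16,384 tokens for this model; stay well under it.
    "max_tokens": 512,
}

print(payload["model"])  # → meta-llama/llama-3.3-70b-instruct
```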

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $0.10              | $0.000100
Output     | $0.32              | $0.000320
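Given these rates, the cost of a single request follows directly from its token counts. A minimal sketch; the helper name and example token counts are illustrative:

```python
# Per-million-token prices for this model, from the pricing table above.
INPUT_PRICE_PER_M = 0.10   # dollars per 1M input tokens
OUTPUT_PRICE_PER_M = 0.32  # dollars per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 1,000 input tokens and 500 output tokens.
print(round(estimate_cost(1_000, 500), 6))  # → 0.00026
```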

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 97.1%
Best Latency (TTFT): 179 ms
Best Throughput: 147 tok/s
Active Endpoints: 15/17
Available via: DeepInfra, Inceptron, Nebius, AkashML, Novita, Parasail, Crusoe, Cloudflare +6 more

Leaderboard Categories