Meta: Llama 3.3 70B Instruct

meta-llama · Released Dec 6, 2024
Our Score: 40

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained, instruction-tuned generative model with 70B parameters (text in / text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Model Card

Input Price: $0.10 / 1M tokens
Output Price: $0.32 / 1M tokens
Context Window: 131,072 tokens
Max Output: 16,384 tokens
Parameters: 70B

Capabilities

Tool Use · Function Calling
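Since tool use is listed as a capability, a request can include an OpenAI-style tool definition, the format OpenRouter accepts for function calling. A minimal sketch; the function name and parameter schema here are illustrative, not from this page:

```python
# Hedged sketch of a tool (function-calling) definition in the OpenAI-style
# schema. `get_weather`, its description, and its parameters are hypothetical.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Such a definition would be passed in the `tools` list of a chat request.
print(weather_tool["function"]["name"])
```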

Architecture

Modality: Text → Text
Tokenizer: Llama3
Instruct Type: llama3
Parameters: 70B

Model Information

OpenRouter ID: meta-llama/llama-3.3-70b-instruct
Provider: meta-llama
Model Family: Llama 3
Release Date: December 6, 2024
Context Length: 131,072 tokens
Max Completion: 16,384 tokens
Status: Active
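The OpenRouter ID above is what a chat-completions request would name in its `model` field. A minimal payload sketch; the prompt text and `max_tokens` value are illustrative, and an API key would be required to actually send it:

```python
# Sketch of an OpenRouter chat-completions request body for this model.
# Only the model ID comes from the listing; everything else is an example.
payload = {
    "model": "meta-llama/llama-3.3-70b-instruct",  # OpenRouter ID from above
    "messages": [
        {"role": "user", "content": "Summarize the Llama 3 family in one sentence."}
    ],
    # Completions are capped at 16,384 tokens for this model; stay well under it.
    "max_tokens": 512,
}

print(payload["model"])  # → meta-llama/llama-3.3-70b-instruct
```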

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $0.10              | $0.000100
Output     | $0.32              | $0.000320
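Given these rates, the cost of a single request follows directly from its token counts. A minimal sketch; the helper name and example token counts are illustrative:

```python
# Per-million-token prices for this model, from the pricing table above.
INPUT_PRICE_PER_M = 0.10   # dollars per 1M input tokens
OUTPUT_PRICE_PER_M = 0.32  # dollars per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 1,000 input tokens and 500 output tokens.
print(round(estimate_cost(1_000, 500), 6))  # → 0.00026
```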

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 97.1%
Best Latency (TTFT): 179 ms
Best Throughput: 147 tok/s
Active Endpoints: 15/17
Available via: DeepInfra, Inceptron, Nebius, AkashML, Novita, Parasail, Crusoe, Cloudflare +6 more

Leaderboard Categories