Home > AI Models > Meta: Llama 3.2 11B Vision Instruct

Meta: Llama 3.2 11B Vision Instruct

Name: Meta: Llama 3.2 11B Vision Instruct Review
Item: Meta: Llama 3.2 11B Vision Instruct
Author: Design for Online Editorial

Meta: Llama 3.2 11B Vision Instruct

meta-llama · Released Sep 25, 2024 Professional

Intelligence #10 / 565

82.0 Our Score

Speed #167 / 262

74.1 tokens / sec

Input #295 / 565

$0.245 per 1M tokens

Output #192 / 565

$0.245 per 1M tokens

Context #228 / 565

131,072 tokens

Meta: Llama 3.2 11B Vision Instruct sits in the Professional tier on our leaderboard, ranked #10 of 565 published models on overall intelligence. At $0.245 input and $0.245 output per 1M tokens, it is among the most expensive on the market. It offers a standard large context window and supports vision.

Editorial notes

Llama 3.2 11B Vision brings multimodal input at a low price point, but its intelligence index of 8.7 and weak coding scores make it unsuitable for complex business reasoning or agentic tasks.

Assessed May 28, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Reasoning: No
Input
Output
Context: 131,072 tokens
Max output: 16,384 tokens
Tokenizer: Llama3
Released: Sep 25, 2024

Modality data from OpenRouter; may understate provider-native audio/video/image output.

Performance Profile

How Meta: Llama 3.2 11B Vision Instruct compares

Meta: Llama 3.2 11B Vision Instruct ranks #330 of 370 AI models we track for overall intelligence, #275 of 307 for coding, #265 of 282 for agentic tasks. Its 131K-token context window is larger than 60% of the models we list. At $0.25 per million input tokens it is cheaper than 48% of comparable models.

About Meta: Llama 3.2 11B Vision Instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and..

11B Parameters

Capabilities

Vision

Architecture Detail

Instruct Type llama3

Performance Indices

Source: Artificial Analysis

8.7 Intelligence Index

4.2 Coding Index

7.7 Agentic Index

1.7 Math Index

Benchmark Scores

GPQA Diamond 22.1% Graduate-level scientific reasoning

HLE 5.2% Humanity's Last Exam

MMLU Pro 46.4% Multi-task language understanding

MATH 500 51.6% Mathematical problem-solving

AIME 9.3% Competition mathematics

AIME 2025 1.7% Competition mathematics (2025)

SciCode 11.2% Scientific computing

LiveCodeBench 11% Live coding evaluation

TerminalBench Hard 0.8% Agentic terminal tasks

τ²-Bench 14.6% Conversational agent benchmark

IFBench 30.4% Instruction following

LCR 11.7% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does Meta: Llama 3.2 11B Vision Instruct stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID	`meta-llama/llama-3.2-11b-vision-instruct`
Provider	meta-llama
Model Family	Llama 3
Release Date	September 25, 2024
Context Length	131,072 tokens
Max Completion	16,384 tokens
Status	Active

Pricing

Token Type	Cost per 1M tokens	Cost per 1K tokens
Input	$0.25	$0.000245
Output	$0.25	$0.000245

Live Performance

Live endpoint metrics, refreshed every 30 minutes.

100%

Avg Uptime

368ms

Best Latency (TTFT)

37 tok/s

Best Throughput

1/1

Active Endpoints

Available via: DeepInfra

Leaderboard Categories

Content Writing

External Resources

View on OpenRouter API access, playground, and provider details

API Quickstart Sample code and integration guide

Frequently asked questions about Meta: Llama 3.2 11B Vision Instruct

How much does Meta: Llama 3.2 11B Vision Instruct cost?

Meta: Llama 3.2 11B Vision Instruct costs $0.25 per million input tokens and $0.25 per million output tokens.

What is the context window of Meta: Llama 3.2 11B Vision Instruct?

Meta: Llama 3.2 11B Vision Instruct has a context window of 131,072 tokens (131K).

Is Meta: Llama 3.2 11B Vision Instruct good for coding?

On our coding benchmark index, Meta: Llama 3.2 11B Vision Instruct ranks #275 of 307 models, placing it in the broader range of the field for code generation and debugging.

What can Meta: Llama 3.2 11B Vision Instruct do?

Meta: Llama 3.2 11B Vision Instruct supports image/vision input.

Who created Meta: Llama 3.2 11B Vision Instruct?

Meta: Llama 3.2 11B Vision Instruct is developed by Meta and was released on September 25, 2024.

Meta: Llama 3.2 11B Vision Instruct

Meta: Llama 3.2 11B Vision Instruct

Analysis Summary

Performance Profile

How Meta: Llama 3.2 11B Vision Instruct compares

About Meta: Llama 3.2 11B Vision Instruct

Capabilities

Architecture Detail

Performance Indices

Benchmark Scores

Intelligence

Technical

Content

Model Information

Pricing

Live Performance

Leaderboard Categories

External Resources

Frequently asked questions about Meta: Llama 3.2 11B Vision Instruct

How much does Meta: Llama 3.2 11B Vision Instruct cost?

What is the context window of Meta: Llama 3.2 11B Vision Instruct?

Is Meta: Llama 3.2 11B Vision Instruct good for coding?

What can Meta: Llama 3.2 11B Vision Instruct do?

Who created Meta: Llama 3.2 11B Vision Instruct?

Meta: Llama 3.2 11B Vision Instruct

Performance Profile

How Meta: Llama 3.2 11B Vision Instruct compares

About Meta: Llama 3.2 11B Vision Instruct

Capabilities

Architecture Detail

Performance Indices

Benchmark Scores

Intelligence

Technical

Content

Model Information

Pricing

Live Performance

Leaderboard Categories

External Resources

Explore Related Models

Frequently asked questions about Meta: Llama 3.2 11B Vision Instruct

How much does Meta: Llama 3.2 11B Vision Instruct cost?

What is the context window of Meta: Llama 3.2 11B Vision Instruct?

Is Meta: Llama 3.2 11B Vision Instruct good for coding?

What can Meta: Llama 3.2 11B Vision Instruct do?

Who created Meta: Llama 3.2 11B Vision Instruct?