AllenAI: Molmo2 8B

AllenAI: Molmo2 8B

allenai · Released Jan 9, 2026
22
Our Score

Molmo2-8B is an open vision-language model developed by the Allen Institute for AI (Ai2) as part of the Molmo2 family, supporting image, video, and multi-image understanding and grounding. It is based on Qwen3-8B and uses SigLIP 2 as its vision backbone, outperforming other open-weight, open-data models on short videos, counting, and captioning, while remaining competitive on long-video tasks.

$0.20 / 1M Input Price
$0.20 / 1M Output Price
36,864 tokens Context Window
36,864 tokens Max Output
8B Parameters

Capabilities

Vision

Architecture

ModalityText + Image + Video → Text
TokenizerOther
Parameters8B

Performance Indices

Source: Artificial Analysis

7.3 Intelligence Index
4.4 Coding Index

Benchmark Scores

Evaluations

GPQA Diamond 42.5%
Graduate-level scientific reasoning
HLE 4.4%
Humanity's Last Exam
SciCode 13.3%
Scientific computing
IFBench 26.9%
Instruction following

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID allenai/molmo-2-8b
Providerallenai
Release Date January 9, 2026
Context Length36,864 tokens
Max Completion36,864 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.20 $0.000200
Output $0.20 $0.000200

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
428ms
Best Latency (TTFT)
9 tok/s
Best Throughput
1/1
Active Endpoints
Available via: Parasail