Google: Gemini 2.5 Flash Lite

Google: Gemini 2.5 Flash Lite

google · Released Jul 22, 2025 Efficient
46.1
Our Score

Performance Profile

Intelligence2.9Technical1.8Value8.3Content4
Intelligence 2.9/10
Technical 1.8/10
Content 4/10
Value 8.3/10

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

$0.10 / 1M
Input Price
$0.40 / 1M
Output Price
1M tokens
Context Window
65,535 tokens
Max Output

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image + File + Audio + Video → Text
TokenizerGemini

Performance Indices

Source: Artificial Analysis

12.7 Intelligence Index
7.4 Coding Index
10.7 Agentic Index
35.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 47.4% Graduate-level scientific reasoning
HLE 3.7% Humanity's Last Exam
MMLU Pro 72.4% Multi-task language understanding
MATH 500 92.6% Mathematical problem-solving
AIME 50% Competition mathematics
AIME 2025 35.3% Competition mathematics (2025)
SciCode 17.7% Scientific computing

Technical

LiveCodeBench 40% Live coding evaluation
TerminalBench Hard 2.3% Agentic terminal tasks
τ²-Bench 19% Conversational agent benchmark

Content

IFBench 31.5% Instruction following
LCR 31.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID google/gemini-2.5-flash-lite
Providergoogle
Model FamilyGemini 2
Release Date July 22, 2025
Context Length1,048,576 tokens
Max Completion65,535 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.10 $0.000100
Output $0.40 $0.000400

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

98.2%
Avg Uptime
496ms
Best Latency (TTFT)
128 tok/s
Best Throughput
2/2
Active Endpoints
Available via: Google, Google AI Studio

Leaderboard Categories