Google: Gemini 2.5 Flash Lite Preview 09-2025

Google: Gemini 2.5 Flash Lite Preview 09-2025

google · Released Sep 25, 2025
46
Our Score

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

$0.10 / 1M Input Price
$0.40 / 1M Output Price
1M tokens Context Window
65,536 tokens Max Output

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image + File + Audio + Video → Text
TokenizerGemini

Performance Indices

Source: Artificial Analysis

19.4 Intelligence Index
14.5 Coding Index
19 Agentic Index
46.7 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 65.1%
Graduate-level scientific reasoning
HLE 4.6%
Humanity's Last Exam
MMLU Pro 79.6%
Multi-task language understanding
LiveCodeBench 64.1%
Live coding evaluation
SciCode 28.5%
Scientific computing
AIME 2025 46.7%
Competition mathematics (2025)
IFBench 41.8%
Instruction following
LCR 48%
Long-context reasoning
TerminalBench Hard 7.6%
Agentic terminal tasks
τ²-Bench 30.4%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID google/gemini-2.5-flash-lite-preview-09-2025
Providergoogle
Model FamilyGemini 2
Release Date September 25, 2025
Context Length1,048,576 tokens
Max Completion65,536 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.10 $0.000100
Output $0.40 $0.000400

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

93.8%
Avg Uptime
403ms
Best Latency (TTFT)
206 tok/s
Best Throughput
2/2
Active Endpoints
Available via: Google AI Studio, Google

Leaderboard Categories