Google: Gemini 3.1 Flash Lite Preview
Google's Gemini 3.1 Flash Lite Preview offers broad multimodal capability — including audio and video input — with solid reasoning benchmarks and a massive 1M token context window at a low price, making it a strong value option for content and document workflows.
Assessment date: March 12, 2026
Our methodology takes into account a range of factors including pricing, functionality, capabilities, benchmark performance, and real-world applicability. Rankings are reviewed and updated regularly as new models are released. Issues with our rankings? Contact us
Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.
Capabilities
Architecture
| Modality | Text + Image + File + Audio + Video → Text |
| Tokenizer | Gemini |
Performance Indices
Source: Artificial Analysis
This model was released recently. Independent benchmark evaluations are typically completed within days of release — these figures are preliminary and are likely to be updated as testing is finalised.
Benchmark Scores
Evaluations
Benchmark data from Artificial Analysis and Hugging Face
Model Information
Pricing
| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|---|---|---|
| Input | $0.25 | $0.000250 |
| Output | $1.50 | $0.001500 |
Live Performance
Live endpoint metrics — refreshed every 30 minutes.
Leaderboard Categories
External Resources
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: March 13, 2026 7:52 pm