Home > AI Models > Google: Gemma 3 4B

Google: Gemma 3 4B

Name: Google: Gemma 3 4B Review
Item: Google: Gemma 3 4B
Author: Design for Online Editorial

NEWKimi K3in at #9 NEWKAT-Coder-Air V2.5in at #560 NEWKAT-Coder-Pro V2.5in at #568 NEWMuse Spark 1.1in at #392 NEWUncensoredin at #487 NEWGPT-5.6 Terrain at #11 NEWGPT-5.6 Sol Proin at #416 NEWGPT-5.6 Solin at #2

Google: Gemma 3 4B

google · Released Mar 13, 2025

Intelligence #286 / 612

31.6 our score

Speed #276 / 287

29.8 tok/s

Input Price #158 / 612

$0.050 per 1M tokens

Output Price #144 / 612

$0.100 per 1M tokens

Context #262 / 612

131,072 tokens

Gemma 3 4B is Google's smallest Gemma 3 variant, offering vision support and a large context window at a very low price. Its reasoning, coding and agentic benchmarks are among the lowest in its generation, reflecting its compact size.

It is best suited to lightweight, high-volume tasks such as basic tagging, short-form content drafts, or simple image description where cost per call is the deciding factor. It is not appropriate for coding assistance or any task requiring careful reasoning.

Businesses running large-scale, low-complexity pipelines may find its price attractive, but should not expect quality comparable to larger models.

Assessed July 10, 2026

Editorial notes

Gemma 3 4B from Google is a small, low-cost vision model suited only to simple tasks given its limited reasoning and coding scores.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

DFO Verdict

Gemma 3 4B from Google is a small, low-cost vision model suited only to simple tasks given its limited reasoning and coding scores.

#286 of 612 overall

How Google: Gemma 3 4B compares

Google: Gemma 3 4B ranks #386 of 393 AI models we track for overall intelligence, #154 of 157 for coding, #285 of 300 for agentic tasks. Its 131K-token context window is larger than 57% of the models we list. At $0.05 per million input tokens it is cheaper than 74% of comparable models.

Position in the field

Intelligence: smarter than 53% of models #286

Speed: faster than 4% of models #276

Price: cheaper than 74% of models #158

Context: larger than 57% of models #262

worst in fieldmedianbest in field

Price vs frontier peers · $ per 1M tokens

Google: Gemma 3 4B $0.05 in $0.10 out

Anthropic: Claude Fable 5 $10.00 in $50.00 out

Anthropic: Claude Opus 4.8 $5.00 in $25.00 out

Google: Gemini 3.1 Pro Preview $2.00 in $12.00 out

Dark bar = input · light bar = output, scaled to the priciest peer.

Context window vs peers · tokens

Google: Gemini 3.1 Pro Preview 1M

Anthropic: Claude Fable 5 1M

Anthropic: Claude Opus 4.8 1M

Google: Gemma 3 4B 131K

1M tokens ≈ 8 full-length novels or ~2,500 pages of business documents in a single request.

Performance profile

Strongest on value. The pulled-in intelligence corner is the trade-off, and if the shape matters more than the price, this is your model.

Compare shapes side-by-side →

Pricing

Token Type	Cost per 1M tokens	Cost per 1K tokens
Input	$0.05	$0.000050
Output	$0.10	$0.000100

What would Google: Gemma 3 4B cost your business?

Pick the job that looks most like yours, then fine-tune with the sliders. Estimates update live.

A website chatbot handling around 100 customer conversations a day, a few short messages each.

Requests per month 3,000

One request is one message, email, draft or automation call.

Size of each request 1,200 tokens

$0/mo Google: Gemma 3 4B

$0/mo Anthropic: Claude Fable 5

$0/mo MoonshotAI: Kimi K3 · best value

Full calculator with 612 models → Price Calculator

DFO AI AUTOMATION

These numbers get smaller with the right architecture.

We route routine calls to cheap models and save Google: Gemma 3 4B for the hard ones. Most clients cut their estimate by 60-80%.

Talk to our team

About Google: Gemma 3 4B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,..

Frequently asked questions about Google: Gemma 3 4B

How much does Google: Gemma 3 4B cost?

Google: Gemma 3 4B costs $0.05 per million input tokens and $0.10 per million output tokens.

What is the context window of Google: Gemma 3 4B?

Google: Gemma 3 4B has a context window of 131,072 tokens (131K).

Is Google: Gemma 3 4B good for coding?

On our coding benchmark index, Google: Gemma 3 4B ranks #154 of 157 models, placing it in the broader range of the field for code generation and debugging.

What can Google: Gemma 3 4B do?

Google: Gemma 3 4B supports image/vision input.

Who created Google: Gemma 3 4B?

Google: Gemma 3 4B is developed by Google and was released on March 13, 2025.

Performance profile

Intelligence 0.2

Technical 0.3

Content 3

Value 8

Reasoning: No
Input
Output
Context: 131,072 tokens
Max output: 16,384 tokens
Tokenizer: Gemini
Released: Mar 13, 2025

Modality data from OpenRouter; may understate provider-native audio/video/image output.

Model information

Provider google

OpenRouter ID google/gemma-3-4b-it

Status Active

Capabilities

Vision

Live performance · 30 min refresh

100% Avg uptime

506ms Best latency

20 tok/s Best throughput

1/1 Active endpoints

External resources View on OpenRouter API access, playground & provider details API Quickstart Sample code and integration guide

Data sourced from the OpenRouter API, Artificial Analysis, the Hugging Face Open LLM Leaderboard and our own internal testing. Scores are editorially curated by our team.

Last updated: July 19, 2026 10:00 am

Issues with our rankings? Contact us

Google: Gemma 3 4B

DFO Verdict

How Google: Gemma 3 4B compares

Pricing

What would Google: Gemma 3 4B cost your business?

About Google: Gemma 3 4B

Explore Related Models

Frequently asked questions about Google: Gemma 3 4B

How much does Google: Gemma 3 4B cost?

What is the context window of Google: Gemma 3 4B?

Is Google: Gemma 3 4B good for coding?

What can Google: Gemma 3 4B do?

Who created Google: Gemma 3 4B?