IBM: Granite 4.1 8B

IBM: Granite 4.1 8B

ibm-granite · Released Apr 30, 2026 Efficient New
Intelligence #222 / 557
38.2 Our Score
Speed #80 / 259
128.6 tokens / sec
Input #152 / 560
$0.050 per 1M tokens
Output #136 / 560
$0.100 per 1M tokens
Context #222 / 560
131,072 tokens

Analysis Summary

IBM: Granite 4.1 8B sits in the Efficient tier on our leaderboard, ranked #222 of 557 published models on overall intelligence. At $0.050 input and $0.100 output per 1M tokens, it is among the most expensive on the market. It offers a standard large context window and supports tool use and function calling.

Editorial notes

IBM Granite 4.1 8B is a low-cost compact model with tool use, but benchmark scores are very limited across reasoning and coding, restricting it to simple automation tasks.

Assessed May 14, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence2.3Technical2.5Value8Content2.5
Intelligence 2.3/10
Technical 2.5/10
Content 2.5/10
Value 8/10

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks..

8B Parameters

Capabilities

Tool Use Function Calling

Performance Indices

Source: Artificial Analysis

12.4 Intelligence Index
7.3 Coding Index
27.8 Agentic Index

Benchmark Scores

Intelligence

GPQA Diamond 43.3% Graduate-level scientific reasoning
HLE 3.8% Humanity's Last Exam
SciCode 21.8% Scientific computing

Technical

τ²-Bench 27.8% Conversational agent benchmark

Content

IFBench 38.6% Instruction following
LCR 12% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does IBM: Granite 4.1 8B stack up?

Compare side-by-side with other efficient models.

Compare Models

Model Information

OpenRouter ID ibm-granite/granite-4.1-8b
Provideribm-granite
Release Date April 30, 2026
Context Length131,072 tokens
Max Completion131,072 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.05 $0.000050
Output $0.10 $0.000100

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
203ms
Best Latency (TTFT)
79 tok/s
Best Throughput
1/1
Active Endpoints
Available via: WandB

Leaderboard Categories