xAI: Grok 4

xAI: Grok 4

x-ai · Released Jul 9, 2025 Professional
Intelligence #14 / 590
82.0 Our Score
Speed #248 / 278
41.6 tokens / sec
Input #529 / 590
$3.00 per 1M tokens
Output #537 / 590
$15.00 per 1M tokens
Context #177 / 590
256,000 tokens

Analysis Summary

Grok 4 is xAI's flagship reasoning model, and its benchmark profile is striking: math index at 92.7, livecodebench at 0.819, GPQA at 0.877, and HLE at 0.239 place it among the strongest performers in the field on technical tasks. The agentic index of 56.4 and tau2 of 0.749 confirm reliable multi-step tool use. Vision, function calling, and a 256K context window round out a very capable feature set.

For businesses, Grok 4 is a strong fit for autonomous coding agents, complex mathematical or scientific analysis, and long-document reasoning. The combination of vision and tool use makes it versatile across content, data, and engineering workflows. Its main constraint is cost: at $3 input and $15 output, it is expensive and best reserved for high-value tasks.

Teams that need frontier-level coding and reasoning performance, and can justify the price, will find Grok 4 a compelling option. Pair it with a cheaper model for routine or high-volume calls to manage spend effectively.

Assessed June 30, 2026

Editorial notes

xAI Grok 4 delivers exceptional coding and math performance, strong agentic capability, vision support, and a 256K context window, at a premium price point.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence5.5Technical7.3Value6Content7
Intelligence 5.5/10
Technical 7.3/10
Content 7/10
Value 6/10

How xAI: Grok 4 compares

XAI: Grok 4 ranks #63 of 385 AI models we track for overall intelligence, #72 of 293 for agentic tasks. Its 256K-token context window is larger than 70% of the models we list. At $3.00 per million input tokens it is cheaper than 10% of comparable models.

About xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not..

Capabilities

Tool Use Function Calling Vision

Performance Indices

Source: Artificial Analysis

33.3 Intelligence Index
56.4 Agentic Index
92.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 87.7% Graduate-level scientific reasoning
HLE 23.9% Humanity's Last Exam
MMLU Pro 86.6% Multi-task language understanding
MATH 500 99% Mathematical problem-solving
AIME 94.3% Competition mathematics
AIME 2025 92.7% Competition mathematics (2025)
SciCode 45.7% Scientific computing

Technical

LiveCodeBench 81.9% Live coding evaluation
TerminalBench Hard 37.9% Agentic terminal tasks
τ²-Bench 74.9% Conversational agent benchmark

Content

IFBench 53.7% Instruction following
LCR 68% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does xAI: Grok 4 stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID x-ai/grok-4
Providerx-ai
Release Date July 9, 2025
Context Length256,000 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $3.00 $0.003000
Output $15.00 $0.015000

Leaderboard Categories

Frequently asked questions about xAI: Grok 4

How much does xAI: Grok 4 cost?

xAI: Grok 4 costs $3.00 per million input tokens and $15.00 per million output tokens.

What is the context window of xAI: Grok 4?

xAI: Grok 4 has a context window of 256,000 tokens (256K).

What can xAI: Grok 4 do?

xAI: Grok 4 supports image/vision input, tool use, and function calling.

Who created xAI: Grok 4?

xAI: Grok 4 is developed by xAI and was released on July 9, 2025.