xAI: Grok 4

xAI: Grok 4

x-ai · Released Jul 9, 2025 Professional
Intelligence #10 / 576
82.0 Our Score
Speed #240 / 271
41.6 tokens / sec
Input #517 / 577
$3.00 per 1M tokens
Output #525 / 577
$15.00 per 1M tokens
Context #170 / 577
256,000 tokens

Analysis Summary

Grok 4 is xAI's flagship model, with an intelligence index of 41.5 and a coding index of 40.5, placing it in the very strong tier of the current field. Its agentic index of 56.4 and terminalbench score of 0.379 are among the higher values in the database, and math performance at 92.7 is world-class. Vision, tool use, and function calling are all supported, with a 256K context window.

For businesses, Grok 4 is well suited to complex coding tasks, autonomous agent workflows, long-document analysis, and multimodal use cases. Its livecodebench score of 0.819 and tau2 of 0.749 confirm strong real-world coding and agentic reliability. Instruction following and long-context performance are both above average.

At $3 input and $15 output per million tokens, it is priced at the premium tier. Teams running high-value engineering, research, or agentic workflows where accuracy and depth justify the cost will find Grok 4 a capable and well-rounded choice from an established provider.

Assessed June 6, 2026

Editorial notes

Grok 4 from xAI delivers very strong reasoning, coding, and agentic performance with vision support and a 256K context.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence6.3Technical6.9Value6Content7
Intelligence 6.3/10
Technical 6.9/10
Content 7/10
Value 6/10

How xAI: Grok 4 compares

XAI: Grok 4 ranks #55 of 378 AI models we track for overall intelligence, #36 of 315 for coding, #63 of 289 for agentic tasks. Its 256K-token context window is larger than 71% of the models we list. At $3.00 per million input tokens it is cheaper than 10% of comparable models.

About xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not..

Capabilities

Tool Use Function Calling Vision

Performance Indices

Source: Artificial Analysis

41.5 Intelligence Index
40.5 Coding Index
56.4 Agentic Index
92.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 87.7% Graduate-level scientific reasoning
HLE 23.9% Humanity's Last Exam
MMLU Pro 86.6% Multi-task language understanding
MATH 500 99% Mathematical problem-solving
AIME 94.3% Competition mathematics
AIME 2025 92.7% Competition mathematics (2025)
SciCode 45.7% Scientific computing

Technical

LiveCodeBench 81.9% Live coding evaluation
TerminalBench Hard 37.9% Agentic terminal tasks
τ²-Bench 74.9% Conversational agent benchmark

Content

IFBench 53.7% Instruction following
LCR 68% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does xAI: Grok 4 stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID x-ai/grok-4
Providerx-ai
Release Date July 9, 2025
Context Length256,000 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $3.00 $0.003000
Output $15.00 $0.015000

Leaderboard Categories

Frequently asked questions about xAI: Grok 4

How much does xAI: Grok 4 cost?

xAI: Grok 4 costs $3.00 per million input tokens and $15.00 per million output tokens.

What is the context window of xAI: Grok 4?

xAI: Grok 4 has a context window of 256,000 tokens (256K).

Is xAI: Grok 4 good for coding?

On our coding benchmark index, xAI: Grok 4 ranks #36 of 315 models, placing it in the top quartile of the field for code generation and debugging.

What can xAI: Grok 4 do?

xAI: Grok 4 supports image/vision input, tool use, and function calling.

Who created xAI: Grok 4?

xAI: Grok 4 is developed by xAI and was released on July 9, 2025.