OpenAI: GPT-5.1-Codex

OpenAI: GPT-5.1-Codex

openai · Released Nov 13, 2025 Professional
76.5
Our Score

Performance Profile

Intelligence7.7Technical7.3Value6.5Content7.5
Intelligence 7.7/10
Technical 7.3/10
Content 7.5/10
Value 6.5/10

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the reasoning.effort parameter. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

$1.25 / 1M
Input Price
$10.00 / 1M
Output Price
400,000 tokens
Context Window
128,000 tokens
Max Output

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image → Text
TokenizerGPT

Performance Indices

Source: Artificial Analysis

43.1 Intelligence Index
36.6 Coding Index
58.9 Agentic Index
95.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 86% Graduate-level scientific reasoning
HLE 23.4% Humanity's Last Exam
MMLU Pro 86% Multi-task language understanding
AIME 2025 95.7% Competition mathematics (2025)
SciCode 40.2% Scientific computing

Technical

LiveCodeBench 84.9% Live coding evaluation
TerminalBench Hard 34.8% Agentic terminal tasks
τ²-Bench 83% Conversational agent benchmark

Content

IFBench 70% Instruction following
LCR 67.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID openai/gpt-5.1-codex
Provideropenai
Release Date November 13, 2025
Context Length400,000 tokens
Max Completion128,000 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $1.25 $0.001250
Output $10.00 $0.010000

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

98.8%
Avg Uptime
1,504ms
Best Latency (TTFT)
47 tok/s
Best Throughput
1/2
Active Endpoints
Available via: OpenAI, Azure

Leaderboard Categories