OpenAI: GPT-4o Audio

OpenAI: GPT-4o Audio

openai · Released Aug 15, 2025 Professional
Intelligence #14 / 590
82.0 Our Score
Speed #259 / 279
32.9 tokens / sec
Input #518 / 592
$2.50 per 1M tokens
Output #504 / 592
$10.00 per 1M tokens
Context #340 / 592
128,000 tokens

Analysis Summary

GPT-4o Audio is a specialised multimodal variant from OpenAI, released August 2025, with native audio-in and audio-out capability. Its intelligence index of 12.8 and coding index of 13.1 place it well below the current general-purpose tier, and its math index of 6 confirms it is not designed for analytical or reasoning-heavy tasks.

The model's value is in its modality: for businesses building voice interfaces, call centre automation, real-time transcription pipelines, or audio-driven customer experiences, it fills a gap that text-only models cannot. Tool use and function calling are supported, enabling integration into broader agentic systems where audio is a first-class input.

At $2.50 input and $10 output per million tokens, it is priced at the premium end relative to its general reasoning capability. Teams should adopt it specifically for audio-native use cases rather than as a general-purpose model. For any workflow that does not require audio I/O, a cheaper and more capable text model will outperform it on every dimension.

Assessed June 30, 2026

Editorial notes

GPT-4o Audio from OpenAI adds native audio input and output to a mid-tier reasoning base, suited to voice-first applications but limited in coding and general reasoning depth.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence2.6Technical2.5Value6.3Content0
Intelligence 2.6/10
Technical 2.5/10
Content 0/10
Value 6.3/10

How OpenAI: GPT-4o Audio compares

OpenAI: GPT-4o Audio ranks #195 of 385 AI models we track for overall intelligence, #110 of 139 for coding. Its 128K-token context window is larger than 43% of the models we list. At $2.50 per million input tokens it is cheaper than 13% of comparable models.

About OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs..

Capabilities

Tool Use Function Calling

Performance Indices

Source: Artificial Analysis

12.8 Intelligence Index
13.1 Coding Index
6 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 54.3% Graduate-level scientific reasoning
HLE 3.3% Humanity's Last Exam
MMLU Pro 74.8% Multi-task language understanding
MATH 500 75.9% Mathematical problem-solving
AIME 15% Competition mathematics
SciCode 33.3% Scientific computing

Technical

LiveCodeBench 30.9% Live coding evaluation

Benchmark data from Artificial Analysis and Hugging Face

How does OpenAI: GPT-4o Audio stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID openai/gpt-4o-audio-preview
Provideropenai
Model FamilyGPT-4o
Release Date August 15, 2025
Context Length128,000 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $2.50 $0.002500
Output $10.00 $0.010000

Leaderboard Categories

Frequently asked questions about OpenAI: GPT-4o Audio

How much does OpenAI: GPT-4o Audio cost?

OpenAI: GPT-4o Audio costs $2.50 per million input tokens and $10.00 per million output tokens.

What is the context window of OpenAI: GPT-4o Audio?

OpenAI: GPT-4o Audio has a context window of 128,000 tokens (128K).

Is OpenAI: GPT-4o Audio good for coding?

On our coding benchmark index, OpenAI: GPT-4o Audio ranks #110 of 139 models, placing it in the broader range of the field for code generation and debugging.

What can OpenAI: GPT-4o Audio do?

OpenAI: GPT-4o Audio supports tool use and function calling.

Who created OpenAI: GPT-4o Audio?

OpenAI: GPT-4o Audio is developed by OpenAI and was released on August 15, 2025.