Home > AI Models > OpenAI: GPT-4o Audio

OpenAI: GPT-4o Audio

Name: OpenAI: GPT-4o Audio Review
Item: OpenAI: GPT-4o Audio
Author: Design for Online Editorial

NEWKimi K3in at #9 NEWKAT-Coder-Air V2.5in at #560 NEWKAT-Coder-Pro V2.5in at #568 NEWMuse Spark 1.1in at #392 NEWUncensoredin at #487 NEWGPT-5.6 Terrain at #11 NEWGPT-5.6 Sol Proin at #416 NEWGPT-5.6 Solin at #2

OpenAI: GPT-4o Audio

openai · Released Aug 15, 2025

Intelligence #9 / 612

82.0 our score

Speed #270 / 287

32.9 tok/s

Input Price #531 / 612

$2.50 per 1M tokens

Output Price #518 / 612

$10.00 per 1M tokens

Context #360 / 612

128,000 tokens

GPT-4o Audio extends OpenAI's GPT-4o line with native audio input and output, useful for voice-based applications, but its reasoning and coding benchmarks are weak compared to current-generation models.

Businesses building voice assistants or audio transcription-plus-response workflows may find value in its multimodal audio handling, but it should not be relied on for complex analysis, coding, or agentic tasks.

Pricing is high relative to its measured capability, so it is best reserved for audio-specific use cases rather than general-purpose deployment.

Assessed July 10, 2026

Editorial notes

GPT-4o Audio adds speech input and output from OpenAI but shows weak reasoning and coding scores relative to current models, at a premium price.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

DFO Verdict

GPT-4o Audio adds speech input and output from OpenAI but shows weak reasoning and coding scores relative to current models, at a premium price.

#9 of 612 overall

Benchmark scores

GPQA Diamond 54.3%

HLE 3.3%

MMLU Pro 74.8%

MATH 500 75.9%

AIME 15%

SciCode 33.3%

LiveCodeBench 30.9%

Magenta = intelligence · Ink = technical/agentic · Cyan = content & long-context · Grey = community benchmarks. Data: Artificial Analysis, Hugging Face.

12.8 Intelligence Index·13.1 Coding Index·6 Math Index

How OpenAI: GPT-4o Audio compares

OpenAI: GPT-4o Audio ranks #201 of 393 AI models we track for overall intelligence, #121 of 157 for coding. Its 128K-token context window is larger than 41% of the models we list. At $2.50 per million input tokens it is cheaper than 13% of comparable models.

Position in the field

Intelligence: smarter than 99% of models #9

Speed: faster than 6% of models #270

Price: cheaper than 13% of models #531

Context: larger than 41% of models #360

worst in fieldmedianbest in field

Price vs frontier peers · $ per 1M tokens

OpenAI: GPT-4o Audio $2.50 in $10.00 out

Anthropic: Claude Fable 5 $10.00 in $50.00 out

Anthropic: Claude Opus 4.8 $5.00 in $25.00 out

Google: Gemini 3.1 Pro Preview $2.00 in $12.00 out

Dark bar = input · light bar = output, scaled to the priciest peer.

Context window vs peers · tokens

Google: Gemini 3.1 Pro Preview 1M

Anthropic: Claude Fable 5 1M

Anthropic: Claude Opus 4.8 1M

OpenAI: GPT-4o Audio 128K

1M tokens ≈ 8 full-length novels or ~2,500 pages of business documents in a single request.

Performance profile

Strongest on value. The pulled-in content corner is the trade-off, and if the shape matters more than the price, this is your model.

Compare shapes side-by-side →

Pricing

Token Type	Cost per 1M tokens	Cost per 1K tokens
Input	$2.50	$0.002500
Output	$10.00	$0.010000

What would OpenAI: GPT-4o Audio cost your business?

Pick the job that looks most like yours, then fine-tune with the sliders. Estimates update live.

A website chatbot handling around 100 customer conversations a day, a few short messages each.

Requests per month 3,000

One request is one message, email, draft or automation call.

Size of each request 1,200 tokens

$0/mo OpenAI: GPT-4o Audio

$0/mo Anthropic: Claude Fable 5

$0/mo Z.ai: GLM 5.2 · best value

Full calculator with 612 models → Price Calculator

DFO AI AUTOMATION

These numbers get smaller with the right architecture.

We route routine calls to cheap models and save OpenAI: GPT-4o Audio for the hard ones. Most clients cut their estimate by 60-80%.

Talk to our team

About OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs..

Embed this ranking

Writing about this model? Add the badge to your site. It always shows the current rank and score, and links back to this page.

<a href="https://designforonline.com/ai-models/openai-gpt-4o-audio/"><img src="https://designforonline.com/?aiml_badge=openai-gpt-4o-audio&theme=dark" alt="OpenAI: GPT-4o Audio, ranked #9 on the Design for Online AI Leaderboard" width="400" height="76"></a>

<a href="https://designforonline.com/ai-models/openai-gpt-4o-audio/"><img src="https://designforonline.com/?aiml_badge=openai-gpt-4o-audio&theme=light" alt="OpenAI: GPT-4o Audio, ranked #9 on the Design for Online AI Leaderboard" width="400" height="76"></a>

Frequently asked questions about OpenAI: GPT-4o Audio

How much does OpenAI: GPT-4o Audio cost?

OpenAI: GPT-4o Audio costs $2.50 per million input tokens and $10.00 per million output tokens.

What is the context window of OpenAI: GPT-4o Audio?

OpenAI: GPT-4o Audio has a context window of 128,000 tokens (128K).

Is OpenAI: GPT-4o Audio good for coding?

On our coding benchmark index, OpenAI: GPT-4o Audio ranks #121 of 157 models, placing it in the broader range of the field for code generation and debugging.

What can OpenAI: GPT-4o Audio do?

OpenAI: GPT-4o Audio supports tool use and function calling.

Who created OpenAI: GPT-4o Audio?

OpenAI: GPT-4o Audio is developed by OpenAI and was released on August 15, 2025.

Performance profile

Intelligence 2.6

Technical 2.2

Content 0

Value 6.3

Reasoning: No
Input
Output
Context: 128,000 tokens
Max output: 16,384 tokens
Tokenizer: GPT
Released: Aug 15, 2025

Modality data from OpenRouter; may understate provider-native audio/video/image output.

Model information

Provider openai

Model family GPT-4o

OpenRouter ID openai/gpt-4o-audio-preview

Status Active

Capabilities

Tool Use Function Calling

Ranked in

Tool Use

External resources View on OpenRouter API access, playground & provider details API Quickstart Sample code and integration guide

Data sourced from the OpenRouter API, Artificial Analysis, the Hugging Face Open LLM Leaderboard and our own internal testing. Scores are editorially curated by our team.

Last updated: July 19, 2026 8:38 pm

Issues with our rankings? Contact us

OpenAI: GPT-4o Audio

DFO Verdict

Benchmark scores

How OpenAI: GPT-4o Audio compares

Pricing

What would OpenAI: GPT-4o Audio cost your business?

About OpenAI: GPT-4o Audio

Explore Related Models

Embed this ranking

Frequently asked questions about OpenAI: GPT-4o Audio

How much does OpenAI: GPT-4o Audio cost?

What is the context window of OpenAI: GPT-4o Audio?

Is OpenAI: GPT-4o Audio good for coding?

What can OpenAI: GPT-4o Audio do?

Who created OpenAI: GPT-4o Audio?