OpenAI: GPT-4o Audio
Analysis Summary
GPT-4o Audio is a specialised multimodal variant from OpenAI, released August 2025, with native audio-in and audio-out capability. Its intelligence index of 12.8 and coding index of 13.1 place it well below the current general-purpose tier, and its math index of 6 confirms it is not designed for analytical or reasoning-heavy tasks.
The model's value is in its modality: for businesses building voice interfaces, call centre automation, real-time transcription pipelines, or audio-driven customer experiences, it fills a gap that text-only models cannot. Tool use and function calling are supported, enabling integration into broader agentic systems where audio is a first-class input.
At $2.50 input and $10 output per million tokens, it is priced at the premium end relative to its general reasoning capability. Teams should adopt it specifically for audio-native use cases rather than as a general-purpose model. For any workflow that does not require audio I/O, a cheaper and more capable text model will outperform it on every dimension.
Assessed June 30, 2026
Editorial notes
GPT-4o Audio from OpenAI adds native audio input and output to a mid-tier reasoning base, suited to voice-first applications but limited in coding and general reasoning depth.
Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?
Performance Profile
How OpenAI: GPT-4o Audio compares
OpenAI: GPT-4o Audio ranks #195 of 385 AI models we track for overall intelligence, #110 of 139 for coding. Its 128K-token context window is larger than 43% of the models we list. At $2.50 per million input tokens it is cheaper than 13% of comparable models.
About OpenAI: GPT-4o Audio
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs..
Capabilities
Performance Indices
Source: Artificial Analysis
Benchmark Scores
Intelligence
Technical
Benchmark data from Artificial Analysis and Hugging Face
How does OpenAI: GPT-4o Audio stack up?
Compare side-by-side with other professional models.
Model Information
Pricing
| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|---|---|---|
| Input | $2.50 | $0.002500 |
| Output | $10.00 | $0.010000 |
Leaderboard Categories
External Resources
Explore Related Models
Frequently asked questions about OpenAI: GPT-4o Audio
How much does OpenAI: GPT-4o Audio cost?
OpenAI: GPT-4o Audio costs $2.50 per million input tokens and $10.00 per million output tokens.
What is the context window of OpenAI: GPT-4o Audio?
OpenAI: GPT-4o Audio has a context window of 128,000 tokens (128K).
Is OpenAI: GPT-4o Audio good for coding?
On our coding benchmark index, OpenAI: GPT-4o Audio ranks #110 of 139 models, placing it in the broader range of the field for code generation and debugging.
What can OpenAI: GPT-4o Audio do?
OpenAI: GPT-4o Audio supports tool use and function calling.
Who created OpenAI: GPT-4o Audio?
OpenAI: GPT-4o Audio is developed by OpenAI and was released on August 15, 2025.
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: July 2, 2026 8:38 pm