OpenAI: GPT-4o Audio

OpenAI: GPT-4o Audio

openai · Released Aug 15, 2025
32
Our Score

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.

$2.50 / 1M Input Price
$10.00 / 1M Output Price
128,000 tokens Context Window
16,384 tokens Max Output

Capabilities

Tool Use Function Calling

Architecture

ModalityText + Audio → Text + Audio
TokenizerGPT

Performance Indices

Source: Artificial Analysis

12.8 Intelligence Index
13.1 Coding Index
6 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 54.3%
Graduate-level scientific reasoning
HLE 3.3%
Humanity's Last Exam
MMLU Pro 74.8%
Multi-task language understanding
LiveCodeBench 30.9%
Live coding evaluation
SciCode 33.3%
Scientific computing
MATH 500 75.9%
Mathematical problem-solving
AIME 15%
Competition mathematics

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID openai/gpt-4o-audio-preview
Provideropenai
Model FamilyGPT-4o
Release Date August 15, 2025
Context Length128,000 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $2.50 $0.002500
Output $10.00 $0.010000

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

644ms
Best Latency (TTFT)
32 tok/s
Best Throughput
0/1
Active Endpoints
Available via: OpenAI