Hermes 4 – Llama-3.1 405B (Non-reasoning)

Nous Research · Released Aug 27, 2025 · Professional
Intelligence: 82.0 (our score), rank #10 of 576
Speed: 42.3 tokens/sec, rank #238 of 271
Input: $1.00 per 1M tokens, rank #433 of 577
Output: $3.00 per 1M tokens, rank #429 of 577
Context: not reported

Analysis Summary

Hermes 4 – Llama-3.1 405B (Non-reasoning) is Nous Research's large-scale fine-tune of Meta's flagship open-weight base model, targeting general instruction following and content generation. Its MMLU-Pro score of 72.9% and GPQA Diamond score of 53.6% place it in a capable mid-tier bracket, and its LiveCodeBench score of 54.6% shows meaningful coding ability for a non-reasoning variant.

For businesses, this model suits content generation, summarisation, and structured writing tasks where a large, well-tuned model is needed but extended chain-of-thought reasoning is not. Its Agentic Index of 18.2 ranks #192 of 289, so it is not well suited to multi-step tool-use workflows. At $1 input and $3 output per million tokens, it sits at a moderate price point for a 405B-class model.

Teams needing a capable, large non-reasoning model for content pipelines will find it a workable option, though the reasoning variant of the same family offers substantially stronger performance on harder tasks.

Assessed June 6, 2026

Editorial notes

Hermes 4 – Llama-3.1 405B (Non-reasoning) from Nous Research scores 72.9% on MMLU-Pro, 53.6% on GPQA Diamond, and 54.6% on LiveCodeBench, and is priced at $1 input and $3 output per million tokens.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 3.1/10
Technical 2.8/10
Content 3.1/10
Value 6/10

How Hermes 4 – Llama-3.1 405B (Non-reasoning) compares

Hermes 4 – Llama-3.1 405B (Non-reasoning) ranks #199 of 378 AI models we track for overall intelligence, #157 of 315 for coding, and #192 of 289 for agentic tasks. At $1.00 per million input tokens, it is cheaper than 25% of comparable models.
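To put those ranks on a common scale, the sketch below converts each rank into the share of tracked models that sit below it. The function name `share_ranked_below` is a hypothetical helper, and the formula (models below divided by models tracked) is one simple convention assumed here, not necessarily how this page derives its own percentages.

```python
def share_ranked_below(rank: int, total: int) -> float:
    """Fraction of tracked models ranked strictly below `rank` (1 = best)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) / total


# Rank and field size for each category, as quoted in the paragraph above.
rankings = {
    "intelligence": (199, 378),
    "coding": (157, 315),
    "agentic": (192, 289),
}

for category, (rank, total) in rankings.items():
    share = share_ranked_below(rank, total)
    print(f"{category}: ranked above {share:.1%} of tracked models")
```

Under this convention the model lands close to the midpoint for intelligence and coding and lower for agentic tasks, which matches the summary's reading of the scores.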

Performance Indices

Source: Artificial Analysis

17.6 Intelligence Index
18.1 Coding Index
18.2 Agentic Index
15.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond: 53.6% (graduate-level scientific reasoning)
HLE: 4.2% (Humanity's Last Exam)
MMLU Pro: 72.9% (multi-task language understanding)
AIME 2025: 15.3% (competition mathematics, 2025)
SciCode: 34.6% (scientific computing)

Technical

LiveCodeBench: 54.6% (live coding evaluation)
TerminalBench Hard: 9.8% (agentic terminal tasks)
τ²-Bench: 26.6% (conversational agent benchmark)

Content

IFBench: 34.8% (instruction following)
LCR: 20% (long-context reasoning)

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider Nous Research
Release Date August 27, 2025
Status Active

Pricing

Token Type    Cost per 1M tokens    Cost per 1K tokens
Input         $1.00                 $0.001000
Output        $3.00                 $0.003000
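As a rough illustration of how these per-token prices translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and the example token counts are hypothetical, and the calculation assumes simple linear billing with no caching discounts, minimum charges, or provider-specific surcharges.

```python
# Prices from the pricing table above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 3.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request, assuming linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
print(f"${estimate_cost(2_000, 500):.4f}")  # prints "$0.0035"
```

Because output tokens cost three times as much as input tokens, generation-heavy workloads such as long-form writing will be dominated by the output side of the bill.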

Frequently asked questions about Hermes 4 – Llama-3.1 405B (Non-reasoning)

How much does Hermes 4 – Llama-3.1 405B (Non-reasoning) cost?

Hermes 4 – Llama-3.1 405B (Non-reasoning) costs $1.00 per million input tokens and $3.00 per million output tokens.

Is Hermes 4 – Llama-3.1 405B (Non-reasoning) good for coding?

On our coding benchmark index, Hermes 4 – Llama-3.1 405B (Non-reasoning) ranks #157 of 315 models, placing it near the middle of the field for code generation and debugging.

Who created Hermes 4 – Llama-3.1 405B (Non-reasoning)?

Hermes 4 – Llama-3.1 405B (Non-reasoning) is developed by Nous Research and was released on August 27, 2025.