Microsoft: Phi 4
Microsoft's Phi-4 is a compact, low-cost model with surprisingly strong MMLU-Pro and math scores for its size, but its very small 16K context window and limited agentic capability make it unsuitable for most professional business tasks. A reasonable option for lightweight, cost-sensitive deployments.
Assessment date: March 12, 2026
Our methodology takes into account a range of factors, including pricing, functionality, capabilities, benchmark performance, and real-world applicability. Rankings are reviewed and updated regularly as new models are released.
Microsoft Research's Phi-4 is designed for complex reasoning tasks and runs efficiently in memory-constrained or latency-sensitive settings. The 14-billion-parameter model was trained on a mix of high-quality synthetic datasets, data from curated websites, and academic materials. It has been further refined to follow instructions accurately and maintain strong safety standards. It works best with English-language inputs. For more information, see the Phi-4 Technical Report.
Architecture
| Property | Value |
|---|---|
| Modality | Text → Text |
| Tokenizer | Other |
Performance Indices
Source: Artificial Analysis
Benchmark Scores
Evaluations
Benchmark data from Artificial Analysis and Hugging Face
Model Information
Pricing
| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|---|---|---|
| Input | $0.06 | $0.000060 |
| Output | $0.14 | $0.000140 |
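To make the per-token rates concrete, the following is a minimal sketch of a per-request cost estimate using only the two prices in the table above. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider's API, and actual billing may differ by provider.

```python
from decimal import Decimal

# USD prices per 1M tokens, copied from the pricing table above.
INPUT_PRICE_PER_M = Decimal("0.06")
OUTPUT_PRICE_PER_M = Decimal("0.14")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates.

    Hypothetical helper for illustration; it only scales the table prices.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed example request: 12,000 prompt tokens, 1,500 completion tokens.
    print(estimate_cost(12_000, 1_500))
```

`Decimal` is used instead of `float` so that very small per-request amounts add up without binary rounding drift when summed over many requests.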
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: March 13, 2026 7:52 pm