Step3 VL 10B
Analysis Summary
Step3 VL 10B sits in the Emerging tier on our leaderboard, ranked #443 of 525 published models on overall intelligence. At a listed price of $0.000 input and $0.000 output per 1M tokens, it is among the least expensive on the market.
Editorial notes
Step3 VL 10B from StepFun is a compact vision-language model with limited general reasoning and agentic capability, best suited to lightweight multimodal inference tasks.
Assessed April 26, 2026
Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.
Performance Profile
Performance Indices
Source: Artificial Analysis
Benchmark Scores (categories: Intelligence, Technical, Content)
Benchmark data from Artificial Analysis and Hugging Face
Model Information
| Field | Value |
| --- | --- |
| Provider | StepFun |
| Release Date | January 20, 2026 |
| Status | Active |
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: April 25, 2026 8:38 pm