StepFun: Step 3.7 Flash
step-3.7-flash
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...
Modalities
Input
text, image, video
Output
text
Pricing
Cost per 1 million tokens
Input
$0.24
Output
$1.38
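As a rough illustration of how the per-million-token rates above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` and the constant names are hypothetical, and the sketch assumes simple linear billing at the listed rates, with no caching discounts, tiering, or separate charges for image or video input (none of which this listing confirms or rules out).

```python
# Rates copied from the pricing listing (USD per 1 million tokens).
INPUT_USD_PER_MILLION = 0.24
OUTPUT_USD_PER_MILLION = 1.38


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming linear per-token billing (an assumption)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 10,000-token reply.
    print(f"{estimate_cost_usd(100_000, 10_000):.4f}")  # prints 0.0378
```

Under these assumptions, output tokens cost 5.75 times as much as input tokens, so long generations tend to dominate the bill more than long prompts do.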
Model Specs
Context Window
256,000 tokens
Max Output
256,000 tokens
Release Date
2026-05-29
Knowledge Cutoff
2026-01-01
Capabilities
Reasoning
Tool Calling
Vision
Last Updated: 2026-05-29
Provider: