Step 3.7 Flash

StepFun · text · image · video

StepFun's speed champion (May 2026, Apache 2.0 open weights): the fastest measured output on Artificial Analysis (382 tok/s, ~1s to first token). A 198B MoE (~11B active) with native image and video understanding, selectable reasoning levels and $0.04 cached input. Built for high-volume work where speed beats frontier reasoning.

Step 3.7 Flash strengths

  • Fastest measured output (382 tok/s)
  • Native image + video understanding
  • Apache 2.0 open weights
  • Very cheap, $0.04 cached input
  • Selectable reasoning levels

Pricing & context

Context window: 256K tokens
Input price /1M: $0.20
Output price /1M: $1.15
Modalities: text, image, video

Cost guide: a typical call of about 10K input + 2K output tokens costs roughly $0.004 at list prices. Worth modelling against cheaper tiers before committing high-volume traffic.
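The arithmetic behind that estimate can be sketched in a few lines of Python. This is a minimal illustration that assumes the list prices above and ignores the cached-input discount; the function name `call_cost` and the constants are hypothetical, not part of any StepFun SDK.

```python
# List prices taken from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 1.15


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of one call (no cache discount applied)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# The "typical call" from the cost guide: 10K input + 2K output tokens.
typical = call_cost(10_000, 2_000)
print(f"${typical:.4f} per call")
print(f"${typical * 1_000_000:,.0f} per 1M calls")
```

Scaling the per-call figure to your expected monthly call count gives a quick budget estimate to compare against cheaper tiers.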

When to choose Step 3.7 Flash

Step 3.7 Flash is best for high-volume, cost-sensitive agents, search workflows and structured outputs where speed matters most. If cost matters more to your workload than speed, weigh it against gpt-oss-120b (≈$0.03 input /1M) first.

Step 3.7 Flash FAQ

How much does Step 3.7 Flash cost?

Step 3.7 Flash is priced at $0.20 per 1M input tokens and $1.15 per 1M output tokens (public API list price), with a 256K-token context window. A typical call of about 10K input and 2K output tokens costs roughly $0.004.

What is Step 3.7 Flash best for?

Step 3.7 Flash by StepFun is best for high-volume, cost-sensitive agents, search workflows and structured outputs where speed matters most.

How does Step 3.7 Flash pricing compare to gpt-oss-120b?

Step 3.7 Flash input costs $0.20 per 1M tokens versus ≈$0.03 for gpt-oss-120b, roughly 6.7x as much. Output costs $1.15 versus ≈$0.15 per 1M tokens, roughly 7.7x as much.
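To see how those ratios play out on a whole call rather than per token type, here is a small Python sketch. It assumes the Step 3.7 Flash list prices from this page and the approximate gpt-oss-120b figures quoted above; the `PRICES` table and `call_cost` helper are illustrative names, not a real API.

```python
# USD per 1M tokens. Step 3.7 Flash values are list prices from this page;
# gpt-oss-120b values are the approximate figures quoted above.
PRICES = {
    "Step 3.7 Flash": {"input": 0.20, "output": 1.15},
    "gpt-oss-120b": {"input": 0.03, "output": 0.15},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call for the given model, at the prices above."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


step, oss = PRICES["Step 3.7 Flash"], PRICES["gpt-oss-120b"]
input_ratio = step["input"] / oss["input"]
output_ratio = step["output"] / oss["output"]

# Ratio for the page's typical call: 10K input + 2K output tokens.
blended_ratio = call_cost("Step 3.7 Flash", 10_000, 2_000) / call_cost(
    "gpt-oss-120b", 10_000, 2_000
)

print(f"input: {input_ratio:.1f}x, output: {output_ratio:.1f}x, "
      f"typical call: {blended_ratio:.1f}x")
```

The blended ratio depends on your own input-to-output mix, so it is worth rerunning with token counts from real traffic.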

Is Step 3.7 Flash multimodal?

Yes. Step 3.7 Flash supports text, image and video.

Other models

01 · Claude Fable 5 · Anthropic · $10.00
02 · GPT-5.5 · OpenAI · $5.00
03 · Claude Opus 4.8 · Anthropic · $5.00
04 · Gemini 3.1 Pro · Google · $2.00 (under 200K; $4.00 above)