Step 3.7 Flash: Pricing & Context Window (2026)

StepFun's speed champion (May 2026, Apache 2.0 open weights): the fastest measured output on Artificial Analysis (382 tok/s, ~1s to first token). A 198B MoE (~11B active) with native image and video understanding, selectable reasoning levels and $0.04 cached input. Built for high-volume work where speed beats frontier reasoning.

Step 3.7 Flash strengths

Fastest measured output (382 tok/s)
Native image + video understanding
Apache 2.0 open weights
Very cheap, $0.04 cached input
Selectable reasoning levels

Pricing & context

Context window	256K tokens
Input price /1M	$0.20
Output price /1M	$1.15
Modalities	text, image, video

Cost guide: a typical call of about 10K input + 2K output tokens costs roughly $0.004 at list prices (about $0.0020 for input plus $0.0023 for output). It is worth modelling this against cheaper tiers before committing high-volume traffic.
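The arithmetic behind the cost guide can be reproduced with a short sketch. It uses only the list prices from the table above; the function name `call_cost` and the 1M-calls projection are illustrative assumptions, and cached-input discounts are ignored.

```python
# List prices from the pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 1.15


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of a single call (no caching)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# The "typical call" from the cost guide: 10K input + 2K output tokens.
per_call = call_cost(10_000, 2_000)
print(f"${per_call:.4f} per call")                     # $0.0043 per call
print(f"${per_call * 1_000_000:,.0f} per 1M calls")    # $4,300 per 1M calls
```

At this token mix, output accounts for slightly more than half of the bill even though it is only a sixth of the tokens, so output length is the first thing to measure when estimating high-volume spend.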

When to choose Step 3.7 Flash

Step 3.7 Flash is best for high-volume, cost-sensitive agents, search workflows and structured outputs where speed matters most. If price matters more to your workload than speed, weigh it against gpt-oss-120b (≈$0.03 input /1M) first.

Step 3.7 Flash FAQ

How much does Step 3.7 Flash cost?

Step 3.7 Flash is priced at $0.20 per 1M input tokens and $1.15 per 1M output tokens (public API list price), with a 256K-token context window. A typical call of about 10K input and 2K output tokens costs roughly $0.004.

What is Step 3.7 Flash best for?

Step 3.7 Flash by StepFun is best for high-volume, cost-sensitive agents, search workflows and structured outputs where speed matters most.

How does Step 3.7 Flash pricing compare to gpt-oss-120b?

Step 3.7 Flash input costs $0.20 per 1M tokens versus ≈$0.03 for gpt-oss-120b, making it roughly 6.7x as expensive on input. Output is $1.15 versus ≈$0.15 per 1M tokens, roughly 7.7x as expensive.
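As a rough check on these ratios, the sketch below compares the two models on input, on output and on the 10K input + 2K output call used elsewhere on this page. The gpt-oss-120b figures are the approximate prices quoted above, and the `PRICES` table and `call_cost` helper are illustrative names, not part of any API.

```python
# USD per 1M tokens. The gpt-oss-120b values are the approximate figures
# quoted in the comparison above, not exact list prices.
PRICES = {
    "Step 3.7 Flash": {"input": 0.20, "output": 1.15},
    "gpt-oss-120b": {"input": 0.03, "output": 0.15},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one call for the given model."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


step = PRICES["Step 3.7 Flash"]
oss = PRICES["gpt-oss-120b"]

input_ratio = step["input"] / oss["input"]
output_ratio = step["output"] / oss["output"]
call_ratio = call_cost("Step 3.7 Flash", 10_000, 2_000) / call_cost("gpt-oss-120b", 10_000, 2_000)

print(f"input:  {input_ratio:.1f}x")    # input:  6.7x
print(f"output: {output_ratio:.1f}x")   # output: 7.7x
print(f"10K in + 2K out call: {call_ratio:.1f}x")  # 10K in + 2K out call: 7.2x
```

The per-call ratio depends on the input/output mix, so it is worth rerunning with your own token counts rather than relying on the headline input ratio alone.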

Is Step 3.7 Flash multimodal?

Yes. Step 3.7 Flash supports text, image and video, with native image and video understanding.
