DeepSeek V4 Flash

DeepSeektext

Cheaper, faster V4 tier with a ~98% cache-hit discount; extremely economical for repeated prompts.

DeepSeek V4 Flash strengths

  • Among the cheapest capable models
  • Massive cache savings
  • 1M context
  • Good general performance

Pricing & context

Context window1M tokens (384K max output)
Input price /1M$0.14 ($0.0028 cache hit)
Output price /1M$0.28
Modalitiestext

Cost guide: a typical call of ~10K input + 2K output tokens runs roughly $0.14 ($0.0028 cache hit) × 0.01 + $0.28 × 0.002 — worth modelling against cheaper tiers before committing high-volume traffic.

When to choose DeepSeek V4 Flash

DeepSeek V4 Flash is best for High-volume, repetitive workloads and budget pipelines needing long context. If your workload is more cost-sensitive, weigh it against Llama 4 Scout (~$0.08 (varies by host) input /1M) first.

DeepSeek V4 Flash FAQ

How much does DeepSeek V4 Flash cost?

DeepSeek V4 Flash is priced at $0.14 ($0.0028 cache hit) per 1M input tokens and $0.28 per 1M output tokens (public API list price), with a 1M tokens (384K max output) context window.

What is DeepSeek V4 Flash best for?

DeepSeek V4 Flash by DeepSeek is best for High-volume, repetitive workloads and budget pipelines needing long context.

Is DeepSeek V4 Flash multimodal?

DeepSeek V4 Flash supports text.

Tools that use DeepSeek V4 Flash

Other models

All models →
01GPT-5.5OpenAI$5.00
02GPT-5.4OpenAI$2.50
03GPT-5.4 miniOpenAI$0.75
04Claude Opus 4.8Anthropic$5.00