Gemini 3.5 Flash
Launched at Google I/O 2026 (May 19). Fast multimodal model balancing speed and capability.
Gemini 3.5 Flash strengths
- Fast multimodal processing
- 1M context
- Good value for agentic use
- Native audio/video understanding
Pricing & context
| Context window | 1M tokens |
| Input price /1M | $1.50 |
| Output price /1M | $9.00 |
| Modalities | text, image, audio, video |
Cost guide: a typical call of ~10K input + 2K output tokens runs roughly $1.50 × 0.01 + $9.00 × 0.002 — worth modelling against cheaper tiers before committing high-volume traffic.
When to choose Gemini 3.5 Flash
Gemini 3.5 Flash is best for High-volume multimodal apps and agents needing speed with large context. If your workload is more cost-sensitive, weigh it against Llama 4 Scout (~$0.08 (varies by host) input /1M) first.
Gemini 3.5 Flash FAQ
How much does Gemini 3.5 Flash cost?
Gemini 3.5 Flash is priced at $1.50 per 1M input tokens and $9.00 per 1M output tokens (public API list price), with a 1M tokens context window.
What is Gemini 3.5 Flash best for?
Gemini 3.5 Flash by Google is best for High-volume multimodal apps and agents needing speed with large context.
Is Gemini 3.5 Flash multimodal?
Gemini 3.5 Flash supports text, image, audio, video.
Tools that use Gemini 3.5 Flash
Other models
All models →| 01 | GPT-5.5 | OpenAI | $5.00 | → |
| 02 | GPT-5.4 | OpenAI | $2.50 | → |
| 03 | GPT-5.4 mini | OpenAI | $0.75 | → |
| 04 | Claude Opus 4.8 | Anthropic | $5.00 | → |