gpt-oss-120b

OpenAItext

OpenAI's open-weights model (Apache 2.0): a 117B MoE with 5.1B active that runs on a single 80GB GPU via native MXFP4, with configurable reasoning effort, full chain-of-thought access and native tool use. Distinct from the GPT-5.x API line — this one you can self-host and fine-tune commercially. Hosted inference is among the cheapest anywhere (prices vary by provider).

gpt-oss-120b strengths

  • OpenAI open weights (Apache 2.0)
  • Runs on a single 80GB GPU
  • Configurable reasoning + full CoT
  • Native tool use
  • Extremely cheap hosted inference

Pricing & context

Context window131K tokens
Input price /1M≈$0.03
Output price /1M≈$0.15
Modalitiestext

Cost guide: a typical call of about 10K input + 2K output tokens costs roughly $0.001 at list prices. Worth modelling against cheaper tiers before committing high-volume traffic.

When to choose gpt-oss-120b

gpt-oss-120b is best for cost-sensitive high-volume workloads, on-prem or privacy-constrained deployments, and custom fine-tunes. If your workload is more cost-sensitive, weigh it against Tencent Hy3 ($0.063 input /1M) first.

gpt-oss-120b FAQ

How much does gpt-oss-120b cost?

gpt-oss-120b is priced at ≈$0.03 per 1M input tokens and ≈$0.15 per 1M output tokens (public API list price), with a 131K tokens context window. A typical call of about 10K input and 2K output tokens costs roughly $0.001.

What is gpt-oss-120b best for?

gpt-oss-120b by OpenAI is best for cost-sensitive high-volume workloads, on-prem or privacy-constrained deployments, and custom fine-tunes.

How does gpt-oss-120b pricing compare to Mistral Large 3?

gpt-oss-120b input costs ≈$0.03 per 1M tokens versus $0.50 for Mistral Large 3, roughly 16.7x less expensive on input. Output is ≈$0.15 vs $1.50.

Is gpt-oss-120b multimodal?

gpt-oss-120b supports text.

Tools that use gpt-oss-120b

Other models

All models →
01Claude Fable 5Anthropic$10.00
02GPT-5.5OpenAI$5.00
03Claude Opus 4.8Anthropic$5.00
04Gemini 3.1 ProGoogle$2.00 (under 200K; $4.00 above)