Verify critical details — pricing, licensing, availability — with the model's source before business decisions. Full methodology →

Models · OpenAI

gpt-oss-120b

Model family: gpt-oss

Size

large (117.0B params)

Context

131,072 tokens

Released

2025-08-04

Openness

open-weight

License

Apache License 2.0 (+ gpt-oss usage policy) · commercial: yes

Cost tier

mixed

Rating

4.0 ★ — A genuinely strong open reasoning model — near o4-mini quality, single-GPU, clean Apache 2.0, full chain-of-thought and tool use. Held to 4.0 by being text-only and needing an 80GB GPU rather than consumer hardware.

Modalities

text

Capabilities

chat, coding, function-calling, instruction-following, long-context, math, reasoning, tool-use

Access

api-first-party, api-third-party, local-runtime-ollama, local-runtime-vllm, weights-download-hf

llm
open-weight
commercial-friendly
large
reasoning
coding
self-hostable
us-based
apache-2-0
mixture-of-experts

Quick Take

OpenAI's open comeback: an Apache 2.0 reasoning model that nears o4-mini quality, runs on a single 80GB GPU, and you can download, fine-tune, and self-host freely.

Plain-English Description

gpt-oss-120b, released in August 2025, was a notable moment — OpenAI's first open-weight model since GPT-2, and a real one. Released under the permissive Apache 2.0 license, it brings capability that used to be API-only into weights you can download, fine-tune, and run on your own hardware. OpenAI positions it as near-parity with its o4-mini reasoning model on core benchmarks.

It's a mixture-of-experts model: 117 billion parameters total, but only about 5.1 billion active per token, which is how it manages to fit on a single 80GB datacenter GPU. It's built for reasoning and agentic work — adjustable reasoning effort (low/medium/high), full chain-of-thought you can inspect, and native tool use including function calling, web browsing, and Python execution. One limitation to note: it's text-only, with no image or audio input.

For a business that wants strong, self-hostable reasoning with full data control and a clean license, this is one of the more credible options — and it carries the weight of OpenAI's name, which matters to some buyers evaluating open models.

Best For

Self-hosted reasoning and agentic workloads where data must stay in-house.
Organizations that want OpenAI-lineage capability they can own and fine-tune.
Single-GPU (80GB) deployments needing near-o4-mini reasoning at no per-token cost.
Building agents with tool use, code execution, and inspectable chain-of-thought.

Not For

Laptop or consumer-GPU deployment — it needs an 80GB card; use gpt-oss-20b.
Multimodal tasks — it's text-only.
Teams that want the absolute frontier — the closed GPT-5.5 goes higher.
Buyers who'd rather not manage inference infrastructure at all.

License — Plain-English Summary

Apache 2.0 — unrestricted commercial use, modification, fine-tuning, and redistribution, no royalties or user-count carve-outs; keep the notices and flag significant changes. OpenAI attaches a short "gpt-oss usage policy" describing acceptable use, which doesn't restrict commercial deployment but is worth reading. Self-hosted, the model keeps all data in-house. As open licenses go, this is among the cleanest — on par with Gemma 4 and the Apache-licensed Qwen models.

How It Compares

Against gpt-oss-20b, the 120b is far more capable but needs a datacenter GPU rather than a laptop. Against the closed GPT-5.4, gpt-oss-120b is the self-hostable option — less peak capability, full ownership and no per-token cost. Against other open flagships like Gemma 4 31B and the Apache-licensed Qwen models, gpt-oss-120b competes on reasoning and agentic tooling under an equally clean license, though those rivals add multimodality that gpt-oss lacks.

Cost

Self-hosted cost: $0.00 beyond compute
Notes: Free to self-host under Apache 2.0; also served by OpenAI and third parties per-token. Reasoning effort is adjustable (low / medium / high).

Hardware requirements

Min VRAM: 80 GB
Recommended VRAM: 80 GB
Runs on laptop: No
Notes: Designed to fit a single 80GB datacenter GPU (H100 / MI300X) thanks to MXFP4 quantization of the expert weights. Not a laptop model — use gpt-oss-20b for that.

Comparable models

Commercial-use conditions

Apache 2.0 permits unrestricted commercial use, modification, fine-tuning, and redistribution. OpenAI also attaches a short "gpt-oss usage policy" covering acceptable use; it doesn't restrict commercial deployment but is worth a read.