What o3 actually is (and why it changed the reasoning-model menu)
o3 is OpenAI's third generation of reasoning model in the post-GPT-4 era: the successor to o1 (Sep 2024) and o3-mini (Jan 2025), preceded by the experimental o3-preview that OpenAI quietly retired ahead of o3 GA. The release brought two things that o1 lacked: production-grade pricing ($2/$8 per million input/output tokens, versus roughly $15/$60 for o1) and a full feature surface (streaming, prompt caching, and the standard chat completions endpoint).
Functionally, o3 is a reasoning model in the same lineage as o1 and o1-pro: it produces a long internal chain of thought before writing the visible answer. Reasoning tokens are billed at the output rate ($8/M on o3) and are not returned to you, so you pay for text you never see. The key behavioral difference from non-reasoning models (GPT-5, Claude, Gemini) is that you don't scaffold the chain of thought yourself. Prompt o3 with the bare problem statement; the model handles the reasoning structure internally.
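To make the billing rule concrete, here is a minimal sketch of a per-call cost estimate. It uses the $2/M input and $8/M output rates quoted above; the function name, its parameters, and the token counts in the example are hypothetical, chosen only for illustration.

```python
# Per-million-token rates for o3 as quoted in this article (USD).
O3_INPUT_RATE_PER_M = 2.00
O3_OUTPUT_RATE_PER_M = 8.00


def estimate_o3_cost(input_tokens: int,
                     visible_output_tokens: int,
                     reasoning_tokens: int) -> float:
    """Estimate the cost of one o3 call in USD.

    Hypothetical helper: reasoning tokens are added to the visible
    output tokens because both are billed at the output rate.
    """
    billed_output_tokens = visible_output_tokens + reasoning_tokens
    input_cost = input_tokens * O3_INPUT_RATE_PER_M / 1_000_000
    output_cost = billed_output_tokens * O3_OUTPUT_RATE_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative numbers: a short prompt, a short answer,
    # and a much longer hidden chain of thought.
    cost = estimate_o3_cost(input_tokens=1_000,
                            visible_output_tokens=500,
                            reasoning_tokens=4_000)
    print(f"${cost:.4f}")  # prints "$0.0380"
```

In this made-up example the hidden reasoning accounts for $0.032 of the $0.038 total, which is why budgeting from visible output length alone tends to underestimate spend on reasoning models.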
o3 has been the production default for reasoning-class workloads on OpenAI since mid-2025. It supplanted o1 entirely (o1 is deprecated for new code as of June 2026) and reduced o1-pro to a niche tool for the hardest single-call decisions, where another 5-10% quality gain justifies a 60× price premium.