What o1-pro actually is (and why it's so expensive)
o1-pro is OpenAI's highest-effort reasoning configuration. The o1 family (first released as o1-preview in September 2024) introduced 'reasoning models': models that produce a long internal chain of thought before writing the visible answer. o1-pro takes the same base model and runs it with substantially more reasoning compute per query: more tokens of internal reasoning and longer wall-clock time per response, which buys higher quality on the hardest tasks at correspondingly higher cost.
On pricing, o1-pro is in a tier by itself. At $150 input / $600 output per 1M tokens, it is 120× more expensive than GPT-5 ($1.25/$10) on input and 60× more expensive on output. A 1,000-in / 500-out call costs `0.001 × $150 + 0.0005 × $600 = $0.15 + $0.30 = $0.45`, and that's before reasoning tokens.
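The per-call arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the prices are the ones quoted in this section, and the names `O1_PRO`, `GPT_5`, and `call_cost` are made up for the example.

```python
# Prices in USD per 1M tokens, copied from the figures quoted in this section.
# The dictionary and function names are illustrative, not part of any SDK.
O1_PRO = {"input": 150.00, "output": 600.00}
GPT_5 = {"input": 1.25, "output": 10.00}


def call_cost(input_tokens, output_tokens, price):
    """Return the USD cost of one call, ignoring hidden reasoning tokens."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # The same 1,000-in / 500-out call priced on both models.
    for name, price in (("o1-pro", O1_PRO), ("GPT-5", GPT_5)):
        print(f"{name}: ${call_cost(1_000, 500, price):.5f}")
```

Running the same token counts through both price tables shows how the gap plays out for a concrete call rather than per rate.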
Reasoning tokens are the hidden cost. o1-pro can burn 10,000-50,000+ reasoning tokens on a complex problem before writing the visible answer. Those tokens bill at the output rate ($600/M) but are not returned to you. A query with 30,000 reasoning tokens + 500 visible output tokens bills `30,500 × $600/M = $18.30` for the output portion alone. Across that 10,000-50,000 range, a single hard query costs roughly $6-30 in reasoning tokens.
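To see how reasoning tokens dominate the bill, here is a small sketch that adds hidden reasoning tokens to the visible output before applying the output rate. It assumes the $600/M rate quoted above and that reasoning tokens bill exactly like visible output, as this section describes. `output_cost` is a hypothetical helper, and the token counts are example inputs rather than measurements.

```python
# Output rate in USD per 1M tokens, as quoted in this section.
OUTPUT_PRICE_PER_M = 600.00


def output_cost(reasoning_tokens, visible_tokens, price_per_m=OUTPUT_PRICE_PER_M):
    """Return the USD billed for output: hidden reasoning plus visible tokens."""
    billed_tokens = reasoning_tokens + visible_tokens
    return billed_tokens * price_per_m / 1_000_000


if __name__ == "__main__":
    # Example reasoning budgets with a fixed 500-token visible answer.
    for reasoning in (10_000, 20_000, 30_000):
        cost = output_cost(reasoning, 500)
        print(f"{reasoning:>6} reasoning tokens -> ${cost:.2f}")
```

With a short visible answer, almost the entire output charge comes from tokens you never see, so any cost estimate for o1-pro should budget for reasoning tokens first.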
OpenAI launched o1-pro at this price point because the unit economics of pro-tier reasoning compute are extreme: the model appears to spend far more compute per prompt, in effect working through the same problem many times internally to find the best answer. As reasoning-model architectures have matured, prices have fallen: o3 at $2/$8 per 1M tokens is dramatically cheaper than o1-pro for many of the same workloads. Expect o1-pro's price to drop, or the model to be deprecated in favor of higher-quality successors, over 2026-2027.