The GPT-5 cost formula
Every GPT-5 call uses the same per-token math. No platform fee, no per-call fee, no minimum invoice. You pay for tokens in and tokens out, each at the chosen model's price per 1M tokens:
```
cost = (input_tokens / 1,000,000) × input_price_per_M
     + (output_tokens / 1,000,000) × output_price_per_M
```
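As a minimal sketch, the formula translates directly into a small Python helper. The function name `call_cost` and its parameter names are illustrative, not part of any OpenAI SDK. The example prices are assumptions drawn from this section: $30/1M output is the GPT-5.5 figure quoted here, and $5/1M input is inferred from the $0.50 cached rate being 10% of the standard input price.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one call: tokens in and tokens out, each at its per-1M price."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Illustrative prices only (see the note above); check the current price list
# before relying on them.
example = call_cost(input_tokens=10_000, output_tokens=2_000,
                    input_price_per_m=5.00, output_price_per_m=30.00)
print(f"${example:.4f}")
```

With these assumed prices, a call with 10,000 input tokens and 2,000 output tokens comes to about $0.11: $0.05 for input plus $0.06 for output.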
Two adjustments stack on top. Prompt-cache hits (portions of your input prefix that OpenAI cached because you sent them recently) bill at the cached-input rate, which is 10% of the standard input price across every GPT-5 tier. Long, stable system prompts and reused tool schemas are the typical winners. The Batch API takes a flat 50% off both input and output for asynchronous jobs delivered within 24 hours. The discounts compose: a cached + batched GPT-5.5 call pays $0.50/1M ÷ 2 = $0.25/1M on the cached portion of the input and $30/1M ÷ 2 = $15/1M on output.
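The way the two discounts stack can be sketched as follows. This is an illustration under the assumptions stated in the text (cached input at 10% of the standard input price, a flat 50% batch discount applied to everything), not an official billing implementation. The name `discounted_cost`, the constants, and the $5/1M input price (inferred from the $0.50 cached rate) are all hypothetical.

```python
CACHED_INPUT_FACTOR = 0.10  # cached input bills at 10% of the standard input price
BATCH_FACTOR = 0.50         # Batch API: flat 50% off both input and output


def discounted_cost(input_tokens: int, cached_tokens: int, output_tokens: int,
                    input_price_per_m: float, output_price_per_m: float,
                    batch: bool = False) -> float:
    """Dollar cost of one call with optional prompt-cache hits and batch pricing."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    cost = (
        (fresh_tokens / 1_000_000) * input_price_per_m
        + (cached_tokens / 1_000_000) * input_price_per_m * CACHED_INPUT_FACTOR
        + (output_tokens / 1_000_000) * output_price_per_m
    )
    # The batch discount applies on top of whatever the cache already saved.
    return cost * BATCH_FACTOR if batch else cost


# 1M fully cached input tokens plus 1M output tokens, batched, at the assumed
# GPT-5.5 prices: $0.25 for the cached input and $15 for the output.
total = discounted_cost(1_000_000, 1_000_000, 1_000_000, 5.00, 30.00, batch=True)
print(f"${total:.2f}")
```

Keeping cached and fresh input as separate terms makes it easy to see how much of a bill depends on the cache hit rate, which varies from call to call.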
On GPT-5.5 Pro, reasoning tokens generated internally before the visible response bill at the $180/1M output rate, the same as the answer text. A query that triggers 3,000 reasoning tokens to produce a 500-token answer bills 3,500 output tokens, or $0.63 at that rate. Budget for a 3–8× reasoning multiplier on Pro if the task is non-trivial. Standard GPT-5.5 and GPT-5.4 do not surface chain-of-thought; their output bill matches the response length.
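The reasoning-token example can be worked through in a few lines. This is a sketch that assumes, as the text states, that reasoning and answer tokens bill at the same $180/1M output rate. The function name `pro_output_cost` is hypothetical. The ratio it prints, billed tokens over visible answer tokens, is one plausible reading of the "reasoning multiplier", not a definition taken from OpenAI.

```python
def pro_output_cost(reasoning_tokens: int, answer_tokens: int,
                    output_price_per_m: float = 180.0) -> tuple[int, float]:
    """Return (billed output tokens, dollar cost) when reasoning bills as output."""
    billed_tokens = reasoning_tokens + answer_tokens
    cost = (billed_tokens / 1_000_000) * output_price_per_m
    return billed_tokens, cost


# The example from the text: 3,000 reasoning tokens for a 500-token answer.
billed, cost = pro_output_cost(reasoning_tokens=3_000, answer_tokens=500)
ratio = billed / 500  # billed output tokens per visible answer token
print(billed, f"${cost:.2f}", f"{ratio:.1f}x")
```

For budgeting, the same function can be run over a range of assumed reasoning lengths to bracket the likely output bill before committing a workload to Pro.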