The cost formula (memorize this one)
Every OpenAI API call follows the same math. There is no platform fee, no per-call fee, no minimum. You pay for what you send and what you get back, at the model's per-1M-token rate:
```
cost = (input_tokens  / 1,000,000) × input_price_per_M
     + (output_tokens / 1,000,000) × output_price_per_M
```
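As a minimal sketch, the formula translates directly into a small Python helper. The function name `call_cost` and the example rates ($2.50 input, $30 output per 1M tokens) are illustrative placeholders, not quotes from any price list; substitute the current rates for your model.

```python
TOKENS_PER_MILLION = 1_000_000


def call_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Return the dollar cost of one call at per-1M-token prices."""
    input_cost = input_tokens / TOKENS_PER_MILLION * input_price_per_m
    output_cost = output_tokens / TOKENS_PER_MILLION * output_price_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder rates: $2.50 per 1M input tokens, $30.00 per 1M output tokens.
    cost = call_cost(12_000, 800, input_price_per_m=2.50, output_price_per_m=30.00)
    print(f"${cost:.4f}")
```

With these placeholder rates, a 12,000-token prompt and an 800-token reply come to about $0.03 of input plus $0.024 of output.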
Two adjustments stack on top. First, prompt-cache hits bill at the cached-input rate (~10% of standard input). A cache hit is any portion of your input prefix that OpenAI has cached because you sent it in a recent prior call. Long system prompts and stable tool schemas are the typical winners; caching is opportunistic and applies automatically when a prefix matches, so it needs no code changes to activate. Second, the Batch API takes 50% off both input and output in exchange for a 24-hour-or-less delivery window. Because the discounts compound, a cached, batched call on gpt-5.5 bills its cached input at $0.25 ÷ 2 = $0.125/1M and its output at $30 ÷ 2 = $15/1M. How much of each discount you capture in practice depends on how you structure your prompts: the more of the prompt that sits in a stable, repeated prefix, the larger the cached share.
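The two adjustments can be sketched in Python as follows, assuming the discounts stack as described above. The `Rates` class and `discounted_cost` function are hypothetical names for illustration. The $0.25 cached-input and $30 output rates come from the example in the text; the $2.50 standard input rate is an assumption back-derived from the "~10%" ratio, so check it against the current price list.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000
BATCH_MULTIPLIER = 0.5  # Batch API: 50% off input and output, per the text


@dataclass(frozen=True)
class Rates:
    """Per-1M-token prices in dollars (illustrative values only)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def discounted_cost(
    rates: Rates,
    input_tokens: int,
    cached_tokens: int,
    output_tokens: int,
    batched: bool = False,
) -> float:
    """Cost of one call where `cached_tokens` of the input hit the prompt cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens / TOKENS_PER_MILLION * rates.input_per_m
        + cached_tokens / TOKENS_PER_MILLION * rates.cached_input_per_m
        + output_tokens / TOKENS_PER_MILLION * rates.output_per_m
    )
    # The batch discount applies to the whole bill, cached portion included.
    return cost * BATCH_MULTIPLIER if batched else cost


if __name__ == "__main__":
    rates = Rates(input_per_m=2.50, cached_input_per_m=0.25, output_per_m=30.00)
    for batched in (False, True):
        cost = discounted_cost(rates, 10_000, 8_000, 500, batched=batched)
        print(f"batched={batched}: ${cost:.4f}")
```

In this example, 8,000 of the 10,000 input tokens hit the cache, so most of the remaining cost is output. This is why the cache tends to matter most for prompt-heavy calls, while the batch discount helps every call equally.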
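Reasoning tokens on the o-series bill at the output rate even though they are not returned to you: a model that 'thinks' for 4,000 tokens before producing a 200-token answer is billed for 4,200 output tokens. As a planning figure, budget 5-10x the visible answer length in output tokens on reasoning-heavy tasks, and expect individual calls to exceed that (the example here is 21x).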
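A short sketch of the reasoning-token arithmetic, using the 4,000 + 200 token example above. The function name `reasoning_output_cost` is hypothetical, and the $30 per 1M output rate is a placeholder rather than an actual o-series price.

```python
TOKENS_PER_MILLION = 1_000_000


def reasoning_output_cost(
    visible_output_tokens: int,
    reasoning_tokens: int,
    output_price_per_m: float,
) -> tuple[int, float]:
    """Return (billed output tokens, output cost in dollars).

    Hidden reasoning tokens are billed at the same rate as visible output.
    """
    billed_tokens = visible_output_tokens + reasoning_tokens
    cost = billed_tokens / TOKENS_PER_MILLION * output_price_per_m
    return billed_tokens, cost


if __name__ == "__main__":
    visible, reasoning = 200, 4_000
    billed, cost = reasoning_output_cost(visible, reasoning, output_price_per_m=30.00)
    print(f"billed output tokens: {billed}")
    print(f"billed/visible ratio: {billed / visible:.0f}x")
    print(f"output cost: ${cost:.4f}")
```

Comparing billed tokens with visible tokens on a sample of real calls is one way to calibrate the budget multiplier for your own workload.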