Pricing: Gemini 2.5 Pro is cheaper, but only inside the 200K context bracket
**GPT-4o lists at $2.50/1M input and $10/1M output.** That's the same input price as GPT-5.4 and 40% of GPT-5.5's input price — GPT-4o is solidly mid-tier in the 2026 OpenAI lineup.
**Gemini 2.5 Pro lists at $1.25/1M input and $10/1M output** for prompts under 200K tokens. That's half GPT-4o's input price at the same output price — a clean win on cost for any workload that fits in 200K of context.
**Above 200K context, Gemini's input price doubles to $2.50/1M and its output price rises 1.5x to $15/1M.** This matters: the headline 1M-token context window is real capability, but it isn't free. A long prompt costs more per token than a short one, so check which bracket a workload lands in before estimating its cost.
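The two-bracket pricing can be sketched as a small cost function. This is a minimal illustration using the list prices quoted above, not an official billing formula. The function name and constants are hypothetical, and it assumes that a prompt over the threshold moves the whole call (input and output) to the higher rates, and that a prompt of exactly 200K tokens stays in the lower bracket.

```python
# List prices in USD per 1M tokens, copied from the figures quoted in the text.
LONG_CONTEXT_THRESHOLD = 200_000  # prompt tokens

SHORT_RATES = {"input": 1.25, "output": 10.00}
LONG_RATES = {"input": 2.50, "output": 15.00}


def gemini_call_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one uncached call under two-bracket pricing."""
    # Assumption: the bracket is chosen by prompt size alone, and a prompt of
    # exactly 200K tokens is still billed at the short-context rates.
    rates = LONG_RATES if input_tokens > LONG_CONTEXT_THRESHOLD else SHORT_RATES
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Same 1K-token answer, two prompt sizes on either side of the threshold.
print(f"150K prompt: ${gemini_call_cost(150_000, 1_000):.4f}")
print(f"300K prompt: ${gemini_call_cost(300_000, 1_000):.4f}")
```

Under these assumptions, doubling the prompt from 150K to 300K tokens roughly quadruples the input cost, because both the token count and the per-token rate double.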
**Gemini 2.5 Pro's cache-read discount is 75%,** which drops cached input to about $0.31/1M (short context) or $0.625/1M (long context). That is aggressive, and second only to Anthropic's 90% cache-read discount on Claude.
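**OpenAI's 50% prompt-cache hit discount on GPT-4o** drops cached input to $1.25/1M, which is exactly Gemini's uncached short-context input price. Caching helps both, but Gemini applies a larger percentage discount to a lower list price, so its cached short-context input costs a quarter of GPT-4o's.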
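The cached input rates follow directly from the list prices and discount percentages quoted above. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def cached_input_rate(list_rate: float, discount: float) -> float:
    """USD per 1M cached input tokens, given a list rate and a fractional discount."""
    return list_rate * (1 - discount)


# List rates and discounts are the figures quoted in the text.
rates = {
    "GPT-4o": cached_input_rate(2.50, 0.50),
    "Gemini 2.5 Pro (short context)": cached_input_rate(1.25, 0.75),
    "Gemini 2.5 Pro (long context)": cached_input_rate(2.50, 0.75),
}

for name, rate in rates.items():
    print(f"{name}: ${rate:.4f} per 1M cached input tokens")
```

This covers cache reads only. Any cache write or storage charges are outside the figures given here and are not modeled.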
**On a typical 5K-input, 1K-output call**: GPT-4o uncached costs $0.0225. Gemini 2.5 Pro uncached (short context) costs $0.01625, about 28% cheaper. With the entire input served from cache, the totals fall to $0.01625 and roughly $0.0116, so the gap narrows from about 0.6 cents to under half a cent per call; the $10/1M output price, identical on both, dominates either way. At 100K calls/month, that's a difference of about $7.5K/year uncached and about $5.6K/year fully cached. **Cost is rarely the deciding factor** at the scale most teams operate; capability differences matter more.
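The per-call arithmetic can be reproduced with a few lines of code, which also makes it easy to plug in a different token mix or call volume. This is a sketch under stated assumptions: the function names are hypothetical, the rates are the list prices quoted above, the cached case treats the entire prompt as a cache hit (an upper bound on the saving), and the monthly volume is an illustrative parameter.

```python
def call_cost(input_tokens, output_tokens, input_rate, output_rate,
              cached_fraction=0.0, cache_discount=0.0):
    """USD cost of one call; rates are USD per 1M tokens."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached tokens are billed at the discounted input rate, the rest at list price.
    input_cost = fresh * input_rate + cached * input_rate * (1 - cache_discount)
    return (input_cost + output_tokens * output_rate) / 1_000_000


def annual_gap(cost_a, cost_b, calls_per_month):
    """Yearly USD difference between two per-call costs at a given monthly volume."""
    return (cost_a - cost_b) * calls_per_month * 12


CALLS_PER_MONTH = 100_000  # hypothetical volume

gpt4o = call_cost(5_000, 1_000, 2.50, 10.00)
gemini = call_cost(5_000, 1_000, 1.25, 10.00)
print(f"Uncached: GPT-4o ${gpt4o:.5f}, Gemini 2.5 Pro ${gemini:.5f}")
print(f"Gemini saving: {1 - gemini / gpt4o:.0%}")
print(f"Annual gap: ${annual_gap(gpt4o, gemini, CALLS_PER_MONTH):,.0f}")

# Assumption: 100% of the prompt is a cache hit on both providers.
gpt4o_cached = call_cost(5_000, 1_000, 2.50, 10.00,
                         cached_fraction=1.0, cache_discount=0.50)
gemini_cached = call_cost(5_000, 1_000, 1.25, 10.00,
                          cached_fraction=1.0, cache_discount=0.75)
print(f"Cached: GPT-4o ${gpt4o_cached:.5f}, Gemini 2.5 Pro ${gemini_cached:.5f}")
print(f"Annual gap: ${annual_gap(gpt4o_cached, gemini_cached, CALLS_PER_MONTH):,.0f}")
```

Because output tokens are priced identically in the short-context bracket, the output share of the call sets a floor that caching cannot reduce. Workloads with long outputs will see a smaller percentage gap than this example.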