Claude Opus 4.x: exact pricing and what you are paying for
Claude Opus 4.x is Anthropic's top-tier model for complex reasoning, extended agentic tasks, and high-stakes code generation. As of June 2026, it is priced at $15.00 per million input tokens and $75.00 per million output tokens via the Anthropic Messages API (source). The model supports a 200,000-token context window — one of the largest among commercially available frontier models — making it attractive for tasks that require ingesting full codebases, long contracts, or extended research documents.
To put those prices in human terms: a single 4,000-token input prompt (roughly 3,000 words of context) costs $0.06 at the standard rate (4,000 × $15.00 / 1,000,000). A 2,000-token output response (around 1,500 words) costs $0.15. A complete round trip (one request in, one response out) at those sizes costs about $0.21. For an application making 10,000 such calls per month, that is $2,100 per month before any optimization. That number is why the sections below on caching, batching, and model tiering matter so much at Opus prices.
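The arithmetic above can be reproduced with a short script. This is a minimal sketch, not an official calculator: the per-million-token prices are the figures quoted in this article, and the names request_cost, INPUT_USD_PER_MTOK, and OUTPUT_USD_PER_MTOK are made up for illustration. Decimal is used so the dollar amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Prices as quoted in this article (USD per million tokens).
# Treat them as assumptions and check current pricing before relying on them.
INPUT_USD_PER_MTOK = Decimal("15.00")
OUTPUT_USD_PER_MTOK = Decimal("75.00")
MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the standard (non-batch) rate."""
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / MILLION
    return input_cost + output_cost


# The round trip from the text: 4,000 tokens in, 2,000 tokens out.
per_call = request_cost(4_000, 2_000)

# The monthly workload from the text: 10,000 such calls.
monthly = per_call * 10_000

print(f"per call: ${per_call:.2f}")
print(f"monthly:  ${monthly:,.2f}")
```

Changing the two token counts or the call volume is enough to estimate a different workload; output tokens dominate the bill because they cost five times as much as input tokens.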
Opus 4.x operates under Anthropic's standard API rate limits: 2,000 requests per minute and 80,000,000 input tokens per minute for Tier 4 accounts. Tier 1 accounts (new API keys) start significantly lower — 50 requests per minute and 50,000 input tokens per minute — and advance through usage milestones. The full rate limit schedule is documented in Anthropic's rate limits reference.
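Those two limits interact: whichever runs out first caps throughput. The sketch below estimates the sustainable request rate for the 4,000-token prompt used earlier, using only the limits quoted in this article. The function name max_requests_per_minute is hypothetical, and the estimate deliberately ignores output-token limits and any other quotas, so treat it as a rough upper bound.

```python
def max_requests_per_minute(
    rpm_limit: int,
    input_tpm_limit: int,
    input_tokens_per_request: int,
) -> int:
    """Upper bound on requests per minute given a request cap and an input-token cap.

    Simplification: only the two limits quoted in the article are modeled.
    """
    # How many whole requests fit inside the per-minute input-token budget.
    token_bound = input_tpm_limit // input_tokens_per_request
    # The tighter of the two limits is the one that applies.
    return min(rpm_limit, token_bound)


# Tier 1 figures from the text: 50 requests/min, 50,000 input tokens/min.
tier1 = max_requests_per_minute(50, 50_000, 4_000)

# Tier 4 figures from the text: 2,000 requests/min, 80,000,000 input tokens/min.
tier4 = max_requests_per_minute(2_000, 80_000_000, 4_000)

print(f"Tier 1: {tier1} requests/min")
print(f"Tier 4: {tier4} requests/min")
```

Under these assumptions a Tier 1 key is bound by tokens rather than requests (12 requests per minute at 4,000 input tokens each), while a Tier 4 key hits the 2,000 requests-per-minute cap long before the token budget.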
Opus 4.x is also available through Anthropic's Message Batches API at 50% off both input and output tokens for workloads that can tolerate up to 24-hour processing windows. That brings the effective rate to $7.50 per million input and $37.50 per million output — still above standard GPT-5 pricing, but a meaningful cut for latency-tolerant Opus workloads.
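As a check on the batch figures, the following sketch applies the 50% discount to the standard rates and re-prices the earlier 10,000-call workload. The prices and the discount are the ones stated in this article; the names batch_rates and workload_cost are illustrative only and are not part of any Anthropic SDK.

```python
from decimal import Decimal

# Standard rates and batch discount as quoted in this article (assumptions).
STANDARD_USD_PER_MTOK = {"input": Decimal("15.00"), "output": Decimal("75.00")}
BATCH_DISCOUNT = Decimal("0.50")
MILLION = Decimal(1_000_000)


def batch_rates(standard: dict, discount: Decimal) -> dict:
    """Apply a flat fractional discount to every per-million-token rate."""
    return {kind: price * (1 - discount) for kind, price in standard.items()}


def workload_cost(rates: dict, calls: int, input_tokens: int, output_tokens: int) -> Decimal:
    """USD cost of `calls` identical requests at the given rates."""
    per_call = (
        Decimal(input_tokens) * rates["input"]
        + Decimal(output_tokens) * rates["output"]
    ) / MILLION
    return per_call * calls


rates = batch_rates(STANDARD_USD_PER_MTOK, BATCH_DISCOUNT)

# Same workload as earlier: 10,000 calls of 4,000 tokens in and 2,000 tokens out.
standard_monthly = workload_cost(STANDARD_USD_PER_MTOK, 10_000, 4_000, 2_000)
batch_monthly = workload_cost(rates, 10_000, 4_000, 2_000)

print(f"batch input rate:  ${rates['input']:.2f} per million tokens")
print(f"batch output rate: ${rates['output']:.2f} per million tokens")
print(f"standard monthly:  ${standard_monthly:,.2f}")
print(f"batch monthly:     ${batch_monthly:,.2f}")
```

For that workload the monthly bill drops from $2,100 to $1,050, provided every call can wait for the batch processing window.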