Azure OpenAI vs OpenAI direct — why the quota systems aren't comparable
**OpenAI direct** rate-limits accounts on a single global tier ladder (Free → Tier 5) tied to cumulative paid usage and days-since-first-payment. The same Tier 5 account inherits the same RPM/TPM ceilings on every model in every region. Promotion is automatic; the only knobs are 'wait' or 'sign an enterprise contract'.
**Azure OpenAI** rate-limits *subscriptions* on a per-region, per-model, per-deployment-type basis. The same subscription can have **5,000,000 TPM on gpt-4.1 Global Standard in East US** and simultaneously **300,000 TPM on gpt-4.1 Data Zone Standard in Sweden Central** — no inheritance between them. The throughput you can deliver is the sum across all subscriptions × regions × deployment types your application is wired to.
Microsoft introduced **Quota Tiers** in late 2025 — seven tiers (Tier 0 through Tier 6) that scale automatically with consumption, layered on top of the per-region/per-model/per-SKU scoping. A Tier 1 subscription with gpt-5.5 sees **0 TPM defaults** until it requests allocation; a Tier 5 subscription sees **10M TPM Global Standard** out of the box. Enterprise Agreement (EA) or MCA-E customers get assigned to higher tiers based on Microsoft relationship status, not just consumption.
**Practical implication**: on OpenAI direct, capacity is bound to who you are. On Azure, capacity is bound to where (region) × what (model) × how (SKU) × who (tier). Architecting around Azure means picking the SKU and region mix that fits your needs, not waiting on a single tier promotion.