What RPM, TPM, and TPD Mean — and Which One Is Actually Constraining You
OpenAI enforces three independent rate limits on embedding API calls. RPM (requests per minute) limits how many API calls you can make, regardless of their size. TPM (tokens per minute) limits the total token volume across all requests in a rolling one-minute window. TPD (tokens per day) limits total token consumption in a 24-hour period. When you hit any one of these limits, the API returns a 429 Too Many Requests error, and the response indicates which limit was exceeded and when it resets.
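To see which per-minute limit is closest to binding before a 429 occurs, one option is to compare the remaining quota reported in the response headers. The sketch below is a minimal illustration, not a definitive implementation. It assumes the `x-ratelimit-limit-*` and `x-ratelimit-remaining-*` header names that OpenAI documents at the time of writing, and the helper name `tightest_dimension` is hypothetical. It covers only the per-minute request and token headers and does not attempt to track TPD.

```python
from typing import Mapping, Optional

# Assumption: these header names match OpenAI's documented rate limit headers.
# Verify them against the current API reference before relying on this.
_DIMENSIONS = {
    "requests": ("x-ratelimit-limit-requests", "x-ratelimit-remaining-requests"),
    "tokens": ("x-ratelimit-limit-tokens", "x-ratelimit-remaining-tokens"),
}


def tightest_dimension(headers: Mapping[str, str]) -> Optional[str]:
    """Return "requests" or "tokens", whichever has the smallest fraction of
    quota remaining, or None if no usable headers are present."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    best_name: Optional[str] = None
    best_fraction: Optional[float] = None
    for name, (limit_key, remaining_key) in _DIMENSIONS.items():
        try:
            limit = int(lowered[limit_key])
            remaining = int(lowered[remaining_key])
        except (KeyError, ValueError):
            continue  # header absent or not an integer; skip this dimension
        if limit <= 0:
            continue
        fraction = remaining / limit
        if best_fraction is None or fraction < best_fraction:
            best_name, best_fraction = name, fraction
    return best_name
```

Logging this value after each successful call gives an early signal of whether RPM or TPM will be hit first under the current traffic pattern.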
Which limit binds in practice depends on your batching strategy. The OpenAI embeddings API accepts batches of up to 2048 input strings in a single request. A developer sending one string per request will hit RPM before TPM: at 3,000 RPM with an average of 100 tokens per request, they max out at 300,000 tokens per minute, well below the 1,000,000 TPM ceiling. The same developer batching 200 strings per request (each ~100 tokens, so 20,000 tokens per request) would need only 15 requests per minute to generate 300,000 tokens per minute, using 0.5% of their RPM but 30% of their TPM. In general, maximizing batch size is the right strategy for avoiding RPM limits. The embedding API is designed for batch inputs, and single-string requests waste request quota and add per-call overhead.
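The arithmetic above can be sketched as a small calculator that reports which per-minute limit binds for a given batch size. This is an illustrative sketch only: the `Throughput` class and `max_throughput` function are hypothetical names, and the limits passed in are the example figures from the text, not values read from an account.

```python
from dataclasses import dataclass


@dataclass
class Throughput:
    binding_limit: str         # "RPM" or "TPM"
    max_tokens_per_min: int    # highest sustainable token rate
    max_requests_per_min: int  # request rate needed to reach it


def max_throughput(rpm_limit: int, tpm_limit: int,
                   strings_per_request: int, tokens_per_string: int) -> Throughput:
    """Work out which per-minute limit caps throughput for a batch size."""
    tokens_per_request = strings_per_request * tokens_per_string
    # Token volume if every allowed request were sent.
    rpm_bound_tokens = rpm_limit * tokens_per_request
    if rpm_bound_tokens <= tpm_limit:
        # Requests run out before tokens do.
        return Throughput("RPM", rpm_bound_tokens, rpm_limit)
    # Tokens run out first: only this many whole requests fit in the budget.
    requests = tpm_limit // tokens_per_request
    return Throughput("TPM", requests * tokens_per_request, requests)


single = max_throughput(3_000, 1_000_000, strings_per_request=1, tokens_per_string=100)
batched = max_throughput(3_000, 1_000_000, strings_per_request=200, tokens_per_string=100)
print(single)
print(batched)
```

With the example limits, single-string requests are capped by RPM at 300,000 tokens per minute, while batches of 200 strings are capped by TPM and reach the full 1,000,000 tokens per minute with only 50 requests.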
TPD is the most relevant constraint for teams running large initial corpus indexing jobs on Tier 1 accounts. At 3,000,000 tokens per day and an average chunk size of 300 tokens, a Tier 1 account can index approximately 10,000 chunks per day through the synchronous API. At that rate, a corpus of 100,000 chunks takes 10 days to index without the Batch API. Teams that need to go faster must either upgrade their tier (which requires enough spending to reach Tier 2 or above) or use the Batch API, which has a separate daily quota. The Batch API is almost always the right answer for initial indexing workloads, as discussed in the Batch API section below.
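The indexing estimate is simple enough to check in a few lines. The following is a minimal sketch using the figures quoted above. The function names are hypothetical, and the calculation assumes the full daily quota is available for indexing with no other traffic on the account.

```python
def chunks_per_day(tokens_per_day: int, tokens_per_chunk: int) -> int:
    """Whole chunks that fit in one day's token quota."""
    return tokens_per_day // tokens_per_chunk


def days_to_index(num_chunks: int, tokens_per_chunk: int, tokens_per_day: int) -> int:
    """Days needed to embed a corpus under a daily token quota, rounded up."""
    total_tokens = num_chunks * tokens_per_chunk
    # Negated floor division gives integer ceiling division.
    return -(-total_tokens // tokens_per_day)


daily_capacity = chunks_per_day(3_000_000, 300)
days_needed = days_to_index(100_000, 300, 3_000_000)
print(daily_capacity, days_needed)
```

Rounding up matters for planning: a corpus that overshoots the quota by even a single chunk spills into an additional day.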