429

Too Many Requests

The user has sent too many requests in a given amount of time (rate limiting).

Quick Definition

The user has sent too many requests in a given amount of time ("rate limiting"). The server is rejecting requests to protect itself from being overwhelmed. This response is the standard way servers enforce rate limits, and it may include a Retry-After header indicating how long the client should wait before making another request.

When It Occurs

A 429 error occurs when you exceed the server's configured rate limit. Most APIs set limits on how many requests a client can make within a time window (e.g., 100 requests per minute, 1000 requests per hour). Once you exceed this threshold, every subsequent request is rejected with a 429 until the rate limit window resets.

This is a deliberate protective mechanism. Without rate limiting, a single client could overwhelm the server, causing degraded performance for all users. You'll see this when scraping websites, making rapid API calls, or when your application has a bug causing excessive requests.

Common Causes

  • Exceeding API rate limits - Making more requests than the API's documented limit allows
  • Aggressive web scraping - Crawling a website too quickly without respecting rate limits or robots.txt
  • Bot activity - Automated scripts or bots making rapid-fire requests
  • Misconfigured polling interval - Polling an API endpoint too frequently (e.g., every 100ms instead of every 10s)
  • Burst of requests from a single client - Sending many concurrent requests at once instead of spacing them out
  • DDoS mitigation triggering - Server's DDoS protection flagging your legitimate traffic as an attack
  • Shared IP hitting rate limits - Multiple users behind the same IP (NAT, VPN, corporate network) collectively exceeding limits
  • Missing request caching/debouncing - Not caching responses or debouncing user actions, causing redundant requests
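Several of the causes above come down to pacing: sending bursts instead of spacing requests out. One common client-side fix is a token bucket, sketched here in Python (the rate and capacity values are illustrative, not tied to any particular API):

```python
import time

class TokenBucket:
    """Client-side pacing: allow `rate` requests/second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, capacity=5)  # ~2 requests/second, bursts of up to 5
```

A caller checks `bucket.acquire()` before each request and sleeps briefly when it returns False, which smooths bursts into a steady rate the server will accept.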

Platform-Specific Notes:

Nginx: Uses the limit_req module with the limit_req_zone directive. Configure burst and nodelay for flexible rate limiting.
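A minimal sketch of what that nginx configuration can look like (zone name, size, rate, and the `backend` upstream are all illustrative):

```nginx
http {
    # Track clients by IP: allow 10 requests/second, state kept in a 10 MB shared zone
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
    limit_req_status 429;  # nginx returns 503 by default; send 429 instead

    server {
        location /api/ {
            # Permit bursts of up to 20 extra requests without delaying them
            limit_req zone=api_limit burst=20 nodelay;
            proxy_pass http://backend;
        }
    }
}
```

Note that `limit_req_status 429;` matters: without it, nginx rejects throttled requests with 503, which clients may misread as a server outage rather than rate limiting.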

Apache: Uses mod_ratelimit or mod_evasive for rate limiting. Configure per-IP or per-session limits.

Cloudflare: Rate Limiting rules can be configured in the dashboard. Also triggers automatically via DDoS protection and Bot Management.

Node.js: Common middleware includes express-rate-limit and rate-limiter-flexible. Configure per route or globally, with Redis-backed stores for distributed systems.

🛠 How to Fix

  1. Check the Retry-After header - The response should tell you exactly how many seconds to wait before retrying
  2. Implement exponential backoff - Double the wait time between retries: 1s, 2s, 4s, 8s, 16s... with random jitter
  3. Cache API responses to reduce calls - Store responses locally and reuse them instead of making duplicate requests
  4. Debounce rapid client requests - Prevent rapid-fire requests from user interactions (e.g., search-as-you-type)
  5. Use API pagination efficiently - Request larger page sizes to get more data per request instead of many small requests
  6. Request a higher rate limit - Contact the API provider about upgrading to a plan with higher limits
  7. Distribute requests across time windows - Spread your requests evenly over time instead of sending bursts
  8. Use webhooks instead of polling - Subscribe to event notifications instead of repeatedly checking for changes
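Steps 1 and 2 above combine naturally: honor Retry-After when the server provides it, and fall back to exponential backoff with jitter when it doesn't. A minimal sketch in Python using only the standard library (the URL is a placeholder):

```python
import random
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, retry_after: "str | None") -> float:
    """Seconds to wait: honor a numeric Retry-After header, else 2**attempt plus jitter."""
    if retry_after and retry_after.isdigit():
        return float(retry_after)  # server told us exactly how long to wait
    # Exponential backoff: 1s, 2s, 4s, 8s... plus random jitter to avoid thundering herd
    return 2 ** attempt + random.uniform(0, 0.5)

def fetch_with_backoff(url: str, max_retries: int = 5) -> bytes:
    """GET `url`, retrying 429 responses with backoff_delay between attempts."""
    for attempt in range(max_retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_retries - 1:
                raise  # not rate-limited, or out of retries
            time.sleep(backoff_delay(attempt, err.headers.get("Retry-After")))
    raise AssertionError("unreachable")
```

This only retries on 429; other errors propagate immediately, since backing off won't fix a 404 or a 401.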

💻 HTTP Example

# Client exceeds rate limit (101st request in 1 minute)
GET /api/data HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiIs...

# Server Response
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1708444800

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. You've made 101 requests in the last 60 seconds (limit: 100).",
  "retry_after": 30,
  "statusCode": 429
}

# Exponential backoff pseudocode:
# attempt 1: wait 1s + random(0-500ms)
# attempt 2: wait 2s + random(0-500ms)
# attempt 3: wait 4s + random(0-500ms)
# attempt 4: wait 8s + random(0-500ms)

Frequently Asked Questions

How long should I wait after a 429 error?
Check the Retry-After header in the 429 response - it tells you exactly how many seconds to wait (e.g., Retry-After: 30 means wait 30 seconds) or provides a specific date/time. If no Retry-After header is present, use exponential backoff: start with 1 second, then double it each retry (2s, 4s, 8s, 16s...) with some random jitter added. Most APIs also include rate limit headers like X-RateLimit-Reset that tell you exactly when your rate limit window resets.
Can I increase my API rate limit?
Yes, most API providers offer higher rate limits on paid or enterprise plans. Contact the API provider about upgrading to a higher tier. Many providers also have dedicated plans for high-volume users or partners. In the meantime, optimize your current usage by caching API responses, batching requests where supported, using pagination efficiently, and switching from polling to webhooks for real-time data.
What is exponential backoff?
Exponential backoff is a retry strategy where you progressively double the wait time between each retry attempt. For example: wait 1 second after the first failure, 2 seconds after the second, 4 seconds after the third, 8 seconds after the fourth, and so on, usually up to a maximum cap (e.g., 60 seconds). Adding random "jitter" (a small random delay) prevents the "thundering herd" problem where multiple clients all retry at the exact same moment, causing another spike of requests. This is considered a best practice for handling 429 and 503 errors.
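The caching advice above is often the cheapest way to stay under a limit: a response that is served from cache costs zero requests. A minimal time-based cache sketch in Python (class and method names are illustrative):

```python
import time

class TTLCache:
    """Cache values for `ttl` seconds so repeated lookups don't hit the API."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (timestamp, value)

    def get_or_fetch(self, key, fetch):
        """Return a cached value if still fresh; otherwise call `fetch()` and cache it."""
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]               # cache hit: no request made
        value = fetch()                   # cache miss: one real request
        self._store[key] = (now, value)
        return value

cache = TTLCache(ttl=60)  # reuse responses for up to a minute
```

Pick a TTL that matches how stale your data can afford to be: even a few seconds of caching can collapse a burst of identical requests into one.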
