If you see "Claude API Error: Rate limit reached", stop retrying long enough to identify the route owner. Direct Anthropic API requests should follow the HTTP 429 response headers and Console limits; Claude Code should start with /status and active credential checks; Bedrock, Vertex AI, and gateways belong to their own quota and log surfaces.
| Surface you are using | Likely owner | First move | Proof signal | Next step |
|---|---|---|---|---|
| Direct Anthropic API | Anthropic workspace and model limit | Pause for the retry-after interval, then reduce request shape | HTTP 429, rate_limit_error, anthropic-ratelimit headers | Retry one smaller request on the same route |
| Claude Code with an API key | The API key's workspace | Run /status and confirm the key path | Claude Code status plus API logs or 429 body | Lower Claude Code API usage or inspect Console limits |
| Claude Code with subscription auth | Plan/session window | Do not inspect API headers first | Message mentions session, period, or plan window | Use the Claude Code usage-limit guide |
| AWS Bedrock or Vertex AI | Cloud provider project | Open the provider quota page | Provider 429, ThrottlingException, region quota | Adjust provider quota or regional capacity |
| Third-party gateway | Gateway tenant or upstream route | Check gateway logs before blaming Anthropic | Gateway 429 or tenant policy hit | Apply gateway limits or contact provider support |
| Burst or acceleration control | Your traffic shape | Smooth the ramp and queue requests | Spike in RPS/concurrency while quota remains | Back off, buffer, and verify the same route |
The stop rule is simple: do not rotate keys, upgrade a plan, switch providers, or run a retry loop until the owner is proven. Changing routes mid-debug resets the signal and makes support evidence weaker.
First, identify the route that produced the limit
The phrase "rate limit reached" is not enough. It tells you a request was blocked by some limit, not who owns that limit. A server calling api.anthropic.com has a different evidence path from Claude Code running under an API key. Claude Code using subscription auth has a different evidence path again. A Bedrock, Vertex AI, reverse proxy, or gateway request can return limit-like wording while Anthropic Console still looks healthy.
Start with three questions. Which credential processed the failing request? Which dashboard owns that credential? Can you reproduce once on the same route without changing model, provider, prompt, or region? If the answer changes between attempts, you are no longer debugging the same failure.

When the terminal line is not a true API 429, use the companion guide that separates the 500, 529, and plan-window branches: Claude Code 500 vs 529 vs Rate Limit. For rate-limit handling specific to Claude Code, keep its dedicated page nearby: Claude Code rate limit.
Direct Anthropic API: trust headers before guessing
For direct Anthropic API traffic, HTTP 429 maps to the official rate_limit_error class. The useful evidence is not a forum comment or a static tier table; it is the response body, retry-after when present, and the anthropic-ratelimit header families. Requests per minute, input tokens per minute, and output tokens per minute can each be the tight bucket. Monthly spend or account usage can still look available while a rolling minute bucket blocks the next call.
The next request should be smaller and slower, not louder. Respect retry-after, reduce concurrency, cap max output, split large jobs, cache stable context, and retry one request on the same model and route. If the repeated request still fails, keep the request id and headers instead of changing variables again.
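As a minimal sketch of reading that evidence, the helper below summarizes retry-after and the per-bucket headers from a 429 response and picks the tightest bucket. The header names follow the anthropic-ratelimit-{bucket}-{limit|remaining|reset} pattern and the RFC 3339 reset format as an assumption; confirm them against a real response before relying on them. The function names are hypothetical.

```python
from datetime import datetime, timezone

# Assumed bucket families; verify the exact header names on a real 429 response.
BUCKETS = ("requests", "input-tokens", "output-tokens")


def summarize_rate_limit(headers, now=None):
    """Collect retry-after and per-bucket limit/remaining/reset from response headers."""
    now = now or datetime.now(timezone.utc)
    h = {k.lower(): v for k, v in headers.items()}
    summary = {"retry_after_s": None, "buckets": {}}
    if "retry-after" in h:
        try:
            summary["retry_after_s"] = float(h["retry-after"])
        except ValueError:
            pass  # leave as None if the value is not a plain number of seconds
    for bucket in BUCKETS:
        prefix = f"anthropic-ratelimit-{bucket}-"
        limit = h.get(prefix + "limit")
        remaining = h.get(prefix + "remaining")
        reset = h.get(prefix + "reset")
        if limit is None or remaining is None:
            continue  # this bucket was not reported
        info = {"limit": int(limit), "remaining": int(remaining), "reset_in_s": None}
        if reset:
            reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
            info["reset_in_s"] = max(0.0, (reset_at - now).total_seconds())
        summary["buckets"][bucket] = info
    return summary


def tightest_bucket(summary):
    """Return the bucket with the lowest remaining/limit ratio, or None if none reported."""
    buckets = summary["buckets"]
    if not buckets:
        return None
    return min(buckets, key=lambda b: buckets[b]["remaining"] / max(1, buckets[b]["limit"]))
```

Log the full summary alongside the request id so the same numbers are available if the single same-route retry fails again.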

Claude Code: check the active route before changing plans
Claude Code adds a route layer. If ANTHROPIC_API_KEY is set, your terminal can be using API-key billing even when you think you are using a Pro or Max subscription. Run /status, inspect the active auth path, and decide whether the failing request belongs to the API key workspace or the subscription session. For billing ownership details, use Claude Code API key vs subscription billing; for setup boundaries, use Claude Code API configuration.
Do not upgrade a subscription to fix an API-key 429. Do not inspect Anthropic API headers for a subscription session window. Route ownership decides the next evidence source.
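A quick pre-check can hint at which route a terminal is likely on before you open /status. The sketch below only looks at whether ANTHROPIC_API_KEY is present in the environment; the function name and return labels are hypothetical, and /status remains the authority on the active auth path.

```python
import os


def likely_claude_code_route(env=None):
    """Guess the billing route from the environment; always confirm with /status."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        # A set key can override the subscription path you expected to be using.
        return "api-key workspace"
    return "subscription session or other route"


if __name__ == "__main__":
    print(likely_claude_code_route())
```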
Why usage can look available while the next request is blocked
Rate limits are usually rolling buckets, not a single monthly counter. A dashboard can show remaining budget while a short window is exhausted. A long context request can hit input-token pressure; a verbose answer can hit output-token pressure; many small parallel calls can hit request-per-minute pressure. Acceleration controls can also slow a sudden ramp even if the long-term quota remains.
That is why the first move is not "buy more" or "wait forever." It is to identify the bucket and relieve pressure there. If RPM is tight, queue requests. If input tokens are tight, split the job or cache stable context. If output tokens are tight, lower max output and ask for staged answers. If the provider route owns the cap, move the proof to that provider's quota page.
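The bucket-to-action mapping above is small enough to keep in a runbook as a lookup table. This is an illustrative sketch; the bucket labels and the function name are hypothetical, and the actions are the ones described in this section.

```python
# Relief action per tight bucket, taken from the guidance above.
RELIEF = {
    "rpm": "queue requests and lower concurrency",
    "input_tokens": "split the job or cache stable context",
    "output_tokens": "lower max output and ask for staged answers",
    "provider": "move the proof to the provider's quota page",
}


def next_move(bucket):
    """Return the relief action for a bucket, or a reminder to identify it first."""
    return RELIEF.get(bucket, "identify the tight bucket before changing anything")
```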
Fix the next request without making the signal worse
Make one controlled change. Add exponential backoff with jitter. Limit parallel workers per workspace, model, and route. Keep a retry budget so the client stops before it creates a second incident. Log response status, request id, route owner, model, workspace, region, retry-after, and remaining/reset headers when available.
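One way to combine backoff, jitter, and a retry budget is sketched below. It uses the "full jitter" variant of exponential backoff and never waits less than retry-after when the server supplies one. The RateLimited exception, the send callable, and the default values are hypothetical placeholders for whatever your client raises and calls.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception your client wrapper raises on a 429."""

    def __init__(self, retry_after_s=None):
        super().__init__("rate limited")
        self.retry_after_s = retry_after_s


def backoff_delay(attempt, retry_after_s=None, base_s=1.0, cap_s=60.0, rng=random.random):
    """Full-jitter exponential backoff that respects retry-after as a floor."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after_s is not None:
        delay = max(delay, retry_after_s)
    return delay


def call_with_budget(send, max_retries=4, sleep=time.sleep, rng=random.random):
    """Call send(); on RateLimited, back off and retry until the budget is spent."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_retries:
                raise  # budget exhausted: stop instead of creating a second incident
            sleep(backoff_delay(attempt, exc.retry_after_s, rng=rng))
```

Injecting sleep and rng keeps the retry policy testable without real waiting, and the hard stop at max_retries is what turns a retry loop into a retry budget.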
The verification request should stay on the same route. Same route means the same provider, credential, workspace or project, model, region, and workload shape. If you change three variables and the error disappears, you have a workaround but not a diagnosis.
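To make "same route" checkable rather than a judgment call, you can record the route as a small value object and diff it between the failing request and the verification request. The Route type and its field names below are hypothetical; they mirror the list in the paragraph above.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Route:
    provider: str
    credential_id: str  # a label or key suffix, never the secret itself
    workspace: str
    model: str
    region: str
    workload: str


def changed_fields(before, after):
    """List the route fields that differ between two attempts."""
    a, b = asdict(before), asdict(after)
    return [name for name in a if a[name] != b[name]]


if __name__ == "__main__":
    failing = Route("anthropic", "key-prod-01", "ws-main", "model-x", "us", "batch-summaries")
    retry = replace(failing, model="model-y", region="eu")
    print(changed_fields(failing, retry))  # a non-empty list means this is not a same-route retry
```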
Provider or gateway-owned limits
When Claude is accessed through Bedrock, Vertex AI, or a gateway, Anthropic Console may not be the source of truth for the failing request. Bedrock can own service quotas and regional throughput. Vertex AI can own project, location, and model quota. A gateway can own tenant policy, upstream routing, per-key throttles, or its own safety queue.
The right question is: which system accepted the credential and emitted the 429? If it was not api.anthropic.com, gather logs from the provider or gateway before opening an Anthropic support case.
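A rough first pass at that question is to classify the host your client actually called. The sketch below is a heuristic under stated assumptions: the Bedrock and Vertex AI hostname patterns are typical endpoint shapes, not an exhaustive list, and a gateway can sit in front of any of them, so treat the result as a pointer to the right logs rather than proof.

```python
from urllib.parse import urlparse


def emitting_owner(url):
    """Guess which system accepted the request, based on the request hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host == "api.anthropic.com":
        return "anthropic"
    # Assumed pattern, e.g. bedrock-runtime.<region>.amazonaws.com
    if host.startswith("bedrock") and host.endswith(".amazonaws.com"):
        return "aws-bedrock"
    # Assumed pattern, e.g. <location>-aiplatform.googleapis.com
    if host.endswith("aiplatform.googleapis.com"):
        return "vertex-ai"
    return "gateway-or-unknown"
```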
Prevention
Build a small limiter before the next incident. Track requests, input tokens, output tokens, and failures by route owner. Use queues for bursts, prompt caching for repeated context, and per-model budgets for long-running jobs. Alert on approaching reset windows rather than only on hard failures. Keep provider and gateway limits in the same runbook so operators know which dashboard to open first.
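A small limiter does not need much code. The sketch below is a client-side sliding-window limiter that tracks requests and input tokens over one window; it is an illustration, not a mirror of how any provider counts usage. The class name, the limits, and the 60-second window are assumptions, and max_requests is expected to be at least 1.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Sliding-window limiter for request count and input tokens."""

    def __init__(self, max_requests, max_input_tokens, window_s=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_input_tokens = max_input_tokens
        self.window_s = window_s
        self.clock = clock
        self._events = deque()  # (timestamp, input_tokens) for recent requests

    def _prune(self, now):
        # Drop events that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_s:
            self._events.popleft()

    def wait_time(self, input_tokens):
        """Return 0.0 if the request fits now, else seconds until the oldest event expires."""
        if input_tokens > self.max_input_tokens:
            raise ValueError("request exceeds the whole window budget; split it")
        now = self.clock()
        self._prune(now)
        used = sum(tokens for _, tokens in self._events)
        fits = (len(self._events) < self.max_requests
                and used + input_tokens <= self.max_input_tokens)
        if fits:
            return 0.0
        # Earliest moment capacity can free up; callers should check again after waiting.
        return self._events[0][0] + self.window_s - now

    def record(self, input_tokens):
        """Register a request that was actually sent."""
        self._events.append((self.clock(), input_tokens))
```

Keeping one limiter instance per route owner (for example, in a dict keyed by provider and workspace) matches the advice above to track usage by owner rather than globally.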
Escalation packet
Escalate after route-specific checks fail and one same-route reproduction still returns the limit. Send the exact message, timestamp and timezone, request id if present, response headers, model, workspace or project, region, route owner, recent traffic change, current status-page result, and the smallest reproduction request. Do not send API keys, tokens, private user content, or speculative root-cause claims.
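Assembling the packet in code makes it harder to leak a credential by accident. The sketch below redacts credential-bearing headers and scrubs anything that looks like an API key before serializing. The header list and the sk- key pattern are assumptions about common formats, and the function name and field names are hypothetical; extend both lists for your own stack.

```python
import json
import re

# Assumed set of credential-bearing headers; extend for your gateway or provider.
SECRET_HEADERS = {"x-api-key", "authorization", "cookie", "set-cookie"}
# Assumed key shape (an "sk-" prefix); adjust if your credentials look different.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_\-]{8,}")


def build_packet(message, timestamp, headers, **fields):
    """Return a JSON escalation packet with secrets redacted."""
    safe_headers = {
        name.lower(): "[REDACTED]" if name.lower() in SECRET_HEADERS else value
        for name, value in headers.items()
    }
    packet = {"message": message, "timestamp": timestamp, "headers": safe_headers, **fields}
    text = json.dumps(packet, indent=2, sort_keys=True)
    # Second pass: scrub key-shaped strings that slipped into any other field.
    return KEY_PATTERN.sub("[REDACTED]", text)
```

Pass the remaining items (model, workspace or project, region, route owner, recent traffic change) as keyword fields, and review the output by eye before sending; pattern-based redaction does not catch private user content.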

FAQ
Is "Claude API Error: Rate limit reached" always a direct API 429?
No. Direct Anthropic API 429 is the cleanest case, but Claude Code, providers, gateways, and burst controls can surface nearby wording. The owner decides the fix.
Should I rotate my API key?
Not first. Rotating a key before route proof can hide the original evidence and may move the request to a different owner. Confirm the active route, then decide whether key scope is actually involved.
Why does my usage page still show capacity?
Because remaining monthly spend or plan usage is not the same as a rolling RPM, input-token, output-token, burst, or provider quota window.
What should I do if Claude Status is green?
Treat status as one signal, not the whole answer. If status is green, continue route proof: headers for direct API, /status for Claude Code, provider dashboards for Bedrock or Vertex AI, and gateway logs for proxy traffic.
When should I contact support?
Contact the owner only after one same-route reproduction still fails and you have the escalation packet. Good evidence gets a faster answer than a long narrative.
