anthropic-ratelimit-* response headers and the official type field on every error body — that one habit collapses 80% of "Claude API is failing" tickets into a 5-minute fix. Use current model IDs (claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5-20251001), enable prompt caching on system prompts >1024 tokens for ~90% cost cuts, and never call the API from the browser — proxy server-side.If your Claude API calls are returning errors, hanging, or silently dropping — here's the diagnostic checklist an AI integration dev actually uses in production.
x-api-key, not Authorization: Bearer (that's OpenAI). You also need anthropic-version: 2023-06-01 and content-type: application/json. Missing the version header is the #1 silent killer I see on client sites.retry-after header and batch long prompts instead of parallel-firing them.stream: true with proper reader.releaseLock() cleanup, or move the call to a queued background worker. CORS will also fail silently if you're calling from the browser — never do that; always proxy server-side.max_tokens too low (response gets truncated mid-JSON) or passing the full conversation on every turn without prompt caching. Enable cache_control: {"type":"ephemeral"} on system prompts >1024 tokens for a 90% cost cut on repeat calls./v1/messages with curl + a known-good key. If curl works, it's your code.type field.anthropic-ratelimit-* response headers for tier limits.claude-sonnet-4-5, not a deprecated alias).invalid_request_error — malformed JSON or bad paramsauthentication_error — key is wrong, revoked, or org mismatchpermission_error — your workspace can't access that modelrate_limit_error — slow down, check retry-afterapi_error — Anthropic's side, retry with backoffoverloaded_error — capacity issue, retry in 2–10s$100/hr · No retainer · North County San Diego · Remote anywhere
Text 858-461-8054Same operator, adjacent surfaces. If you came in on a Claude debug query, these are the next 2-minute scans worth your time:
Pin a dated ID. As of May 2026 the live IDs are claude-opus-4-7, claude-sonnet-4-6, and claude-haiku-4-5-20251001. Older aliases like claude-3-5-sonnet-latest and claude-3-sonnet-20240229 will return model_not_found. Avoid -latest aliases for billing and behavior predictability.
Every response includes anthropic-ratelimit-requests-remaining, anthropic-ratelimit-input-tokens-remaining, anthropic-ratelimit-output-tokens-remaining, and on 429 a retry-after in seconds. Log all four in production. Tier-1 default is roughly 50 RPM and 40K ITPM on Sonnet — a single 30K-token doc trips ITPM even with one request.
Add cache_control: {"type":"ephemeral"} to system-prompt or large content blocks >1024 tokens. Cache writes cost 1.25x normal input, but cache reads cost 0.1x — so any prompt reused 2+ times within 5 minutes is a net win. Multi-turn agents and RAG pipelines see 80-95% input-cost reduction. Pin the model and the cache key — model ID change invalidates cache.
invalid_request_error (malformed JSON / params), authentication_error (key wrong / revoked / org mismatch), permission_error (workspace can't access that model), rate_limit_error (slow down, check retry-after), api_error (Anthropic side, retry with backoff), overloaded_error (capacity, retry in 2-10s with jitter). The type field is in the response body — log it, don't just log the HTTP status.
No — Anthropic blocks browser-origin requests by default for security (your API key would be exposed in DevTools network tab). Always proxy through a server: Next.js API route, Cloudflare Worker, Vercel edge function, Express endpoint. The server holds the key, the browser only talks to your server. This is the same pattern as every other LLM API.
Don't see what you were looking for?
Text PJ a sentence about what you actually need — I'll build you a free custom shareable on the house. No email, no funnel, no SOW.
📲 Text PJ — free shareable