Configure Helicone to automatically retry failed LLM requests, overcoming rate limits and server issues using intelligent exponential backoff.
429
(Too Many Requests), 500
(Internal Server Error), or 503
(Service Unavailable).
Helicone-Retry-Enabled: true
header to your requests:
"5"
for up to 5 retries"2"
doubles the wait time after each attempt"1000"
for 1 second minimum wait"10000"
caps wait time at 10 seconds"Helicone-Retry-Num": "3"
not "Helicone-Retry-Num": 3
Need more help?