Configure Helicone to automatically retry failed LLM requests, overcoming rate limits and server issues using intelligent exponential backoff.
Helicone retries requests that fail with status codes 429 (Too Many Requests) and 500 (Internal Server Error).
For more information on error codes, see the OpenAI API error codes documentation.
Learn About Exponential Backoff
To enable retries, set Helicone-Retry-Enabled to true.
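As a minimal sketch of what this looks like in practice, the Python snippet below adds the header to an existing set of request headers. The helper name `with_retries` and the placeholder `Authorization` value are assumptions made for illustration; only the `Helicone-Retry-Enabled` header comes from this page.

```python
def with_retries(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the request headers with Helicone retries switched on."""
    merged = dict(headers)  # copy so the caller's dict is not mutated
    merged["Helicone-Retry-Enabled"] = "true"  # the value is the string "true"
    return merged


# Hypothetical base headers; the placeholder is not a real credential.
base_headers = {"Authorization": "Bearer <YOUR_API_KEY>"}
request_headers = with_retries(base_headers)
```

The resulting `request_headers` dict can be passed to whichever HTTP client or SDK sends the request through Helicone.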
Parameter | Description | Default Value |
---|---|---|
helicone-retry-num | Number of retries | 5 |
helicone-retry-factor | Exponential backoff factor used to increase the wait time between successive retries | 2 |
helicone-retry-min-timeout | Minimum timeout (in milliseconds) between retries | 1000 |
helicone-retry-max-timeout | Maximum timeout (in milliseconds) between retries | 10000 |
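To get a feel for how these parameters interact, the following Python sketch computes a wait schedule from them. It assumes the common exponential backoff formula, wait = min(min_timeout × factor^attempt, max_timeout); the exact computation Helicone performs (for example, whether it adds random jitter) is not stated on this page, so treat the numbers as illustrative. The function name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(
    num: int = 5,
    factor: float = 2,
    min_timeout_ms: int = 1000,
    max_timeout_ms: int = 10000,
) -> list[float]:
    """Wait time in milliseconds before each retry, capped at max_timeout_ms.

    Assumed formula: min(min_timeout_ms * factor ** attempt, max_timeout_ms).
    """
    return [
        min(min_timeout_ms * factor ** attempt, max_timeout_ms)
        for attempt in range(num)
    ]


default_waits = backoff_schedule()
total_wait_ms = sum(default_waits)
```

Under this assumption, the default values give waits of 1000, 2000, 4000, 8000, and 10000 ms: the fifth wait would be 16000 ms but is capped by the maximum timeout.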
"helicone-retry-num": "3"
.Need more help?
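Since header values are sent as strings, numeric settings need converting before they go into the request. The Python sketch below shows one way to do that; the helper `retry_headers` and its keyword names are hypothetical, while the header names are the ones listed in the table above.

```python
from typing import Optional


def retry_headers(
    num: Optional[int] = None,
    factor: Optional[float] = None,
    min_timeout_ms: Optional[int] = None,
    max_timeout_ms: Optional[int] = None,
) -> dict[str, str]:
    """Build retry headers, converting every overridden setting to a string."""
    overrides = {
        "helicone-retry-num": num,
        "helicone-retry-factor": factor,
        "helicone-retry-min-timeout": min_timeout_ms,
        "helicone-retry-max-timeout": max_timeout_ms,
    }
    headers = {"Helicone-Retry-Enabled": "true"}
    # Settings left as None are omitted so the documented defaults apply.
    headers.update(
        {name: str(value) for name, value in overrides.items() if value is not None}
    )
    return headers


three_retries = retry_headers(num=3)
```

Here `three_retries` contains `Helicone-Retry-Enabled` set to `"true"` and `helicone-retry-num` set to `"3"`, with the other parameters left at their defaults.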