Webhooks: Real-Time LLM Integration & Automation
Leverage Helicone’s powerful webhook system to automate your LLM workflows. Instantly react to events, trigger actions, and integrate with external tools for enhanced AI observability and management. Perfect for developers building robust LLM applications.
March 2025 Update: We’ve enhanced our webhook implementation to provide a unified request_response_url field that contains both request and response data in a single object. This improves performance and simplifies data retrieval.
Top use cases
- Scoring: Score requests based on custom logic.
- Data ETL: Move data from one system to another.
- Automations / Alerts: Trigger actions automatically, such as sending a Slack notification or triggering a webhook to an external tool.
Setting up webhooks
Head over to the webhooks page to set up a webhook.
Webhook configuration UI
Add the webhook URL and select the events you want to trigger on.
Copy the HMAC key and add it to your webhook handler’s environment so you can validate the signature of each webhook request.
Configure your webhook route
For startups, we recommend Cloudflare Workers or Vercel Edge Functions for webhooks; they are simple to set up and scale very well.
We have a prebuilt Cloudflare worker that you can use as a starting point.
The webhook endpoint is a POST route that accepts the following JSON body:
POST /webhook
The body of the request will contain the following fields:
- request_id: The request ID of the request that triggered the webhook.
- user_id: The identifier of the user who made the request (if available).
- request_body: The body of the request that triggered the webhook.
- response_body: The body of the response that triggered the webhook.
- request_response_url: The URL to fetch the full request and response data.
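The fields listed above can be captured in a type and checked at the edge of your handler. The following is a minimal sketch: the field names come from the list, but which fields are optional and the `unknown` typing of the bodies are assumptions, and `parseWebhookPayload` is a hypothetical helper name.

```typescript
// Field names follow the documented list; optionality and types are assumptions.
interface HeliconeWebhookPayload {
  request_id: string;
  user_id?: string;
  request_body: unknown;
  response_body: unknown;
  request_response_url?: string;
}

// Parse a raw JSON body and require the one field every payload is expected to carry.
function parseWebhookPayload(rawBody: string): HeliconeWebhookPayload {
  const data: unknown = JSON.parse(rawBody);
  if (typeof data !== "object" || data === null) {
    throw new Error("Webhook body must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const requestId = record["request_id"];
  const userId = record["user_id"];
  const url = record["request_response_url"];
  if (typeof requestId !== "string") {
    throw new Error("Webhook body is missing request_id");
  }
  return {
    request_id: requestId,
    user_id: typeof userId === "string" ? userId : undefined,
    request_body: record["request_body"],
    response_body: record["response_body"],
    request_response_url: typeof url === "string" ? url : undefined,
  };
}
```

Validating once at the boundary means the rest of the handler can rely on `request_id` being present.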
Example - NextJS
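A minimal, framework-agnostic sketch of the handler flow is shown below. It assumes the signature arrives in a request header and that verification is supplied as a callback. The names `handleWebhook` and `SignatureVerifier`, and the `helicone-signature` header mentioned in the closing comment, are assumptions for illustration rather than confirmed Helicone API.

```typescript
interface WebhookResult {
  status: number;
  body: string;
}

// Hypothetical callback: returns true when the signature matches the raw body.
type SignatureVerifier = (rawBody: string, signature: string | null) => boolean;

function handleWebhook(
  method: string,
  rawBody: string,
  signature: string | null,
  verify: SignatureVerifier,
): WebhookResult {
  if (method !== "POST") return { status: 405, body: "Method not allowed" };
  if (!verify(rawBody, signature)) return { status: 401, body: "Invalid signature" };

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: "Body is not valid JSON" };
  }
  const requestId =
    typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)["request_id"]
      : undefined;
  if (typeof requestId !== "string") {
    return { status: 400, body: "Missing request_id" };
  }
  // Application logic (scoring, ETL, alerts) would go here.
  return { status: 200, body: `Received ${requestId}` };
}

// In a Next.js App Router file such as app/api/webhook/route.ts, this could be
// wired up roughly as follows (the header name is an assumption):
//
//   export async function POST(req: Request) {
//     const raw = await req.text();
//     const sig = req.headers.get("helicone-signature");
//     const result = handleWebhook("POST", raw, sig, verify);
//     return new Response(result.body, { status: result.status });
//   }
```

Keeping the core logic free of framework types makes it straightforward to reuse in a Cloudflare Worker or to unit test.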
Webhook Configuration
When setting up a webhook, you can configure the following options:
- Destination URL: The URL where webhook payloads will be sent.
- Sample Rate: Control what percentage of requests trigger webhooks (0-100%).
- Property Filters: Only send webhooks for requests with specific properties.
- Include Data: Toggle whether to include additional data like costs and S3 links (enabled by default).
Standard Webhook Payload
Without the includeData option, webhooks send a minimal payload with just the request ID and truncated request/response bodies:
Enhanced Webhook Payload
When the includeData option is enabled, webhooks include additional useful information:
The enhanced payload provides:
- User Identifier: The user_id field helps track which user or entity made the request (only included when explicitly set in the original request)
- Combined S3 URL: A single URL that provides access to both the complete request and response data
- Model Information: The model and provider used
- Cost Data: Calculated cost based on token usage
- Token Counts: Prompt, completion, and total token counts
- Latency: Request-to-response time in milliseconds
Metadata Fields Explained
The metadata object contains valuable information about the request:
- cost: Estimated cost of the request in USD, calculated based on the model’s pricing and token usage
- promptTokens: Number of tokens in the prompt/request
- completionTokens: Number of tokens in the completion/response
- totalTokens: Total number of tokens used in the request (promptTokens + completionTokens)
- latencyMs: Time taken to process the request in milliseconds
These metadata fields are particularly useful for:
- Cost tracking and budget management
- Performance monitoring and optimization
- Usage analytics and reporting
- Identifying potential issues with specific requests
This additional data makes it easier to track costs and analyze performance without making additional API calls.
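As a rough illustration, the metadata fields can be typed and turned into a few derived figures. The field names follow the list above; typing every field as a number and the helper name `summarizeMetadata` are assumptions.

```typescript
// Field names follow the documented metadata list; numeric typing is an assumption.
interface WebhookMetadata {
  cost: number; // USD
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  latencyMs: number;
}

// Derive simple per-request figures from one metadata object.
function summarizeMetadata(meta: WebhookMetadata) {
  const seconds = meta.latencyMs / 1000;
  return {
    // totalTokens is documented as promptTokens + completionTokens.
    tokensConsistent: meta.totalTokens === meta.promptTokens + meta.completionTokens,
    costPer1kTokens: meta.totalTokens > 0 ? (meta.cost / meta.totalTokens) * 1000 : 0,
    completionTokensPerSecond: seconds > 0 ? meta.completionTokens / seconds : 0,
  };
}
```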
Working with the Combined Request/Response URL
The request_response_url field provides a pre-signed S3 URL that points to a single JSON object containing both the complete request and response data. This approach offers several advantages:
- Complete Data Access: Get the full, untruncated request and response data, including all fields and metadata.
- Single Request: Retrieve both request and response with a single HTTP call.
- Structured Format: The data is returned in a structured JSON format that’s easy to parse and process.
Example: Fetching and Processing the Combined Data
Here’s how to fetch and process the data from the request_response_url:
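The sketch below assumes the URL returns a JSON object with top-level `request` and `response` keys; that shape, the `CombinedData` type, and the `fetchCombinedData` helper are assumptions rather than a documented contract. The HTTP call is passed in as a function so the logic can be exercised without network access.

```typescript
// Assumed shape of the combined object; verify against a real payload.
interface CombinedData {
  request: unknown;
  response: unknown;
}

// Minimal subset of a fetch-like response that this sketch relies on.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

async function fetchCombinedData(url: string, fetcher: Fetcher): Promise<CombinedData> {
  const res = await fetcher(url);
  if (!res.ok) {
    // Expired pre-signed URLs typically fail here with a 4xx status.
    throw new Error(`Fetching request/response data failed with status ${res.status}`);
  }
  const data = await res.json();
  if (typeof data !== "object" || data === null) {
    throw new Error("Combined data is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  if (!("request" in record) || !("response" in record)) {
    throw new Error("Combined data is missing request or response");
  }
  return { request: record["request"], response: record["response"] };
}
```

In a runtime with a global `fetch`, the call would look like `fetchCombinedData(url, (u) => fetch(u))`.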
Common Use Cases
The combined request/response data is particularly useful for:
- Advanced Analytics: Analyze the full request and response to extract insights about your LLM usage.
- Cost Tracking: Access detailed token usage information to track costs across different models and requests.
- Quality Monitoring: Evaluate the quality of responses based on the complete context of the request.
- Data Archiving: Store the complete interaction data for compliance or historical analysis.
Webhook Security
Securing your webhook implementation is critical to protect sensitive data and prevent unauthorized access. Follow these best practices:
Signature Verification
Always validate the webhook signature to ensure requests are coming from Helicone:
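A minimal verification sketch follows. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body, keyed with the HMAC key from the dashboard; the algorithm, the encoding, and the helper names are assumptions to confirm against your own webhook deliveries.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: hex-encoded HMAC-SHA256 over the raw request body.
function computeSignature(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function isValidSignature(
  rawBody: string,
  signature: string | null | undefined,
  secret: string,
): boolean {
  if (!signature) return false;
  const expected = Buffer.from(computeSignature(rawBody, secret), "utf8");
  const received = Buffer.from(signature, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Comparing with `timingSafeEqual` instead of `===` avoids leaking how many leading characters of a forged signature were correct.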
Secret Management
- Store your webhook secret securely using environment variables or a secrets manager
- Never hardcode the secret in your application code
- Rotate the webhook secret periodically for enhanced security
HTTPS Only
- Only use HTTPS endpoints for your webhooks
- Configure proper TLS/SSL settings on your server
- Ensure certificates are valid and up-to-date
Rate Limiting
Implement rate limiting on your webhook endpoint to protect against potential abuse:
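One simple option is a fixed-window counter keyed by caller, sketched below. It is an in-memory illustration only: the class name, the choice of key (for example, the caller's IP), and the limits are assumptions, and a deployment with multiple instances would need a shared store.

```typescript
// Fixed-window rate limiter: at most `limit` requests per key per window.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected (e.g. 429).
  allow(key: string, now: number = Date.now()): boolean {
    let current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // Start a fresh window for this key.
      current = { start: now, count: 0 };
      this.windows.set(key, current);
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```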
Access Control
- Restrict access to your webhook processing logic
- Implement proper authentication for any systems that access webhook data
- Use the principle of least privilege for services processing webhook data
Logging and Monitoring
- Log all webhook requests (excluding sensitive data)
- Monitor for unusual patterns or failed signature verifications
- Set up alerts for potential security incidents
By implementing these security measures, you can ensure your webhook integration remains secure and reliable.
Troubleshooting Webhooks
Common Issues
- Missing or Invalid Signature
  - Ensure you’re using the correct HMAC key provided in the Helicone dashboard.
  - Verify that you’re calculating the signature using the entire request body.
- URL Expiration
  - The request_response_url is a pre-signed URL that expires after 30 minutes. Make sure to fetch the data promptly after receiving the webhook.
- Large Payloads
  - Remember that request and response bodies in the webhook payload are truncated if they exceed 10KB. Always use the request_response_url for complete data.
- Webhook Timeouts
  - Webhook delivery will time out after 2 minutes. Ensure your endpoint responds quickly, and consider using a queue for processing long-running tasks.
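To respond quickly while still doing slow work, a handler can enqueue the work and acknowledge the webhook without waiting for it to finish. The sketch below uses a hypothetical in-memory `BackgroundQueue`; it is an assumption-laden illustration, and serverless or multi-instance deployments would usually need a durable queue instead.

```typescript
type Job = () => Promise<void>;

// Runs jobs one at a time in the background; failures are collected, not thrown.
class BackgroundQueue {
  readonly failures: unknown[] = [];
  private readonly jobs: Job[] = [];
  private draining: Promise<void> | null = null;

  // Does not wait for the job to finish, so the caller can respond right away.
  enqueue(job: Job): void {
    this.jobs.push(job);
    if (!this.draining) this.draining = this.drain();
  }

  // Resolves once every queued job has finished.
  onIdle(): Promise<void> {
    return this.draining ?? Promise.resolve();
  }

  private async drain(): Promise<void> {
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      if (!job) break;
      try {
        await job();
      } catch (err) {
        this.failures.push(err);
      }
    }
    this.draining = null;
  }
}
```

A handler would call `queue.enqueue(() => processPayload(payload))` and return a 200 response immediately, where `processPayload` stands in for your own long-running logic.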
Debugging Tips
- Local Testing
  - Use tools like ngrok or Cloudflare Tunnel to expose your local development server for webhook testing.
- Logging
  - Implement comprehensive logging in your webhook handler to track received payloads and any processing errors.
- Retry Logic
  - Consider implementing retry logic in your webhook consumer to handle temporary failures when fetching the request_response_url data.
- Webhook Monitoring
  - Monitor webhook deliveries in the Helicone dashboard to identify any patterns of failures or issues.
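The retry tip above can be sketched as a small helper with exponential backoff. The name `withRetry`, the default of three attempts, and the 200 ms base delay are assumptions chosen for illustration; keep the total retry time well inside the URL's expiry window.

```typescript
const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `operation` with exponential backoff: baseDelayMs, 2x, 4x, ...
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 200,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  throw lastError;
}
```

The `sleep` parameter exists so tests can substitute a stub instead of waiting on real timers.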
Using Webhook Metadata for Analytics
The metadata included in webhook payloads provides valuable insights for monitoring and analyzing your LLM usage. Here are some common ways to leverage this data:
Cost Tracking
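A minimal sketch of cost tracking: accumulate the cost value from each webhook per model and flag when a budget is exceeded. The `CostTracker` class, the budget threshold, and passing the model name as a plain string are assumptions for illustration.

```typescript
// Accumulates cost per model and reports when the overall budget is exceeded.
class CostTracker {
  private readonly totals = new Map<string, number>();
  private readonly budgetUsd: number;

  constructor(budgetUsd: number) {
    this.budgetUsd = budgetUsd;
  }

  // Returns true once the running total across all models exceeds the budget.
  record(model: string, costUsd: number): boolean {
    this.totals.set(model, this.costFor(model) + costUsd);
    return this.total() > this.budgetUsd;
  }

  costFor(model: string): number {
    return this.totals.get(model) ?? 0;
  }

  total(): number {
    let sum = 0;
    this.totals.forEach((value) => {
      sum += value;
    });
    return sum;
  }
}
```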
Performance Monitoring
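For performance monitoring, one option is to collect latencyMs values and compute percentiles or flag slow requests. The sketch below uses the nearest-rank percentile method; the helper names and the sample shape are assumptions.

```typescript
// Nearest-rank percentile over a list of latencies (p between 0 and 100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("No samples recorded");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

interface LatencySample {
  requestId: string;
  latencyMs: number;
}

// Returns the IDs of requests slower than the given threshold.
function findSlowRequests(samples: LatencySample[], thresholdMs: number): string[] {
  return samples.filter((s) => s.latencyMs > thresholdMs).map((s) => s.requestId);
}
```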
Token Usage Analysis
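Token counts from successive webhooks can be rolled up into averages and ratios. This is a small sketch; `aggregateTokens` and the `TokenUsage` shape are hypothetical names built on the promptTokens, completionTokens, and totalTokens metadata fields.

```typescript
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Sums token counts across requests and derives simple ratios.
function aggregateTokens(usages: TokenUsage[]) {
  let promptTokens = 0;
  let completionTokens = 0;
  let totalTokens = 0;
  for (const usage of usages) {
    promptTokens += usage.promptTokens;
    completionTokens += usage.completionTokens;
    totalTokens += usage.totalTokens;
  }
  return {
    requests: usages.length,
    promptTokens,
    completionTokens,
    totalTokens,
    averageTotalTokens: usages.length > 0 ? totalTokens / usages.length : 0,
    // Share of all tokens that were generated rather than sent.
    completionShare: totalTokens > 0 ? completionTokens / totalTokens : 0,
  };
}
```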
By integrating these analytics into your webhook handler, you can gain real-time insights into your LLM usage patterns, costs, and performance metrics.
User Tracking with user_id
The user_id field in webhook payloads enables user-specific analytics and monitoring.
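As a rough sketch of per-user analytics, a handler could keep running totals keyed by user_id. The `recordUserUsage` helper and the "anonymous" bucket for payloads without a user_id are assumptions for illustration.

```typescript
interface UserUsage {
  requests: number;
  totalCostUsd: number;
}

// Adds one request to the running totals for a user.
// Payloads without a user_id are grouped under "anonymous" (an assumption).
function recordUserUsage(
  usage: Map<string, UserUsage>,
  userId: string | undefined,
  costUsd: number,
): void {
  const key = userId ?? "anonymous";
  const current = usage.get(key) ?? { requests: 0, totalCostUsd: 0 };
  usage.set(key, {
    requests: current.requests + 1,
    totalCostUsd: current.totalCostUsd + costUsd,
  });
}
```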
Setting the user_id
To ensure the user_id is included in your webhook payloads:
- When making requests through Helicone, include the Helicone-User-Id header:
- Alternatively, include the user ID in your request properties:
Important: The user_id field will only be included in webhook payloads when it has been explicitly set using one of the methods above. If no user ID is provided in the original request, this field will be omitted from the webhook payload.
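For the header-based method, the request headers could be assembled as in the sketch below. The Helicone-User-Id header name comes from the text above; the `Helicone-Auth` bearer header and the `buildHeliconeHeaders` helper are assumptions about the rest of the request setup.

```typescript
// Builds the extra headers to send with a request routed through Helicone.
// "Helicone-Auth" with a Bearer token is an assumption; check your integration guide.
function buildHeliconeHeaders(
  heliconeApiKey: string,
  userId?: string,
): Record<string, string> {
  const headers: Record<string, string> = {
    "Helicone-Auth": `Bearer ${heliconeApiKey}`,
  };
  // Only set the user header when an ID is known, so user_id is
  // omitted from the webhook payload rather than sent empty.
  if (userId) headers["Helicone-User-Id"] = userId;
  return headers;
}
```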
The user_id field makes it possible to build user-specific analytics, implement per-user rate limiting, and track usage patterns across your application.
Fetching Historical Requests
In addition to receiving real-time webhooks, you may need to access historical request data. Helicone provides an API for retrieving past requests.
Using the Helicone API
You can fetch historical requests using the Helicone API:
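The sketch below only builds the HTTP request, so the parts that depend on Helicone's API are easy to spot and adjust. The endpoint URL, the bearer authorization scheme, and the `filter`, `limit`, and `offset` body fields are all assumptions; confirm them against the current Helicone API reference before use.

```typescript
interface HistoryQuery {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Builds a paginated query for past requests.
// Endpoint and body fields are assumptions, not a confirmed contract.
function buildHistoryQuery(apiKey: string, limit: number, offset: number): HistoryQuery {
  return {
    url: "https://api.helicone.ai/v1/request/query",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ filter: "all", limit, offset }),
    },
  };
}
```

With a global `fetch`, the query would be sent as `fetch(query.url, query.init)`, increasing `offset` by `limit` to page through results.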
Common Use Cases for Historical Data
- Audit Trails: Create comprehensive audit logs of all LLM interactions for compliance purposes.
- Usage Reports: Generate periodic reports on usage patterns and costs.
- Training Data Collection: Gather high-quality examples for fine-tuning models.
- Retroactive Analysis: Analyze past interactions to identify patterns or issues.
Combining Webhooks and Historical Data
For a complete solution, you can use webhooks for real-time processing and the API for historical data:
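One way to combine the two sources is a store keyed by request ID, so that a backfill from the API never duplicates a request already delivered by webhook. The `RequestStore` class below is a hypothetical in-memory sketch; a real system would likely use a database with a unique key on the request ID.

```typescript
type RequestSource = "webhook" | "backfill";

// Deduplicates requests by ID across real-time and historical sources.
class RequestStore {
  private readonly byId = new Map<string, RequestSource>();

  // Returns true if the request was new, false if it was already stored.
  add(requestId: string, source: RequestSource): boolean {
    if (this.byId.has(requestId)) return false;
    this.byId.set(requestId, source);
    return true;
  }

  // Adds historical IDs, skipping ones already seen; returns how many were new.
  backfill(requestIds: string[]): number {
    let added = 0;
    for (const id of requestIds) {
      if (this.add(id, "backfill")) added += 1;
    }
    return added;
  }

  sourceOf(requestId: string): RequestSource | undefined {
    return this.byId.get(requestId);
  }

  size(): number {
    return this.byId.size;
  }
}
```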
This dual approach ensures you have both immediate access to new data and the ability to analyze historical trends.