Monitor error rates and costs to catch issues before they impact users. Helicone’s alert system provides real-time notifications when your LLM applications experience problems, helping you maintain reliability and control costs.

Alert Types

Error Rate Alerts

Track the percentage of failed requests over a time window. Essential for maintaining application reliability. Use cases:
  • Detect provider outages or rate limiting issues
  • Catch breaking changes in prompts or model behavior
  • Monitor deployment health after updates
  • Identify patterns in user inputs causing failures
Example: Alert when error rate exceeds 5% over 30 minutes with at least 20 requests.
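
To make the mechanics concrete, here is a minimal TypeScript sketch of how such a check could be evaluated over a rolling window. The record shape and function are illustrative assumptions, not Helicone's internal implementation; note how the minimum-request guard is applied before the percentage is computed.

// Illustrative evaluation of an error-rate alert (not Helicone's actual code).
interface RequestRecord {
  timestamp: number; // ms since epoch
  status: number;    // HTTP status code
  timedOut: boolean;
}

// Returns true only when BOTH the error-rate threshold and the
// minimum-request guard are satisfied within the window.
function shouldTriggerErrorAlert(
  requests: RequestRecord[],
  windowMs: number,     // e.g. 30 * 60 * 1000 (30 minutes)
  thresholdPct: number, // e.g. 5
  minRequests: number,  // e.g. 20
  now: number = Date.now(),
): boolean {
  const inWindow = requests.filter((r) => now - r.timestamp <= windowMs);
  if (inWindow.length < minRequests) return false; // low-traffic guard
  const failed = inWindow.filter((r) => r.status >= 400 || r.timedOut).length;
  return (failed / inWindow.length) * 100 > thresholdPct;
}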

Cost Alerts

Monitor spending to prevent budget overruns and detect unusual usage patterns. Use cases:
  • Prevent unexpected bills from runaway processes
  • Track per-environment spending (dev/staging/prod)
  • Detect potential abuse or misconfiguration
  • Monitor cost trends for specific features or users
Example: Alert when daily spending exceeds $1000 or when hourly spending exceeds $100.
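
The same idea applies to spend: sum the cost of requests inside the window and compare against the threshold. The sketch below is an illustrative assumption of that logic, not Helicone's implementation.

// Illustrative evaluation of a cost alert (not Helicone's actual code).
interface CostRecord {
  timestamp: number; // ms since epoch
  costUsd: number;   // cost of a single request, in USD
}

function shouldTriggerCostAlert(
  costs: CostRecord[],
  windowMs: number,     // e.g. 24 * 60 * 60 * 1000 (daily)
  thresholdUsd: number, // e.g. 1000
  now: number = Date.now(),
): boolean {
  const spend = costs
    .filter((c) => now - c.timestamp <= windowMs)
    .reduce((sum, c) => sum + c.costUsd, 0);
  return spend > thresholdUsd;
}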

Creating Alerts

Navigate to Settings → Alerts in your Helicone dashboard to create new alerts.
[Screenshot: Helicone Alerts Dashboard showing configured alerts and their status]

1. Select alert type and threshold

[Screenshot: Creating a new alert in Helicone]

Error Rate Alerts:
  • Percentage threshold: 1-100% (5-10% recommended for production)
  • Tracks ratio of failed requests to total requests
  • Failed requests include 4xx/5xx errors and timeouts
Cost Alerts:
  • Dollar amount threshold (e.g., $50, $500, $1000)
  • Tracks cumulative spend within time window
  • Includes all model costs across providers

2. Configure time window

Choose how long to evaluate the metric:
  • 5-15 minutes: Immediate detection, higher false positive rate
  • 30-60 minutes: Balanced approach (recommended for most apps)
  • 2-4 hours: Sustained issues only, fewer false positives
  • Daily/Weekly: Budget tracking and long-term trends
Shorter windows detect issues faster but may trigger during brief spikes. Longer windows reduce noise but delay detection.

3. Set minimum request threshold

Prevent false positives during low traffic periods:
  • Development: 5-10 requests minimum
  • Staging: 10-20 requests minimum
  • Production: 20-50 requests minimum
Alerts only trigger when both the threshold AND minimum requests are met.
Always set a minimum request count to avoid alert fatigue. A single failed request during low traffic shouldn’t trigger a 100% error rate alert.

4. Configure notifications

Choose where alerts are sent:
  • Email: Add any email address (immediate delivery)
  • Slack: Select connected channels (#alerts, #engineering, etc.)
  • Multiple recipients: Add several emails or channels per alert
Start with conservative thresholds (higher error %, longer windows) and tighten based on actual patterns. This prevents alert fatigue while you learn your app’s normal behavior.
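
Putting the four steps together, each alert boils down to a single configuration record. The shape below is a hypothetical TypeScript model of those settings (field names are illustrative; in practice the dashboard stores them for you):

interface AlertConfig {
  metric: "error_rate" | "cost";
  threshold: number;    // percent for error_rate, USD for cost
  timeWindowMs: number; // evaluation window from step 2
  minRequests?: number; // step 3 guard; error-rate alerts only
  notify: string[];     // emails and/or Slack channels from step 4
}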
[Screenshot: Example of a configured cost alert]

Notification Channels

Dashboard

All alerts appear in your Helicone dashboard with real-time status updates. When an alert triggers, you can immediately see affected requests and investigate the issue.
[Screenshot: Alert triggered view in the dashboard]

Email Notifications

Add any email address to receive alerts. Emails include:
  • Alert type and threshold that triggered
  • Current metric value and trend
  • Direct link to affected requests in dashboard
  • Time window and request count
[Screenshot: Example alert notification email]

Slack Integration

When creating or editing an alert:
  1. Select Slack as the notification method
  2. Click Connect Slack button that appears
  3. Authorize Helicone in your Slack workspace
  4. Select a channel from the dropdown (#alerts, #engineering, etc.)
After connecting, you can simply select any channel from your workspace. Slack messages include the same details as emails with rich formatting and direct links to view affected requests.

Configuration Examples

Production Monitoring

# Critical error detection
metric: error_rate
threshold: 10%
time_window: 10min
min_requests: 20
notify: [#incidents, oncall@company.com]

# Sustained error monitoring  
metric: error_rate
threshold: 5%
time_window: 30min
min_requests: 50
notify: #engineering

# Daily cost tracking
metric: cost
threshold: $1000
time_window: 24h
notify: [#finance, cto@company.com]

Development Environment

# Loose thresholds for dev
metric: error_rate
threshold: 25%
time_window: 60min
min_requests: 5
notify: dev-team@company.com

# Weekly budget check
metric: cost
threshold: $100
time_window: 7d
notify: #dev-alerts

Advanced Features (Coming Soon)

Soon you’ll be able to create highly customizable alerts:
  • Custom aggregations - Alert on any metric (P95 latency, token usage, specific error codes)
  • Advanced filters - Combine multiple custom properties with AND/OR logic
  • Complex thresholds - Percentage changes, rolling averages, anomaly detection
  • Custom webhooks - Send alerts to any endpoint
These features will enable precise monitoring for specific user segments, features, or any custom criteria you define.
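
Since custom webhooks are not yet released, the payload schema is unpublished; as a rough sketch of what a receiving endpoint might look like, here is a minimal Node HTTP server in TypeScript. The route, port, and payload fields (metric, currentValue, threshold) are all assumptions for illustration.

// Hypothetical webhook receiver for Helicone alerts (payload shape assumed).
import { createServer } from "node:http";

createServer((req, res) => {
  if (req.method === "POST" && req.url === "/helicone-alerts") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const alert = JSON.parse(body); // schema unpublished; fields assumed
      console.log(
        `Alert: ${alert.metric} hit ${alert.currentValue} (threshold ${alert.threshold})`,
      );
      res.writeHead(200).end("ok");
    });
  } else {
    res.writeHead(404).end();
  }
}).listen(3000);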