Introduction
LiteLLM is a self-hosted, unified interface for calling LLM APIs.
Integration Steps
1. Create a Helicone account and generate an API key.
2. Install LiteLLM in your project.
3. Set your Helicone API key as an environment variable.
4. Add the helicone/ prefix to your model names (see the examples below).
Use LiteLLM with Helicone
Add the helicone/ prefix to any model name to log requests to Helicone:
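For example, a single completion call could look like the sketch below; the model name is an illustrative assumption, not a requirement:

```python
import litellm

# "gpt-4o-mini" is an example model name; the helicone/ prefix is what routes
# the request through Helicone so it is logged there.
response = litellm.completion(
    model="helicone/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)
print(response.choices[0].message.content)
```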
Complete Working Examples
Basic Completion
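Here is a minimal, non-streaming completion sketch; the model name, prompt, and the HELICONE_API_KEY environment variable are example values rather than required ones:

```python
import os
import litellm

# Illustrative setup: assumes your Helicone API key is exported as
# HELICONE_API_KEY before this script runs.
assert os.environ.get("HELICONE_API_KEY"), "set HELICONE_API_KEY first"

response = litellm.completion(
    model="helicone/gpt-4o-mini",  # example model; any supported model works
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what LiteLLM does in one sentence."},
    ],
)

# LiteLLM returns an OpenAI-style response object.
print(response.choices[0].message.content)
print(response.usage)  # token usage, if the provider reports it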
Streaming Responses
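Streaming works through LiteLLM's stream=True flag, which yields response chunks as they arrive; the sketch below reuses the same example model name:

```python
import litellm

# With stream=True, litellm.completion returns an iterator of chunks in the
# OpenAI delta format.
stream = litellm.completion(
    model="helicone/gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Write a haiku about observability."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # a chunk's delta content may be None (e.g. the final chunk)
        print(delta, end="", flush=True)
print()
```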
Custom Properties and Session Tracking
Add metadata to track and filter your requests:
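One way to attach this metadata is with Helicone's Helicone-Property-* and Helicone-Session-* request headers, passed through LiteLLM's extra_headers argument; the property names and session values below are illustrative:

```python
import uuid
import litellm

session_id = str(uuid.uuid4())

response = litellm.completion(
    model="helicone/gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "What is my order status?"}],
    extra_headers={
        # Custom properties: arbitrary key/value pairs for filtering in Helicone.
        "Helicone-Property-Environment": "staging",
        "Helicone-Property-Feature": "order-support",
        # Session tracking: group related requests into one conversation.
        "Helicone-Session-Id": session_id,
        "Helicone-Session-Name": "Order Support Chat",
        "Helicone-Session-Path": "/order-status",
    },
)
print(response.choices[0].message.content)
```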
Provider Selection and Fallback
Helicone’s AI Gateway supports automatic failover between providers:
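The sketch below assumes the gateway accepts a comma-separated list of models and fails over to the next entry when a provider is unavailable; that syntax is an assumption here, so check the Provider Routing documentation for the supported form:

```python
import litellm

# Assumption: the gateway treats the comma-separated list as an ordered set of
# fallbacks. Model names are examples; see Provider Routing for exact syntax.
response = litellm.completion(
    model="helicone/gpt-4o-mini,claude-3-5-haiku",
    messages=[{"role": "user", "content": "Summarize this release note."}],
)
print(response.choices[0].message.content)
```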
Advanced Features
Caching
Enable caching to reduce costs and latency for repeated requests:
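One way to turn this on is Helicone's cache headers, again passed through LiteLLM's extra_headers; the one-hour max-age below is an example value:

```python
import litellm

response = litellm.completion(
    model="helicone/gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "What are Helicone custom properties?"}],
    extra_headers={
        "Helicone-Cache-Enabled": "true",  # enable response caching
        "Cache-Control": "max-age=3600",   # example: keep cached entries for 1 hour
    },
)

# Repeating the identical request within the cache window should be served
# from Helicone's cache rather than the upstream provider.
print(response.choices[0].message.content)
```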
Rate Limiting
Apply rate limiting policies to control request rates:
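A sketch using Helicone's rate-limit policy header; the quota, window, and per-user segmentation are example values:

```python
import litellm

response = litellm.completion(
    model="helicone/gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        # Example policy: at most 100 requests per 3600-second window,
        # segmented per user via the Helicone-User-Id header.
        "Helicone-RateLimit-Policy": "100;w=3600;s=user",
        "Helicone-User-Id": "user-1234",
    },
)
print(response.choices[0].message.content)
```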
Related Documentation
AI Gateway Overview
Learn about Helicone’s AI Gateway features and capabilities
Provider Routing
Configure intelligent routing and automatic failover
Model Registry
Browse all available models and providers
Custom Properties
Add metadata to track and filter your requests
Sessions
Track multi-turn conversations and user sessions
Rate Limiting
Configure rate limits for your applications
Caching
Reduce costs and latency with intelligent caching
LiteLLM Documentation
Official LiteLLM documentation