Routers are the core concept of the Helicone AI Gateway. Unlike alternative solutions, the gateway lets you configure multiple independent routing policies within a single deployment, each with its own load balancing strategy, provider configuration, and middleware settings.
Getting Started
What are Routers?
Routers define independent routing policies within the AI Gateway, each with its own configuration for:
Load balancing strategies - How requests are distributed across providers
Provider selection - Which LLM providers are available for each router
Middleware settings - Caching, rate limiting, retries, and other features
URL endpoints - Each router gets its own URL path for requests
Think of routers as separate “virtual gateways” within a single deployment - each optimized for different use cases, environments, or teams.
Understanding Router URLs
Each router you define becomes part of the URL path when making requests to the gateway. This design allows a single deployed gateway to serve multiple routing configurations.
URL Format: http://your-gateway-host/router/{router-name}/{api-path}
Production Router
# Using the 'production' router
curl -X POST http://localhost:8080/router/production/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
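Development Router
The same request works through the 'development' router; only the router segment of the path changes:
# Using the 'development' router
curl -X POST http://localhost:8080/router/development/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'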
SDK Configuration
Configure your OpenAI SDK to use a specific router by setting the base URL:
import openai

# Production router
client = openai.OpenAI(
    base_url="http://localhost:8080/router/production",
    api_key="sk-placeholder",  # Required by SDK, but gateway handles real auth
)

# Development router
dev_client = openai.OpenAI(
    base_url="http://localhost:8080/router/development",
    api_key="sk-placeholder",  # Required by SDK, but gateway handles real auth
)
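Once a client is constructed, requests use the standard OpenAI SDK surface. A minimal sketch, reusing the model name from the curl example above:

# Send a chat completion through the production router
response = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)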
Basic Router Configuration
You can configure any number of named routers for different use cases.
routers:
  # Any router name you want
  my-router:
    load-balance:
      chat:
        strategy: latency
        providers:
          - openai
          - anthropic

  # Additional named routers
  production:
    load-balance:
      chat:
        strategy: weighted
        providers:
          - provider: anthropic
            weight: '0.7'
          - provider: openai
            weight: '0.3'
    cache:
      directive: "max-age=3600"

  development:
    load-balance:
      chat:
        strategy: latency
        providers:
          - ollama
          - openai
Common Use Cases
Multi-Environment Deployment
Use case: Different routers for different environments, all from a single gateway deployment.
routers:
  production:
    load-balance:
      chat:
        strategy: latency
        providers:
          - openai
          - anthropic
          - gemini
    cache:
      directive: "max-age=1800"

  staging:
    load-balance:
      chat:
        strategy: weighted
        providers:
          - provider: openai
            weight: '0.8'
          - provider: anthropic
            weight: '0.2'

  development:
    load-balance:
      chat:
        strategy: latency
        providers:
          - ollama
          - openai
Usage:
Production: http://localhost:8080/router/production
Staging: http://localhost:8080/router/staging
Development: http://localhost:8080/router/development
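Because each environment is only a different base URL, application code can pick its router from configuration rather than hardcoding it. A minimal sketch, where the HELICONE_ROUTER environment variable is a hypothetical name chosen for illustration:

import os
import openai

# Hypothetical env var selects the router; defaults to 'development'
router_name = os.environ.get("HELICONE_ROUTER", "development")

client = openai.OpenAI(
    base_url=f"http://localhost:8080/router/{router_name}",
    api_key="sk-placeholder",  # Required by SDK, but gateway handles real auth
)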
Team-Specific Configurations
Use case: Different teams with their own router configurations and resource limits.
routers:
  ml-team:
    load-balance:
      chat:
        strategy: latency
        providers:
          - openai
          - anthropic
    rate-limit:
      per-api-key:
        capacity: 1000
        refill-frequency: 1s

  frontend-team:
    load-balance:
      chat:
        strategy: weighted
        providers:
          - provider: anthropic
            weight: '1.0'
    cache:
      directive: "max-age=3600"
    rate-limit:
      per-api-key:
        capacity: 100
        refill-frequency: 1s
Usage:
ML Team: http://localhost:8080/router/ml-team
Frontend Team: http://localhost:8080/router/frontend-team
Reference
Router Naming Rules
Router names must follow this regex pattern: ^[A-Za-z0-9_-]{1,12}$
Length: 1-12 characters
Allowed characters: Letters (A-Z, a-z), numbers (0-9), hyphens (-), underscores (_)
No spaces or special characters
Valid names: production, dev, team-a, cost_opt, v2, A, my_router
Invalid names: very-long-router-name (too long), team@prod (@ not allowed), router with spaces (spaces not allowed)
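If you generate router names programmatically, you can validate them against the pattern above before deploying. A quick sketch:

import re

ROUTER_NAME = re.compile(r"^[A-Za-z0-9_-]{1,12}$")

for name in ["production", "team-a", "very-long-router-name", "team@prod"]:
    status = "valid" if ROUTER_NAME.fullmatch(name) else "invalid"
    print(f"{name}: {status}")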
URL Path Structure
The AI Gateway supports three different routing patterns:
Router-Based Routing (Recommended)
Pattern: /router/{name}/{api-path}
Examples:
/router/production/chat/completions
/router/development/chat/completions
/router/cost_opt/embeddings
Features:
Full router configuration support
Load balancing, caching, retries
Router-specific rate limiting
Model mapping
Direct Provider Routing
Pattern: /{provider}/{api-path}
Examples:
/openai/chat/completions
/anthropic/messages
/gemini/v1beta/generateContent
Features:
Direct proxy to specific provider
No load balancing or router features
Global rate limiting only
Unified API
Pattern: /ai/{api-path}
Examples:
/ai/chat/completions
Features:
Automatic provider detection from model name
Limited to OpenAI-compatible endpoints
No router-specific configuration
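Since the unified endpoint is OpenAI-compatible, pointing the SDK at the /ai path and prefixing the model with its provider is enough for automatic routing. A minimal sketch based on the patterns above:

import openai

# Unified API: the gateway detects the provider from the model name
ai_client = openai.OpenAI(
    base_url="http://localhost:8080/ai",
    api_key="sk-placeholder",  # Required by SDK, but gateway handles real auth
)

response = ai_client.chat.completions.create(
    model="openai/gpt-4",  # 'openai/' prefix selects the provider
    messages=[{"role": "user", "content": "Hello!"}],
)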