
Introduction

Langfuse is an open-source LLM observability and analytics platform that provides tracing, monitoring, and analytics for LLM applications.
This integration requires only two changes to your existing Langfuse code: update the base URL and the API key.

Integration Steps

1. Create a .env file in your project:
HELICONE_API_KEY=sk-helicone-...
2. Install the Langfuse packages:

pip install langfuse python-dotenv
3. Create a Langfuse OpenAI client using Helicone

Use Langfuse’s OpenAI client wrapper with Helicone’s base URL:
import os
from dotenv import load_dotenv
from langfuse.openai import openai

# Load environment variables
load_dotenv()

# Create an OpenAI client with Helicone's base URL
client = openai.OpenAI(
    api_key=os.getenv("HELICONE_API_KEY"),
    base_url="https://ai-gateway.helicone.ai/"
)
4. Make requests with Langfuse tracing

Your existing Langfuse code continues to work without any changes:
# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a fun fact about space."}
    ],
    name="fun-fact-request"  # Optional: Name of the generation in Langfuse
)

# Print the assistant's reply
print(response.choices[0].message.content)
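Because requests now pass through a gateway, you may want to guard against transient network or rate-limit errors in production. This is an optional hardening step, not part of the integration itself; `with_retries` below is an illustrative helper, not part of either SDK:

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff.

    In this guide, `call` would wrap a request such as
    lambda: client.chat.completions.create(...).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Usage (hypothetical):
# response = with_retries(lambda: client.chat.completions.create(...))
```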
5. Review your observability data

Once requests flow through the gateway, the two platforms together capture:
  • Request/response bodies
  • Latency metrics
  • Token usage and costs
  • Model performance analytics
  • Error tracking
  • LLM traces and spans in Langfuse
  • Session tracking
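Custom properties and session tracking are driven by request headers, which the OpenAI client accepts via `extra_headers`. The sketch below assumes Helicone's `Helicone-Session-Id` and `Helicone-Property-*` header conventions; `helicone_headers` itself is a hypothetical helper, not part of any SDK:

```python
def helicone_headers(session_id=None, **properties):
    """Build Helicone metadata headers for a single request.

    Keyword arguments become Helicone-Property-<name> headers,
    which show up as filterable metadata in the dashboard.
    """
    headers = {}
    if session_id:
        headers["Helicone-Session-Id"] = session_id
    for key, value in properties.items():
        headers[f"Helicone-Property-{key}"] = str(value)
    return headers

# Usage with the client from above (hypothetical values):
# response = client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hi"}],
#     extra_headers=helicone_headers(session_id="abc-123", environment="staging"),
# )
```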
While you’re here, why not give us a star on GitHub? It helps us a lot!

Complete Working Example

#!/usr/bin/env python3

import os
from dotenv import load_dotenv
from langfuse.openai import openai

# Load environment variables
load_dotenv()

# Create an OpenAI client with Helicone's base URL
client = openai.OpenAI(
    api_key=os.getenv("HELICONE_API_KEY"),
    base_url="https://ai-gateway.helicone.ai/"
)

# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a fun fact about space."}
    ],
    name="fun-fact-request"  # Optional: Name of the generation in Langfuse
)

# Print the assistant's reply
print(response.choices[0].message.content)

Streaming Responses

Langfuse supports streaming responses with full observability:
# Streaming example
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Write a short story about a robot learning to code."}
    ],
    stream=True,
    name="streaming-story"
)

print("🤖 Assistant (streaming):")
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print("\n")
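If you need the complete text after streaming (for example, to store it or pass it to a follow-up call), you can accumulate the deltas as they arrive. This small helper is illustrative, not part of either SDK; it only assumes the chunk shape shown above:

```python
def collect_stream(chunks):
    """Accumulate the text deltas of a streamed completion into one string.

    `chunks` is any iterable of chunk objects shaped like the OpenAI
    streaming response (chunk.choices[0].delta.content).
    """
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # Final chunks may carry no content
            parts.append(delta)
    return "".join(parts)

# Usage: full_text = collect_stream(stream)
```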

Nested Tracing Example

Langfuse's @observe() decorator records each decorated function call as a span, so the nested calls below appear as a single trace tree in Langfuse:
import os
from dotenv import load_dotenv
from langfuse import observe
from langfuse.openai import openai

load_dotenv()

client = openai.OpenAI(
    base_url="https://ai-gateway.helicone.ai/",
    api_key=os.getenv("HELICONE_API_KEY"),
)

@observe()  # This decorator enables tracing of the function
def analyze_text(text: str):
    # First LLM call: Summarize the text
    summary_response = summarize_text(text)
    summary = summary_response.choices[0].message.content

    # Second LLM call: Analyze the sentiment of the summary
    sentiment_response = analyze_sentiment(summary)
    sentiment = sentiment_response.choices[0].message.content

    return {
        "summary": summary,
        "sentiment": sentiment
    }

@observe()  # Nested function to be traced
def summarize_text(text: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You summarize texts in a concise manner."},
            {"role": "user", "content": f"Summarize the following text:\n{text}"}
        ],
        name="summarize-text"
    )

@observe()  # Nested function to be traced
def analyze_sentiment(summary: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You analyze the sentiment of texts."},
            {"role": "user", "content": f"Analyze the sentiment of the following summary:\n{summary}"}
        ],
        name="analyze-sentiment"
    )

# Example usage
text_to_analyze = "OpenAI's GPT-4 model has significantly advanced the field of AI, setting new standards for language generation."
analyze_text(text_to_analyze)
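In short-lived scripts like the one above, the process can exit before Langfuse has exported its events. Langfuse recommends flushing explicitly in such environments; the exact API depends on your SDK version, so the sketch below assumes the v3-style `get_client` helper and degrades gracefully otherwise:

```python
def flush_langfuse():
    """Flush pending Langfuse events before the process exits.

    Assumes the Langfuse v3 SDK's get_client(); older SDK versions
    expose flush() on a Langfuse() instance instead.
    """
    try:
        from langfuse import get_client
        get_client().flush()
        return "flushed"
    except Exception:
        return "unavailable"

# Call at the end of a script or in an atexit handler:
flush_langfuse()
```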

Related Resources

  • AI Gateway Overview: Learn about Helicone's AI Gateway features and capabilities
  • Provider Routing: Configure intelligent routing and automatic failover
  • Model Registry: Browse all available models and providers
  • Custom Properties: Add metadata to track and filter your requests
  • Sessions: Track multi-turn conversations and user sessions
  • Rate Limiting: Configure rate limits for your applications