Introduction

OpenAI’s Realtime API enables low-latency, multi-modal conversational experiences with support for text and audio as both input and output. By integrating with Helicone, you can monitor performance, analyze interactions, and gain valuable insights into your real-time conversations.

Integration Steps

1

Create an Account and Generate an API Key

Log into Helicone or create a new account. Once logged in, generate a Helicone API key.

Keep your API keys secure and do not expose them publicly.

2

Set Environment Variables

Set your API keys as environment variables:

# For OpenAI
export OPENAI_API_KEY=<your-openai-api-key>
export HELICONE_API_KEY=<your-helicone-api-key>

# For Azure
export AZURE_API_KEY=<your-azure-api-key>
export AZURE_RESOURCE=<your-azure-resource>
export AZURE_DEPLOYMENT=<your-azure-deployment>
export HELICONE_API_KEY=<your-helicone-api-key>
3

Install Required Dependencies

Install the required dependencies:

npm install ws dotenv
4

Configure WebSocket Connection

You can connect to the Realtime API through Helicone using either OpenAI or Azure as your provider. Here are examples for both:

OpenAI Provider

import WebSocket from "ws";
import { config } from "dotenv";

config();

const url = "wss://api.helicone.ai/v1/gateway/oai/realtime?model=gpt-4o-realtime-preview-2024-12-17";

const ws = new WebSocket(url, {
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    // Optional Helicone properties
    "Helicone-Session-Id": `session_${Date.now()}`,
    "Helicone-User-Id": "user_123",
  },
});

ws.on("open", function open() {
  console.log("Connected to server");
  // Initialize session with desired configuration
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      instructions: "You are a helpful AI assistant...",
      voice: "alloy",
      input_audio_format: "pcm16",
      output_audio_format: "pcm16",
    }
  }));
});

Azure Provider

import WebSocket from "ws";
import { config } from "dotenv";

config();

const url = `wss://api.helicone.ai/v1/gateway/oai/realtime?resource=${process.env.AZURE_RESOURCE}&deployment=${process.env.AZURE_DEPLOYMENT}`;

const ws = new WebSocket(url, {
  headers: {
    "Authorization": `Bearer ${process.env.AZURE_API_KEY}`,
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    // Optional Helicone properties
    "Helicone-Session-Id": `session_${Date.now()}`,
    "Helicone-User-Id": "user_123",
  },
});

ws.on("open", function open() {
  console.log("Connected to server");
  // Initialize session with desired configuration
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      instructions: "You are a helpful AI assistant...",
      voice: "alloy",
      input_audio_format: "pcm16",
      output_audio_format: "pcm16",
    }
  }));
});
5

Handle WebSocket Events

Add event handlers for the WebSocket connection:

ws.on("message", function incoming(message) {
  try {
    const response = JSON.parse(message.toString());
    console.log("Received:", response);

    // Handle specific event types
    switch (response.type) {
      case "input_audio_buffer.speech_started":
        console.log("Speech detected!");
        break;
      case "input_audio_buffer.speech_stopped":
        console.log("Speech ended. Processing...");
        break;
      case "conversation.item.input_audio_transcription.completed":
        console.log("Transcription:", response.transcript);
        break;
      case "error":
        console.error("Error:", response.error.message);
        break;
    }
  } catch (error) {
    console.error("Error parsing message:", error);
  }
});

ws.on("error", function error(err) {
  console.error("WebSocket error:", err);
});

// Handle cleanup
process.on("SIGINT", () => {
  console.log("\nClosing connection...");
  ws.close();
  process.exit(0);
});

Session Management

Helicone provides additional headers to help you manage and analyze your sessions:

const headers = {
  // Required headers
  Authorization: `Bearer ${apiKey}`,
  "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,

  // Optional headers for better analytics
  "Helicone-Session-Id": `session_${Date.now()}`,
  "Helicone-User-Id": "user_123",
  "Helicone-Session-Name": "VoiceAssistant",
  "Helicone-Session-Path": "/voice-assistant/conversation",
};

These headers help you:

  • Group related API calls
  • Track user sessions
  • Analyze conversation patterns
  • Monitor usage and performance

The Realtime API is stateful and maintains the conversation state throughout the WebSocket connection. If the connection is lost, you’ll need to recreate the session and reinitialize the conversation state.