Replaying LLM Sessions - Helicone OSS LLM Observability

A change to an AI agent can behave differently in a real multi-step interaction than it does in an isolated test. By replaying LLM sessions with Helicone, you can apply modifications to recorded agent sessions and observe effects that isolated testing may miss.

Use Cases

Optimize AI Agents: Enhance agent performance by testing modifications on real session data.
Debug Complex Interactions: Identify issues that only arise during full session interactions.
Accelerate Development: Streamline your AI agent development process by efficiently testing changes.

Record Sessions with Helicone Metadata

Instrument your AI agent’s LLM calls to include Helicone session metadata for tracking and logging.

Setting Up Session Metadata

const { Configuration, OpenAIApi } = require("openai");
const { randomUUID } = require("crypto");

// Generate unique session identifiers
const sessionId = randomUUID();
const sessionName = "AI Debate";
const sessionPath = "/debate/climate-change";

// Initialize OpenAI client with Helicone baseURL and auth header
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
  basePath: "https://oai.helicone.ai/v1",
  baseOptions: {
    headers: {
      "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    },
  },
});
const openai = new OpenAIApi(configuration);

Include the Helicone session headers in your requests:

Including Helicone Session Headers

const completionParams = {
  model: "gpt-4o-mini",
  messages: conversation,
};

const response = await openai.createChatCompletion(completionParams, {
  headers: {
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Name": sessionName,
    "Helicone-Session-Path": sessionPath,
    "Helicone-Prompt-Id": "assistant-response",
  },
});
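Because the same headers accompany every call in a session, it can help to build them in one place. The following is a minimal sketch, not part of Helicone or the OpenAI SDK: `buildSessionHeaders` is a hypothetical helper that only assembles the header names shown above. It assumes session paths are slash-delimited, like the `/debate/climate-change` example, and rejects anything else.

```javascript
// Hypothetical helper: collects the Helicone session headers in one place.
function buildSessionHeaders({ sessionId, sessionName, sessionPath, promptId }) {
  if (!sessionId) {
    throw new Error("sessionId is required");
  }
  // Assumption: session paths are slash-delimited, e.g. "/debate/climate-change".
  if (typeof sessionPath !== "string" || !sessionPath.startsWith("/")) {
    throw new Error(`sessionPath must start with "/": ${sessionPath}`);
  }
  const headers = {
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Name": sessionName,
    "Helicone-Session-Path": sessionPath,
  };
  // The prompt ID is optional; only send it when one is provided.
  if (promptId) {
    headers["Helicone-Prompt-Id"] = promptId;
  }
  return headers;
}

// Example: the same values used in the request above.
const exampleHeaders = buildSessionHeaders({
  sessionId: "3f0c2a1e-0000-4000-8000-000000000000", // placeholder ID
  sessionName: "AI Debate",
  sessionPath: "/debate/climate-change",
  promptId: "assistant-response",
});
console.log(exampleHeaders);
```

The returned object can be passed as the `headers` option of the request shown above, so every call in the session is tagged consistently.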

Initialize the conversation with the assistant:

Initializing Conversation

const topic = "The impact of climate change on global economies";

const conversation = [
  {
    role: "system",
    content:
      "You're an AI debate assistant. Engage with the user by presenting arguments for or against the topic. Keep responses concise and insightful.",
  },
  {
    role: "assistant",
    content: `Welcome to our debate! Today's topic is: "${topic}". I will argue in favor, and you will argue against. Please present your opening argument.`,
  },
];

Loop through the debate turns:

Looping Through Debate Turns

const MAX_TURNS = 3;
let turn = 1;

while (turn <= MAX_TURNS) {
  // Get the user's argument for this turn (simulated user input)
  const userArgument = await getUserArgument(turn);
  conversation.push({ role: "user", content: userArgument });

  // Assistant responds with a counter-argument
  const assistantResponse = await generateAssistantResponse(
    conversation,
    sessionId,
    sessionName,
    sessionPath
  );
  conversation.push(assistantResponse);

  turn++;
}

// Function to simulate user input
async function getUserArgument(turn) {
  // Scripted arguments, one per turn; replace with a real input source as needed
  const userArguments = [
    "I believe climate change is a natural cycle and not significantly influenced by human activities.",
    "Economic resources should focus on immediate human needs rather than combating climate change.",
    "Strict environmental regulations can hinder economic growth and affect employment rates.",
  ];
  // Index by the 1-based turn number. Calling shift() here would return the
  // first argument every time, because the array is rebuilt on each call.
  return userArguments[turn - 1];
}

// Function to generate assistant's response
async function generateAssistantResponse(
  conversation,
  sessionId,
  sessionName,
  sessionPath
) {
  const completionParams = {
    model: "gpt-4o-mini",
    messages: conversation,
  };

  const response = await openai.createChatCompletion(completionParams, {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Name": sessionName,
      "Helicone-Session-Path": sessionPath,
      "Helicone-Prompt-Id": "assistant-response",
    },
  });

  const assistantMessage = response.data.choices[0].message;
  return assistantMessage;
}

After running your instrumented agent, the recorded session appears in the Helicone dashboard.

Retrieve Session Data

Use Helicone’s Request API to fetch the requests recorded for a session, filtering on the Helicone-Session-Id property.

Querying Session Data

curl --request POST \
  --url https://api.helicone.ai/v1/request/query \
  --header "Content-Type: application/json" \
  --header "authorization: Bearer $HELICONE_API_KEY" \
  --data '{
    "limit": 100,
    "offset": 0,
    "sort_by": {
      "key": "request_created_at",
      "direction": "asc"
    },
    "filter": {
      "properties": {
        "Helicone-Session-Id": {
          "equals": "<session-id>"
        }
      }
    }
  }'
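The same query can be issued from Node.js so that its results feed directly into the replay step. The sketch below mirrors the curl payload and pages through results until a short page is returned. `buildSessionQuery` and `fetchSessionRequests` are hypothetical names, and the code assumes the matching requests come back under a top-level `data` array; check the actual response shape before relying on it. It also assumes a runtime with a global `fetch` (Node.js 18+), or an equivalent passed in as `fetchImpl`.

```javascript
const HELICONE_QUERY_URL = "https://api.helicone.ai/v1/request/query";

// Builds the same JSON body as the curl example, for one page of results.
function buildSessionQuery(sessionId, limit, offset) {
  return {
    limit,
    offset,
    sort_by: { key: "request_created_at", direction: "asc" },
    filter: {
      properties: {
        "Helicone-Session-Id": { equals: sessionId },
      },
    },
  };
}

// Fetches every request in a session, one page at a time.
// fetchImpl is injectable so the function can be exercised without network access.
async function fetchSessionRequests(
  sessionId,
  apiKey,
  { pageSize = 100, fetchImpl = fetch } = {}
) {
  const requests = [];
  for (let offset = 0; ; offset += pageSize) {
    const response = await fetchImpl(HELICONE_QUERY_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(buildSessionQuery(sessionId, pageSize, offset)),
    });
    if (!response.ok) {
      throw new Error(`Helicone query failed with status ${response.status}`);
    }
    // Assumption: matching requests are returned under a top-level "data" array.
    const page = (await response.json()).data ?? [];
    requests.push(...page);
    // A short page means there is nothing left to fetch.
    if (page.length < pageSize) {
      return requests;
    }
  }
}
```

Because the query sorts by `request_created_at` in ascending order, the returned array preserves the order in which the original calls were made, which is the order a replay should follow.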

Modify and Replay the Session

Retrieve the original requests, apply modifications, and resend them to observe the impact.

Modifying Requests and Replaying

// node-fetch v2 supports require(); on Node.js 18+ the built-in global fetch also works
const fetch = require("node-fetch");
const { randomUUID } = require("crypto");

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const REPLAY_SESSION_ID = randomUUID();

async function replaySession(requests) {
  for (const request of requests) {
    const modifiedRequest = modifyRequestBody(request);

    await sendRequest(modifiedRequest);
  }
}

function modifyRequestBody(request) {
  // Implement modifications to the request body as needed
  // For example, enhancing the system prompt for better responses
  if (request.prompt_id === "assistant-response") {
    const systemMessage = request.body.messages.find(
      (msg) => msg.role === "system"
    );
    if (systemMessage) {
      systemMessage.content +=
        " Take the persona of a field expert and provide more persuasive arguments.";
    }
  }
  return request;
}

async function sendRequest(modifiedRequest) {
  const { body, request_path, path, prompt_id } = modifiedRequest;

  const response = await fetch(request_path, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_API_KEY}`,
      "Helicone-Auth": `Bearer ${HELICONE_API_KEY}`,
      "Helicone-Session-Id": REPLAY_SESSION_ID,
      "Helicone-Session-Name": "Replayed Session",
      "Helicone-Session-Path": path,
      "Helicone-Prompt-Id": prompt_id,
    },
    body: JSON.stringify(body),
  });

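  if (!response.ok) {
    throw new Error(`Replay request failed with status ${response.status}`);
  }
  // Return the parsed response so callers can inspect or store it
  return response.json();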
}

Note: In the modifyRequestBody function, we’re enhancing the assistant’s system prompt to make the responses more persuasive by taking the persona of a field expert.
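Note that `modifyRequestBody` edits the recorded request in place, so the original is no longer available for comparison afterwards. One possible alternative, sketched below, works on a deep copy instead. `withExtraSystemInstruction` is a hypothetical helper; it assumes the same record shape as `modifyRequestBody` (`prompt_id` plus `body.messages`) and a runtime that provides `structuredClone` (Node.js 17+).

```javascript
// Returns a modified copy of a recorded request, leaving the original untouched.
// Assumes the record shape used by modifyRequestBody: { prompt_id, body: { messages } }.
function withExtraSystemInstruction(request, promptId, extraInstruction) {
  // Deep-copy first so edits to nested messages do not leak into the original.
  const copy = structuredClone(request);
  if (copy.prompt_id !== promptId) {
    return copy;
  }
  const systemMessage = copy.body.messages.find((msg) => msg.role === "system");
  if (systemMessage) {
    systemMessage.content += ` ${extraInstruction}`;
  }
  return copy;
}

// Example with a hand-written record (not real Helicone output).
const original = {
  prompt_id: "assistant-response",
  body: {
    messages: [{ role: "system", content: "You're an AI debate assistant." }],
  },
};
const modified = withExtraSystemInstruction(
  original,
  "assistant-response",
  "Take the persona of a field expert."
);

console.log(original.body.messages[0].content);
// prints "You're an AI debate assistant."
console.log(modified.body.messages[0].content);
// prints "You're an AI debate assistant. Take the persona of a field expert."
```

Keeping the original record intact makes it possible to replay the same session several times with different instructions and compare each variant against the same baseline.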

Analyze the Replayed Session

After replaying, use Helicone’s dashboard to compare the original and modified sessions to evaluate improvements.


Additional Tips

Version Control Prompts: Keep track of different prompt versions to see which yields the best results.
Use Evaluations: Utilize Helicone’s Evaluation Features to score and compare responses.
Prompt Versioning: Use Helicone’s Prompt Versioning to manage and compare different prompt versions effectively.

Conclusion

By replaying LLM sessions with Helicone, you can effectively optimize your AI agents, leading to improved performance and better user experiences.

Need more help?

Additional questions or feedback? Reach out to [email protected] or schedule a call with us.
