Manual Logger with Streaming Support
Helicone's Manual Logger lets you track LLM requests and responses, including streaming responses. This guide shows you how to use the `@helicone/helpers` package to log streaming responses from various LLM providers.
Installation
First, install the `@helicone/helpers` package:

```bash
npm install @helicone/helpers
# or
yarn add @helicone/helpers
# or
pnpm add @helicone/helpers
```
Basic Setup
Initialize the `HeliconeManualLogger` with your API key:

```typescript
import { HeliconeManualLogger } from "@helicone/helpers";

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
  headers: {
    // Optional headers to include with all requests
    "Helicone-Property-Environment": "production",
  },
});
```
Streaming Methods
The `HeliconeManualLogger` provides several methods for working with streams:

1. `logBuilder` (new)

The recommended method for handling streaming responses, with improved error handling:

```typescript
logBuilder(
  request: HeliconeLogRequest,
  additionalHeaders?: Record<string, string>
): HeliconeLogBuilder
```
2. `logStream`

A flexible method that gives you full control over stream handling:

```typescript
async logStream<T>(
  request: HeliconeLogRequest,
  operation: (resultRecorder: HeliconeStreamResultRecorder) => Promise<T>,
  additionalHeaders?: Record<string, string>
): Promise<T>
```
3. `logSingleStream`

A simplified method for logging a single `ReadableStream`:

```typescript
async logSingleStream(
  request: HeliconeLogRequest,
  stream: ReadableStream,
  additionalHeaders?: Record<string, string>
): Promise<void>
```
4. `logSingleRequest`

For logging a single request with an already-complete response body (see the sketch after this list):

```typescript
async logSingleRequest(
  request: HeliconeLogRequest,
  body: string,
  additionalHeaders?: Record<string, string>
): Promise<void>
```
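Before the framework-specific examples, here is a minimal sketch of the simplest case: logging a non-streaming completion with `logSingleRequest`. The `answer` helper name and the OpenAI client setup are just for illustration; adapt them to your own client and models.

```typescript
import OpenAI from "openai";
import { HeliconeManualLogger } from "@helicone/helpers";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

async function answer(prompt: string) {
  const requestBody = {
    model: "gpt-4-turbo",
    messages: [{ role: "user" as const, content: prompt }],
  };

  // Make the request yourself, then hand Helicone the request payload and
  // the stringified response body.
  const completion = await openai.chat.completions.create(requestBody);
  await helicone.logSingleRequest(requestBody, JSON.stringify(completion));

  return completion;
}
```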
Next.js App Router with `logBuilder` (Recommended)
The new `logBuilder` method provides better error handling and simplified stream management:
```typescript
// app/api/chat/route.ts
import { HeliconeManualLogger } from "@helicone/helpers";
import { after } from "next/server";
import Together from "together-ai";

const together = new Together();
const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

export async function POST(request: Request) {
  const { question } = await request.json();
  const body = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
    stream: true,
  };

  const heliconeLogBuilder = helicone.logBuilder(body, {
    "Helicone-Property-Environment": "dev",
  });

  try {
    const response = await together.chat.completions.create(body);
    return new Response(heliconeLogBuilder.toReadableStream(response));
  } catch (error) {
    heliconeLogBuilder.setError(error);
    throw error;
  } finally {
    after(async () => {
      // This will be executed after the response is sent to the client
      await heliconeLogBuilder.sendLog();
    });
  }
}
```
The `logBuilder` approach offers several advantages:

- Better error handling with the `setError` method
- Simplified stream handling with `toReadableStream`
- More flexible async/await patterns with `sendLog`
- Proper error status code tracking
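The same lifecycle applies outside of Next.js; the `after()` call above is just a convenient place to await `sendLog()`. Below is a minimal, framework-free sketch of that flow. The `chatWithLogging` name is hypothetical, the `together` and `helicone` clients are the ones set up in the example above, and iterating a web `ReadableStream` with `for await` assumes Node 18+.

```typescript
async function chatWithLogging(question: string) {
  const body = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
    stream: true,
  };
  const builder = helicone.logBuilder(body);

  try {
    const response = await together.chat.completions.create(body);
    // Wrap the provider stream so the builder can record it as it is consumed.
    const stream = builder.toReadableStream(response);

    // Consume the stream however your application needs to
    // (Node 18+: web ReadableStream is async-iterable).
    for await (const chunk of stream) {
      process.stdout.write(chunk);
    }
  } catch (error) {
    // Record the failure so the log reflects the error status.
    builder.setError(error);
    throw error;
  } finally {
    // Send the collected log once the stream has finished (or failed).
    await builder.sendLog();
  }
}
```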
Examples with Different LLM Providers
OpenAI
```typescript
import OpenAI from "openai";
import { HeliconeManualLogger } from "@helicone/helpers";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

async function generateStreamingResponse(prompt: string, userId: string) {
  const requestBody = {
    model: "gpt-4-turbo",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  };

  const response = await openai.chat.completions.create(requestBody);

  // For OpenAI's Node.js SDK, we can use the logSingleStream method
  const stream = response.toReadableStream();
  const [streamForUser, streamForLogging] = stream.tee();

  helicone.logSingleStream(requestBody, streamForLogging, {
    "Helicone-User-Id": userId,
  });

  return streamForUser;
}
```
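The function returns the untouched half of the teed stream, so the caller decides how to deliver it. For example, in a hypothetical route handler that accepts a web `ReadableStream` as a response body (a sketch, not part of the Helicone API):

```typescript
export async function POST(request: Request) {
  const { prompt, userId } = await request.json();
  // Helicone logging happens inside the helper; we just forward the stream.
  const stream = await generateStreamingResponse(prompt, userId);
  return new Response(stream);
}
```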
Together AI
```typescript
import Together from "together-ai";
import { HeliconeManualLogger } from "@helicone/helpers";

const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

export async function generateWithTogetherAI(prompt: string, userId: string) {
  const body = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  };

  const response = await together.chat.completions.create(body);

  // Create two copies of the stream
  const [stream1, stream2] = response.tee();

  // Log the stream with Helicone
  helicone.logStream(
    body,
    async (resultRecorder) => {
      resultRecorder.attachStream(stream2.toReadableStream());
      return stream1;
    },
    { "Helicone-User-Id": userId }
  );

  return new Response(stream1.toReadableStream());
}
```
Anthropic
```typescript
import Anthropic from "@anthropic-ai/sdk";
import { HeliconeManualLogger } from "@helicone/helpers";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

async function generateWithAnthropic(prompt: string, userId: string) {
  const requestBody = {
    model: "claude-3-opus-20240229",
    max_tokens: 1024, // required by Anthropic's Messages API
    messages: [{ role: "user", content: prompt }],
    stream: true,
  };

  const response = await anthropic.messages.create(requestBody);

  const stream = response.toReadableStream();
  const [userStream, loggingStream] = stream.tee();

  helicone.logSingleStream(requestBody, loggingStream, {
    "Helicone-User-Id": userId,
  });

  return userStream;
}
```
Next.js API Route Example
Here's how to use the manual logger in a Next.js (Pages Router) API route. This example uses `logRequest`, the non-streaming counterpart to the streaming methods above:
```typescript
// pages/api/generate.ts
import { NextApiRequest, NextApiResponse } from "next";
import OpenAI from "openai";
import { HeliconeManualLogger } from "@helicone/helpers";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

export default async function handler(
  req: NextApiRequest,
  res: NextApiResponse
) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }

  const { prompt, userId } = req.body;

  if (!prompt) {
    return res.status(400).json({ error: "Prompt is required" });
  }

  try {
    const requestBody = {
      model: "gpt-4-turbo",
      messages: [{ role: "user", content: prompt }],
    };

    // For non-streaming responses
    const response = await helicone.logRequest(
      requestBody,
      async (resultRecorder) => {
        const result = await openai.chat.completions.create(requestBody);
        resultRecorder.appendResults(result);
        return result;
      },
      { "Helicone-User-Id": userId || "anonymous" }
    );

    return res.status(200).json(response);
  } catch (error) {
    console.error("Error generating response:", error);
    return res.status(500).json({ error: "Failed to generate response" });
  }
}
```
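Streaming works in the Pages Router too: tee the provider stream as in the earlier examples and write one half to the Node response yourself. A rough sketch (assuming Node 18+, where a web `ReadableStream` can be iterated with `for await`; the content type and chunk handling here are illustrative, not prescribed by Helicone):

```typescript
// Inside the same handler, for a streaming request (sketch).
const streamingBody = {
  model: "gpt-4-turbo",
  messages: [{ role: "user", content: prompt }],
  stream: true,
};

const completion = await openai.chat.completions.create(streamingBody);
const [userStream, loggingStream] = completion.toReadableStream().tee();

// Fire-and-forget: Helicone consumes its copy of the stream in the background.
helicone.logSingleStream(streamingBody, loggingStream, {
  "Helicone-User-Id": userId || "anonymous",
});

res.writeHead(200, { "Content-Type": "application/x-ndjson" });
for await (const chunk of userStream) {
  res.write(chunk); // chunks are the SDK's serialized JSON lines
}
res.end();
```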
Next.js App Router with Vercel's `after` Function

For the Next.js App Router, you can use Vercel's `after` function to log requests without blocking the response:
```typescript
// app/api/generate/route.ts
import { HeliconeManualLogger } from "@helicone/helpers";
import { after } from "next/server";
import Together from "together-ai";

const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });

const helicone = new HeliconeManualLogger({
  apiKey: process.env.HELICONE_API_KEY!,
});

export async function POST(request: Request) {
  const { question } = await request.json();

  // Example with non-streaming response
  const nonStreamingBody = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
    stream: false,
  };

  const completion = await together.chat.completions.create(nonStreamingBody);

  // Log non-streaming response after sending the response to the client
  after(
    helicone.logSingleRequest(nonStreamingBody, JSON.stringify(completion))
  );

  // Example with streaming response
  const streamingBody = {
    model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages: [{ role: "user", content: question }],
    stream: true,
  };

  const response = await together.chat.completions.create(streamingBody);
  const [stream1, stream2] = response.tee();

  // Log streaming response after sending the response to the client
  after(helicone.logSingleStream(streamingBody, stream2.toReadableStream()));

  return new Response(stream1.toReadableStream());
}
```
Logging Custom Events
You can also use the manual logger to log custom events:
```typescript
// Log a tool usage
await helicone.logSingleRequest(
  {
    _type: "tool",
    toolName: "calculator",
    input: { expression: "2 + 2" },
  },
  JSON.stringify({ result: 4 }),
  { additionalHeaders: { "Helicone-User-Id": "user-123" } }
);

// Log a vector database operation
await helicone.logSingleRequest(
  {
    _type: "vector_db",
    operation: "search",
    text: "How to make pasta",
    topK: 3,
    databaseName: "recipes",
  },
  JSON.stringify([
    { id: "1", content: "Pasta recipe 1", score: 0.95 },
    { id: "2", content: "Pasta recipe 2", score: 0.87 },
    { id: "3", content: "Pasta recipe 3", score: 0.82 },
  ]),
  { additionalHeaders: { "Helicone-User-Id": "user-123" } }
);
```
Advanced Usage: Tracking Time to First Token
The `logStream`, `logSingleStream`, and `logBuilder` methods automatically track the time to first token, which is a valuable metric for understanding LLM response latency:
```typescript
// Using logBuilder (recommended)
const heliconeLogBuilder = helicone.logBuilder(requestBody, {
  "Helicone-User-Id": userId,
});
// The builder will automatically track when the first chunk arrives
const stream = heliconeLogBuilder.toReadableStream(response);
// Later, call sendLog() to complete the logging
await heliconeLogBuilder.sendLog();

// Using logStream
helicone.logStream(
  requestBody,
  async (resultRecorder) => {
    // The resultRecorder will automatically track when the first chunk arrives
    resultRecorder.attachStream(stream);
    return stream;
  },
  { "Helicone-User-Id": userId }
);

// Using logSingleStream
helicone.logSingleStream(requestBody, stream, { "Helicone-User-Id": userId });
```
This timing information will be available in your Helicone dashboard, allowing you to monitor and optimize your LLM response times.
Conclusion
The HeliconeManualLogger provides powerful capabilities for tracking streaming LLM responses across different providers. By using the appropriate method for your use case, you can gain valuable insights into your LLM usage while maintaining the benefits of streaming responses.