Use prompts directly via SDK without the AI Gateway
When building LLM applications, you sometimes need direct control over prompt compilation without routing through the AI Gateway. The SDK provides an alternative integration method that allows you to pull and compile prompts directly in your application.
In TypeScript, the SDK provides types for both integration methods when using the OpenAI SDK:
| Type | Description | Use Case |
|------|-------------|----------|
| `HeliconeChatCreateParams` | Standard chat completions with prompts | Non-streaming requests |
| `HeliconeChatCreateParamsStreaming` | Streaming chat completions with prompts | Streaming requests |
Both types extend the OpenAI SDK’s chat completion parameters and add:
- `prompt_id` - Your saved prompt identifier
- `environment` - Optional environment to target (e.g., “production”, “staging”)
- `version_id` - Optional specific version (defaults to production version)
- `inputs` - Variable values for template substitution
**Important:** These types make `messages` optional because Helicone prompts are expected to contain the required message structure. If your prompt template is empty or doesn’t include messages, you’ll need to provide them at runtime.

For direct SDK integration, instantiate the `HeliconePromptManager`:
```typescript
import { HeliconePromptManager } from '@helicone/helpers';

const promptManager = new HeliconePromptManager({
  apiKey: "your-helicone-api-key"
});
```
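As a minimal sketch of the note above, you can type a request with `HeliconeChatCreateParams` and supply `messages` at runtime when the saved template doesn’t contain them. The prompt ID `abc123` and the input values are placeholders, and the type’s export from `@helicone/helpers` is an assumption:

```typescript
import OpenAI from 'openai';
import { HeliconePromptManager, HeliconeChatCreateParams } from '@helicone/helpers';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const promptManager = new HeliconePromptManager({
  apiKey: process.env.HELICONE_API_KEY!,
});

async function runWithRuntimeMessages() {
  // Placeholder prompt ID and inputs; messages are provided here because
  // the saved template is assumed to be empty in this example.
  const params: HeliconeChatCreateParams = {
    prompt_id: "abc123",
    model: "gpt-4o-mini",
    inputs: { customer_name: "Alice Johnson" },
    messages: [
      { role: "user", content: "How do I rotate my API key?" },
    ],
  };

  const { body, errors } = await promptManager.getPromptBody(params);
  if (errors.length > 0) console.warn("Validation errors:", errors);

  const response = await openai.chat.completions.create(body);
  console.log(response.choices[0].message.content);
}
```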
In Python, the SDK provides types that extend OpenAI’s official types:
| Type | Description | Use Case |
|------|-------------|----------|
| `HeliconeChatParams` | Chat completion parameters with prompt support (includes environment) | All prompt requests |
| `PromptCompilationResult` | Result with body and validation errors | Error handling |
The `HeliconeChatParams` type includes all OpenAI parameters plus:

- `prompt_id` - Your saved prompt identifier
- `environment` - Optional environment to target (e.g., “production”, “staging”)
- `version_id` - Optional specific version (defaults to production version)
- `inputs` - Variable values for template substitution
**Important:** As in TypeScript, `messages` becomes optional when using prompts, since your saved prompt template should contain the necessary message structure.

The main class for direct SDK integration:
```python
from helicone_helpers import HeliconePromptManager

prompt_manager = HeliconePromptManager(
    api_key="your-helicone-api-key"
)
```
A complete TypeScript example:

```typescript
import OpenAI from 'openai';
import { HeliconePromptManager } from '@helicone/helpers';

const openai = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai",
  apiKey: process.env.HELICONE_API_KEY,
});

const promptManager = new HeliconePromptManager({
  apiKey: "your-helicone-api-key"
});

async function generateWithPrompt() {
  // Get compiled prompt with variable substitution
  const { body, errors } = await promptManager.getPromptBody({
    prompt_id: "abc123",
    model: "gpt-4o-mini",
    inputs: {
      customer_name: "Alice Johnson",
      product: "AI Gateway"
    }
  });

  // Check for validation errors
  if (errors.length > 0) {
    console.warn("Validation errors:", errors);
  }

  // Use compiled prompt with OpenAI SDK
  const response = await openai.chat.completions.create(body);
  console.log(response.choices[0].message.content);
}
```
And the equivalent in Python:

```python
import openai
import os
from helicone_helpers import HeliconePromptManager

client = openai.OpenAI(
    base_url="https://ai-gateway.helicone.ai",
    api_key=os.environ.get("HELICONE_API_KEY")
)

prompt_manager = HeliconePromptManager(
    api_key="your-helicone-api-key"
)

def generate_with_prompt():
    # Get compiled prompt with variable substitution
    result = prompt_manager.get_prompt_body({
        "prompt_id": "abc123",
        "model": "gpt-4o-mini",
        "inputs": {
            "customer_name": "Alice Johnson",
            "product": "AI Gateway"
        }
    })

    # Check for validation errors
    if result["errors"]:
        print("Validation errors:", result["errors"])

    # Use compiled prompt with OpenAI SDK
    response = client.chat.completions.create(**result["body"])
    print(response.choices[0].message.content)
```
Both approaches are fully compatible with all OpenAI SDK features, including function calling, response formats, and advanced parameters. While the `HeliconePromptManager` does not provide input traces, it does surface validation errors when compiling prompts.
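For example, since the compiled body is a standard chat-completion request, you can layer OpenAI features on top of it by spreading the body and adding parameters. A sketch, assuming a hypothetical prompt ID `abc123` and a hypothetical `get_weather` tool:

```typescript
import OpenAI from 'openai';
import { HeliconePromptManager } from '@helicone/helpers';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const promptManager = new HeliconePromptManager({
  apiKey: process.env.HELICONE_API_KEY!,
});

async function generateWithTools() {
  const { body, errors } = await promptManager.getPromptBody({
    prompt_id: "abc123", // placeholder prompt ID
    model: "gpt-4o-mini",
    inputs: { customer_name: "Alice Johnson" },
  });
  if (errors.length > 0) console.warn("Validation errors:", errors);

  // Spread the compiled prompt body and add standard OpenAI parameters
  const response = await openai.chat.completions.create({
    ...body,
    tools: [{
      type: "function",
      function: {
        name: "get_weather", // hypothetical tool for illustration
        description: "Look up current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    }],
  });
  console.log(response.choices[0].message);
}
```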
Prompt partials allow you to reference messages from other prompts, enabling reuse across your prompt library. They use the format `{{hcp:prompt_id:index:environment}}`:

- `prompt_id` - The 6-character alphanumeric identifier of the prompt to reference
- `index` - The message index (0-based) to extract from that prompt
- `environment` - Optional environment identifier (defaults to production)
Examples:
```
{{hcp:abc123:0}}              // Message 0 from prompt abc123 (production)
{{hcp:abc123:1:staging}}      // Message 1 from prompt abc123 (staging)
{{hcp:xyz789:2:development}}  // Message 2 from prompt xyz789 (development)
```
If your prompts don’t contain any prompt partials (no `{{hcp:...}}` tags), you don’t need to worry about this section. The SDK will work normally without any special handling.
When using the SDK directly, each prompt partial requires a separate API call to fetch the referenced prompt. For prompts with many partials, consider using the AI Gateway instead for better performance and automatic caching.
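If you do rely on partials heavily but still want the direct SDK path, one mitigation is to memoize compiled prompt bodies in your application. A rough sketch — the cache key and TTL policy are application-level assumptions, not SDK features:

```typescript
import { HeliconePromptManager } from '@helicone/helpers';

const promptManager = new HeliconePromptManager({
  apiKey: process.env.HELICONE_API_KEY!,
});

// Hypothetical in-memory cache; key format and TTL are app-level choices,
// not something the SDK provides.
const cache = new Map<string, { body: any; expires: number }>();
const TTL_MS = 60_000;

async function getCachedPromptBody(params: {
  prompt_id: string;
  model: string;
  inputs: Record<string, any>;
}) {
  const key = JSON.stringify(params);
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.body;

  // Each partial inside this prompt triggers an extra fetch,
  // so caching the fully compiled body amortizes that cost.
  const { body, errors } = await promptManager.getPromptBody(params);
  if (errors.length > 0) console.warn("Validation errors:", errors);

  cache.set(key, { body, expires: Date.now() + TTL_MS });
  return body;
}
```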