Why Prompt Management?

Traditional prompt development means hardcoding prompts in application code, juggling messy string substitution, and rebuilding and redeploying for every iteration. This creates friction that slows down experimentation and your team’s ability to ship. Helicone Prompts offers a better approach:
  • Powerful composability: Variables of all types in system prompts, messages, and tool/response schemas
  • Clear version control: Track, compare, and rollback prompt versions without code changes
  • Environment management: Full control over deployment environments like production, staging, and development
  • Easy deployment: Reference prompts by ID and let our AI Gateway handle the rest
  • Real-time testing: Test prompts instantly with different models and parameters

Quickstart

  1. Create a Prompt: Build a prompt in the Playground. Save any prompt with clear commit histories and tags.
  2. Test and Iterate: Experiment with different variables, inputs, and models until you reach the desired output. Variables can be used anywhere, even in tool schemas.
  3. Run Prompt with AI Gateway: Use your prompt instantly by referencing its ID in your AI Gateway call. No code changes, no rebuilds.
import { OpenAI } from "openai";

const openai = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai/v1",
  apiKey: "your-openai-api-key",
  defaultHeaders: {
    "Helicone-Auth": "Bearer your-helicone-api-key",
  },
});

const response = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123", // Reference your saved prompt
  environment: "production", // Optional: specify environment
  inputs: {
    customer_name: "John Doe",
    product: "AI Gateway"
  }
});
Your prompt is automatically compiled with the provided inputs and sent to your chosen model. Update prompts in the dashboard and changes take effect immediately!

Prompt Assembly Process

When you make an LLM call with a prompt ID, the AI Gateway compiles your saved prompt together with the runtime parameters you provide.

Version Selection

The AI Gateway automatically determines which prompt version to use based on the parameters you provide:
  1. Environment specified: If you provide an environment parameter, the gateway uses the prompt version deployed to that environment (e.g., “production”, “staging”, “development”)
  2. Version ID specified: If you provide a version_id parameter but no environment, the gateway uses that specific version
  3. Default behavior: If neither environment nor version ID is specified, the gateway automatically uses the production version
Environment takes precedence over version ID. If both are specified, the environment parameter will be used and version ID will be ignored.
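
In practice, each selection mode looks like this (a minimal sketch reusing the gateway client from the Quickstart; the prompt and version IDs are placeholders):

// 1. Environment specified: uses the version deployed to "staging"
const staged = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123",
  environment: "staging",
  inputs: { customer_name: "John Doe" }
});

// 2. Version ID specified (no environment): pins that exact version
const pinned = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123",
  version_id: "your-version-id", // hypothetical version ID
  inputs: { customer_name: "John Doe" }
});

// 3. Neither specified: falls back to the production version
const production = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123",
  inputs: { customer_name: "John Doe" }
});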

Parameter Priority

Saved prompts store all the configuration you set in the playground - temperature, max tokens, response format, system messages, and more. At runtime, these saved parameters are used as defaults, but any parameters you specify in your API call will override them. For example, a saved prompt body might look like this:
{
  "model": "gpt-4o-mini",
  "temperature": 0.6,
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system", 
      "content": "You are a helpful customer support agent for {{hc:company:string}}."
    },
    {
      "role": "user",
      "content": "Hello, I need help with my account."
    }
  ]
}

Message Handling

Messages work differently than other parameters. Instead of overriding, runtime messages are appended to the saved prompt messages. This allows you to:
  • Define consistent system prompts and example conversations in your saved prompt
  • Add dynamic user messages at runtime
  • Build multi-turn conversations that maintain context
Runtime messages are always appended to the end of your saved prompt messages. Make sure your saved prompt structure accounts for this behavior.
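
For example (a minimal sketch; assume the saved prompt contains the system message from the example above):

// Saved prompt "abc123" already holds:
//   system: "You are a helpful customer support agent for {{hc:company:string}}."
const response = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123",
  inputs: { company: "Acme" },
  messages: [
    // Appended after the saved prompt's messages, never merged into them
    { role: "user", content: "How do I reset my password?" }
  ]
});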

Override Examples

// Saved prompt has temperature: 0.8
const response = await openai.chat.completions.create({
  prompt_id: "abc123",
  temperature: 0.2, // Uses 0.2, not 0.8
  inputs: { topic: "AI safety" }
});
This compilation approach gives you the flexibility to have consistent prompt templates while still allowing runtime customization for specific use cases.

Managing Environments

You can easily manage different deployment environments for your prompts directly in the Helicone dashboard. Create and deploy prompts to production, staging, development, or any custom environment you need.

Variables

Variables make your prompts dynamic and reusable. Define them once in your prompt template, then provide different values at runtime without changing your code.

Variable Syntax

Variables use the format {{hc:name:type}} where:
  • name is your variable identifier
  • type defines the expected data type
{{hc:customer_name:string}}
{{hc:age:number}}
{{hc:is_premium:boolean}}
{{hc:context:any}}
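
At runtime, you provide a value for each variable through the inputs object, keyed by variable name (an illustrative fragment mirroring the examples above):

inputs: {
  customer_name: "John Doe",  // {{hc:customer_name:string}}
  age: 30,                    // {{hc:age:number}}
  is_premium: true,           // {{hc:is_premium:boolean}}
  context: { plan: "pro" }    // {{hc:context:any}} takes any JSON value
}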

Supported Types

Type | Description | Example Values | Validation
string | Text values | "John Doe", "Hello world" | None
number | Numeric values | 25, 3.14, -10 | AI Gateway type-checking
boolean | True/false values | true, false, "yes", "no" | AI Gateway type-checking
your_type_name | Any data type | Objects, arrays, strings | None
Only number and boolean types are validated by the Helicone AI Gateway; string inputs are accepted for these as long as they can be converted to valid values.
Boolean variables accept multiple formats:
  • true / false (boolean)
  • "yes" / "no" (string)
  • "true" / "false" (string)

Schema Variables

Variables can be used within JSON schemas for tools and response formatting. This enables dynamic schema generation based on runtime inputs.
{
  "name": "moviebot_response",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "markdown_response": {
        "type": "string"
      },
      "tools_used": {
        "type": "array",
        "items": {
          "type": "string",
          "enum": "{{hc:tools:array}}"
        }
      },
      "user_tier": {
        "type": "string",
        "enum": "{{hc:tiers:array}}"
      }
    },
    "required": [
      "markdown_response",
      "tools_used",
      "user_tier"
    ],
    "additionalProperties": false
  }
}
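
At runtime, you would pass the arrays that fill those enum slots (a sketch; the values are placeholders that match the example in the next section):

inputs: {
  tools: ["search", "calculator", "weather"], // replaces {{hc:tools:array}}
  tiers: ["free", "premium"]                  // replaces {{hc:tiers:array}}
}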

Replacement Behavior

Value Replacement: When a variable tag is the only content in a string, it gets replaced with the actual data type:
"enum": "{{hc:tools:array}}""enum": ["search", "calculator", "weather"]
String Substitution: When variables are part of a larger string, normal regex replacement occurs:
"description": "Available for {{hc:name:string}} users""description": "Available for premium users"
Keys and Values: Variables work in both JSON keys and values throughout tool schemas and response schemas.

SDK Helpers

We provide SDKs for both TypeScript and Python that offer two ways to use Helicone prompts:
  1. AI Gateway Integration - Use prompts through the Helicone AI Gateway
  2. Direct SDK Integration - Pull prompts directly via SDK
Prompts through the AI Gateway come with several benefits:
  • Cleaner code: Automatically performs compilation and substitution in the router.
  • Input traces: Traces inputs on each request for better observability in Helicone requests.
  • Faster TTFT: The AI Gateway adds significantly less latency compared to the SDK.
npm install @helicone/helpers

Types and Classes

The SDK provides types for both integration methods when using the OpenAI SDK:
Type | Description | Use Case
HeliconeChatCreateParams | Standard chat completions with prompts | Non-streaming requests
HeliconeChatCreateParamsStreaming | Streaming chat completions with prompts | Streaming requests
Both types extend the OpenAI SDK’s chat completion parameters and add:
  • prompt_id - Your saved prompt identifier
  • environment - Optional environment to target (e.g., “production”, “staging”)
  • version_id - Optional specific version (defaults to production version)
  • inputs - Variable values
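
For example, a non-streaming request might be typed like this (a sketch assuming HeliconeChatCreateParams is exported from @helicone/helpers, used with the gateway client from the Quickstart):

import { HeliconeChatCreateParams } from "@helicone/helpers";

const params: HeliconeChatCreateParams = {
  model: "openai/gpt-4o-mini",
  prompt_id: "abc123",
  environment: "staging", // optional; omit to use the production version
  inputs: { customer_name: "John Doe" },
  messages: [{ role: "user", content: "Where is my order?" }]
};

const response = await openai.chat.completions.create(params);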
For direct SDK integration:
import { HeliconePromptManager } from '@helicone/helpers';

const promptManager = new HeliconePromptManager({
  apiKey: "your-helicone-api-key"
});

Methods

Both SDKs provide the HeliconePromptManager with these main methods:
Method | Description | Returns
pullPromptVersion() | Determine which prompt version to use | Prompt version object
pullPromptBody() | Fetch raw prompt from storage | Raw prompt body
pullPromptBodyByVersionId() | Fetch prompt by specific version ID | Raw prompt body
mergePromptBody() | Merge prompt with inputs and validation | Compilation result
getPromptBody() | Complete compile process with inputs | Compiled body + validation errors

Usage Examples

import OpenAI from 'openai';
import { HeliconePromptManager } from '@helicone/helpers';

const openai = new OpenAI({
  apiKey: "your-openai-api-key",
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": "Bearer your-helicone-api-key",
  },
});

const promptManager = new HeliconePromptManager({
  apiKey: "your-helicone-api-key"
});

async function generateWithPrompt() {
  // Get compiled prompt with variable substitution
  const { body, errors } = await promptManager.getPromptBody({
    prompt_id: "abc123",
    model: "gpt-4o-mini",
    inputs: {
      customer_name: "Alice Johnson",
      product: "AI Gateway"
    }
  });

  // Check for validation errors
  if (errors.length > 0) {
    console.warn("Validation errors:", errors);
  }

  // Use compiled prompt with OpenAI SDK
  const response = await openai.chat.completions.create(body);
  console.log(response.choices[0].message.content);
}
The Helicone AI Gateway is the recommended way to interact with prompts, as it offers a fully OpenAI-compatible router, caching, rate limits, and more alongside prompts usage. However, the SDK is a great option for users that need direct interaction with compiled prompt bodies.
Both approaches are fully compatible with all OpenAI SDK features including function calling, response formats, and advanced parameters. The HeliconePromptManager does not provide input traces, but it does surface validation errors for you.