Helicone’s AI Gateway provides a unified interface for reasoning across providers. Use the same parameters regardless of provider; the Gateway handles the translation automatically.
Gemini model support for reasoning is coming soon.

Quick Start

  • Chat Completions
  • Responses API
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.HELICONE_API_KEY,
  baseURL: "https://ai-gateway.helicone.ai/v1",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [
    { role: "user", content: "What is the sum of the first 100 prime numbers?" }
  ],
  reasoning_effort: "medium",
  max_completion_tokens: 16000
});
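Once the response comes back, the reasoning text can be separated from the final answer. A minimal sketch; the `reasoning` field on the message is an assumption here (it mirrors the streaming delta field), not a guaranteed shape for every provider:

```typescript
// Hypothetical helper: pull reasoning and answer text out of a
// Chat Completions message. The `reasoning` field is an assumption,
// mirroring the streaming delta shape; treat it as illustrative.
interface ChatMessage {
  content: string | null;
  reasoning?: string;
}

function splitReasoning(message: ChatMessage): { reasoning: string; answer: string } {
  return {
    reasoning: message.reasoning ?? "",
    answer: message.content ?? "",
  };
}

// Example with a mock message in the assumed shape:
const mock: ChatMessage = {
  reasoning: "First, list the primes up to the 100th...",
  content: "The sum of the first 100 primes is 24133.",
};
const { reasoning, answer } = splitReasoning(mock);
```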

Configuration

  • Chat Completions
  • Responses API
{
  reasoning_effort: "low" | "medium" | "high",
  reasoning_options: {
    budget_tokens: 8000  // Optional, for Anthropic
  }
}

reasoning_effort

Level     Description
low       Light reasoning for simple tasks
medium    Balanced reasoning
high      Deep reasoning for complex problems
For Anthropic models, the default is 4096 max completion tokens with a reasoning budget of 2048 tokens.
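These Anthropic defaults and the `budget_tokens` override can be sketched as a small resolution helper. The defaults (4096 max completion, 2048 reasoning budget) are from the note above; the validation rule follows Anthropic's extended-thinking constraint that the budget must stay below the completion cap:

```typescript
// Sketch: resolve the effective Anthropic token limits, applying the
// documented defaults (4096 max completion tokens, 2048 reasoning
// budget) when nothing is passed. Anthropic requires the thinking
// budget to be smaller than the completion cap.
function resolveBudgets(opts: { max_completion_tokens?: number; budget_tokens?: number } = {}) {
  const maxCompletion = opts.max_completion_tokens ?? 4096;
  const budget = opts.budget_tokens ?? 2048;
  if (budget >= maxCompletion) {
    throw new Error("budget_tokens must be smaller than max_completion_tokens");
  }
  return { maxCompletion, budget };
}
```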

Handling Responses

Chat Completions

  • Streaming
  • Non-Streaming
When streaming, reasoning content arrives first in chunks via the delta's reasoning field; content chunks follow, and the final chunk carries reasoning_details along with the finish reason:
// Reasoning chunks arrive first
{
  "choices": [{
    "delta": { "reasoning": "Let me think about this..." }
  }]
}

// Then content chunks
{
  "choices": [{
    "delta": { "content": "The answer is 42." }
  }]
}

// Final chunk includes reasoning_details with signature
{
  "choices": [{
    "delta": {
      "reasoning_details": [{
        "thinking": "The user is asking for...",
        "signature": "EpICCkYIChgCKkCfWt1pnGxEcz48yQJvie3ppkXZ8ryd..."
      }]
    },
    "finish_reason": "stop"
  }]
}
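The chunk sequence above can be folded into separate reasoning and answer strings with a small accumulator. A sketch, assuming chunks arrive in exactly the shapes shown:

```typescript
// Shapes trimmed to the fields used in the examples above.
interface Delta {
  reasoning?: string;
  content?: string;
  reasoning_details?: { thinking: string; signature: string }[];
}
interface Chunk {
  choices: { delta: Delta; finish_reason?: string }[];
}

// Fold streamed chunks into reasoning text, answer text, and the
// final reasoning_details (which carry the signatures).
function foldChunks(chunks: Chunk[]) {
  let reasoning = "";
  let content = "";
  let details: { thinking: string; signature: string }[] = [];
  for (const chunk of chunks) {
    const choice = chunk.choices[0];
    if (!choice) continue;
    reasoning += choice.delta.reasoning ?? "";
    content += choice.delta.content ?? "";
    if (choice.delta.reasoning_details) details = choice.delta.reasoning_details;
  }
  return { reasoning, content, details };
}
```

In a real stream the same fold runs inside a `for await` loop over the SDK's chunk iterator.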

Responses API

  • Streaming
  • Non-Streaming (OpenAI)
  • Non-Streaming (Anthropic)
Streaming events follow the Responses API format:
// Reasoning summary text delta
{
  "type": "response.reasoning_summary_text.delta",
  "item_id": "rs_0ab50bce3156357b...",
  "output_index": 0,
  "summary_index": 0,
  "delta": "Let me think about this..."
}

// Reasoning item complete
{
  "type": "response.output_item.done",
  "output_index": 0,
  "item": {
    "id": "rs_0ab50bce3156357b...",
    "type": "reasoning",
    "summary": [{
      "type": "summary_text",
      "text": "**Crafting the response**\n\nThe user wants..."
    }]
  }
}
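These events can be handled by dispatching on type: accumulate summary deltas per item, then capture the completed reasoning item. A sketch using only the event names and fields shown above (other event fields are omitted):

```typescript
// Event shapes trimmed to the fields used in the examples above.
type StreamEvent =
  | { type: "response.reasoning_summary_text.delta"; item_id: string; delta: string }
  | {
      type: "response.output_item.done";
      item: { id: string; type: string; summary?: { type: string; text: string }[] };
    };

// Accumulate reasoning summary text per item id, and collect each
// completed reasoning item.
function collectReasoning(events: StreamEvent[]) {
  const summaries = new Map<string, string>();
  const doneItems: { id: string; type: string; summary?: { type: string; text: string }[] }[] = [];
  for (const ev of events) {
    if (ev.type === "response.reasoning_summary_text.delta") {
      summaries.set(ev.item_id, (summaries.get(ev.item_id) ?? "") + ev.delta);
    } else if (ev.type === "response.output_item.done" && ev.item.type === "reasoning") {
      doneItems.push(ev.item);
    }
  }
  return { summaries, doneItems };
}
```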
Anthropic models always return encrypted_content (signatures) in reasoning items. These signatures validate the reasoning chain and are required for multi-turn conversations. Other providers like OpenAI can optionally return signatures when configured.
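Because Anthropic needs those signatures to validate the reasoning chain, multi-turn conversations must carry the assistant's reasoning details forward. A sketch of building the next turn's message list; echoing reasoning_details on the assistant message is an assumption about the accepted request shape:

```typescript
// Trimmed detail shape; Anthropic items carry signatures or
// encrypted_content per the note above.
type ReasoningDetail = { thinking?: string; signature?: string; encrypted_content?: string };

// Hypothetical helper: append the assistant turn (with its reasoning
// details echoed back) and the user's follow-up to the history.
function withReasoningEchoed(
  history: { role: string; content: string }[],
  assistantText: string,
  details: ReasoningDetail[],
  userFollowUp: string
) {
  return [
    ...history,
    { role: "assistant", content: assistantText, reasoning_details: details },
    { role: "user", content: userFollowUp },
  ];
}
```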