When building production AI applications, you often need better performance on specific tasks than general-purpose models provide out of the box. Datasets & Fine-Tuning help you curate high-quality training data from your real production traffic and fine-tune models for better accuracy, consistency, and domain-specific performance.

Why use Datasets & Fine-Tuning

  • Production-ready datasets: Transform your actual LLM requests into high-quality training data with scoring and filtering
  • Seamless fine-tuning integration: Export to JSONL or connect directly to fine-tuning platforms like OpenPipe
  • Iterative improvement: Use real performance data to continuously refine your datasets and models

Dataset curation interface showing request filtering, scoring, and dataset management for fine-tuning preparation

Quick Start

1. Score Your Requests

Review your existing LLM requests in the Helicone dashboard and assign quality scores based on accuracy and relevance. You can score manually or use automated scoring to identify your best examples.
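
If you prefer to score programmatically, the sketch below posts a score to a single request. The endpoint path and payload shape are assumptions based on Helicone's request-scoring API; verify both against the current API reference before use.

```python
import requests

HELICONE_API_KEY = "your-helicone-api-key"
request_id = "the-helicone-request-id-to-score"

# Assumed endpoint and payload shape -- confirm against Helicone's API reference.
resp = requests.post(
    f"https://api.helicone.ai/v1/request/{request_id}/score",
    headers={"Authorization": f"Bearer {HELICONE_API_KEY}"},
    json={"scores": {"accuracy": 95, "relevance": 88}},  # integer scores per attribute
)
resp.raise_for_status()
```
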
2. Filter Requests

Use Helicone’s filtering system to find high-quality requests based on scores, dates, models, or custom properties.
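
The same kind of filter can be expressed in code against exported request records. The field names below (score, model, created_at) are illustrative, not Helicone's actual schema:

```python
from datetime import datetime, timezone

# Illustrative request records, e.g. parsed from an export.
records = [
    {"id": "req_1", "model": "gpt-4o", "score": 92, "created_at": "2024-06-01T12:00:00Z"},
    {"id": "req_2", "model": "gpt-4o-mini", "score": 55, "created_at": "2024-05-20T09:30:00Z"},
]

cutoff = datetime(2024, 5, 25, tzinfo=timezone.utc)

high_quality = [
    r for r in records
    if r["score"] >= 80
    and r["model"] == "gpt-4o"
    # "Z" suffix replaced for datetime.fromisoformat compatibility on older Pythons
    and datetime.fromisoformat(r["created_at"].replace("Z", "+00:00")) >= cutoff
]
print([r["id"] for r in high_quality])  # -> ['req_1']
```
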
3. Select for Dataset

Choose the filtered requests you want to include and add them to a new or existing dataset.
4. Curate Dataset

Review, organize, and refine your dataset by removing poor examples, balancing categories, and ensuring consistency.
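
One way to express this curation pass in code: drop low-scoring examples, then cap each category so no single one dominates. The score and category fields are illustrative:

```python
import random
from collections import defaultdict

def curate(examples, min_score=80, per_category=100, seed=42):
    """Drop low-scoring examples, then cap each category for balance."""
    rng = random.Random(seed)
    kept = [e for e in examples if e["score"] >= min_score]
    by_category = defaultdict(list)
    for e in kept:
        by_category[e["category"]].append(e)
    balanced = []
    for items in by_category.values():
        rng.shuffle(items)  # sample per category rather than taking the first N
        balanced.extend(items[:per_category])
    return balanced
```
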
5. Export

Export your curated dataset in JSONL format for use with fine-tuning platforms like OpenAI, Anthropic, or OpenPipe.
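
For the JSONL formats, each line is a standalone JSON object. The sketch below writes the OpenAI chat fine-tuning shape, where every example is a "messages" array of system/user/assistant turns; the example content is illustrative:

```python
import json

dataset = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful support agent."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Security and choose Reset password."},
        ]
    },
]

# One JSON object per line -- the structure fine-tuning APIs expect from JSONL.
with open("dataset.jsonl", "w", encoding="utf-8") as f:
    for example in dataset:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```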

Export Formats

Helicone supports multiple export formats for different fine-tuning platforms:
  • OpenAI JSONL: Compatible with OpenAI’s fine-tuning API
  • Anthropic Format: For Claude model fine-tuning
  • Generic JSONL: Works with most platforms
  • CSV: For data analysis and custom workflows
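
Whichever format you target, it is worth sanity-checking the exported file before uploading. A minimal validator for the OpenAI JSONL variant (key names follow OpenAI's documented chat fine-tuning format; adapt the checks for other targets):

```python
import json

def validate_openai_jsonl(path):
    """Check that every line parses and matches the expected chat-format keys."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            record = json.loads(line)  # raises ValueError on malformed JSON
            messages = record.get("messages")
            assert isinstance(messages, list) and messages, f"line {lineno}: missing 'messages'"
            for m in messages:
                assert m.get("role") in {"system", "user", "assistant"}, f"line {lineno}: bad role"
                assert isinstance(m.get("content"), str), f"line {lineno}: content must be a string"

validate_openai_jsonl("dataset.jsonl")
```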