Run experiments with historical datasets to test, evaluate, and improve prompts over time while preventing regressions in production systems.
Click `Start Experiment`
Click the Start Experiment button.

Select the base prompt
Select a base prompt and click Continue. You can edit the prompt in the next step.
The prompt currently in production is marked with the production tag.

Edit the prompt

Configure your experiment
Click Generate random dataset to build a dataset. We will pick up to 10 random entries from your existing requests.

Confirm and run
The Diff Viewer compares your new prompt to the base prompt that you selected.

View outputs
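
To reason about two of the steps above outside the UI, here is a minimal Python sketch. It is not the product's implementation. It assumes requests are available as a plain list and prompts as plain strings. The function names `sample_dataset` and `diff_prompts` are hypothetical. It uses only the standard library to mimic picking up to 10 random requests and comparing a new prompt with its base prompt.

```python
import difflib
import random


def sample_dataset(requests, max_size=10, seed=None):
    """Pick up to max_size requests at random, without replacement.

    Hypothetical helper: mirrors the "up to 10 random entries" rule,
    returning every request when fewer than max_size exist.
    """
    rng = random.Random(seed)
    k = min(max_size, len(requests))
    return rng.sample(requests, k)


def diff_prompts(base_prompt, new_prompt):
    """Return a unified, line-by-line diff of two prompt strings.

    Hypothetical helper: an empty string means the prompts are identical.
    """
    diff = difflib.unified_diff(
        base_prompt.splitlines(),
        new_prompt.splitlines(),
        fromfile="base",
        tofile="new",
        lineterm="",
    )
    return "\n".join(diff)


if __name__ == "__main__":
    # Stand-in data; real requests and prompts would come from your own logs.
    logged_requests = [f"request-{i}" for i in range(25)]
    print(sample_dataset(logged_requests, seed=0))
    print(
        diff_prompts(
            "You are a helpful assistant.\nAnswer briefly.",
            "You are a helpful assistant.\nAnswer in one sentence.",
        )
    )
```

Passing a fixed `seed` keeps the sampled dataset reproducible across runs, which makes it easier to compare outputs from different prompt versions on the same inputs.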