Use constrained outputs - Helicone OSS LLM Observability

What are constrained outputs

Constrained outputs involve instructing the LLM to generate responses that adhere to specific limitations or formats. This could mean setting a word limit, specifying a response type (like “yes” or “no”), or requiring the output to match a particular pattern or structure.

How to implement constrained outputs

Set clear instructions: Be explicit about the constraints you want the model to follow.
Specify the format: Define the exact format or pattern you expect.
Limit the length: Set boundaries on the response length, such as word or character counts.
Use controlled vocabularies: Restrict the model to a fixed set of allowed words or phrases.
Provide templates: Offer a template that the model should fill in.
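
The steps above can be combined into a small prompt builder. The following is a minimal sketch, not part of Helicone: the function name `build_constrained_prompt` and its parameters are hypothetical, chosen only to show explicit instructions, a controlled vocabulary, a length limit, and a fill-in template in one place.

```python
from typing import Optional, Sequence


def build_constrained_prompt(
    task: str,
    context: str,
    allowed_values: Optional[Sequence[str]] = None,
    max_words: Optional[int] = None,
    answer_label: str = "Answer",
) -> str:
    """Assemble a prompt that states its constraints explicitly (illustrative helper)."""
    lines = [task.strip()]
    if allowed_values:
        # Controlled vocabulary: name every permitted answer.
        quoted = " or ".join(f"'{value}'" for value in allowed_values)
        lines.append(f"Respond with {quoted} only.")
    if max_words is not None:
        # Length limit: state the boundary as a number.
        lines.append(f"Use at most {max_words} words.")
    # Template: end with a labelled slot for the model to fill in.
    lines += ["", context.strip(), "", f"{answer_label}:"]
    return "\n".join(lines)


prompt = build_constrained_prompt(
    task="Review the following application.",
    context="Application Details: [Applicant's information and criteria]",
    allowed_values=["Approved", "Denied"],
    answer_label="Decision",
)
print(prompt)
```

Ending the prompt with a labelled slot such as `Decision:` nudges the model to answer in place rather than open with a preamble.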

Example

Example 1: Binary Classification.

Limiting the response to ‘Approved’ or ‘Denied’ ensures consistency and simplifies automated processing.

Prompt:

Review the following application and respond with 'Approved' or 'Denied' only.

Application Details: [Applicant's information and criteria]

Decision:
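
Because the prompt allows only two answers, the response can be checked with a few lines of code before it reaches downstream logic. This is a sketch under assumptions: `parse_decision` is a hypothetical helper, and the normalization it applies (trimming whitespace, quotes, and a trailing period, then ignoring case) is one reasonable policy rather than a required one.

```python
ALLOWED_DECISIONS = {"approved": "Approved", "denied": "Denied"}


def parse_decision(raw: str) -> str:
    """Return the canonical decision, or raise if the model broke the constraint."""
    # Tolerate surrounding whitespace, quotes, and a trailing period.
    cleaned = raw.strip().strip("'\".").lower()
    if cleaned not in ALLOWED_DECISIONS:
        raise ValueError(f"Unexpected model output: {raw!r}")
    return ALLOWED_DECISIONS[cleaned]


print(parse_decision(" Approved.\n"))
```

Raising on anything outside the allowed set makes constraint violations visible, so the caller can retry or escalate instead of silently accepting free-form text.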

Example 2: Short Answer Generation.

By specifying that the answer should be in one sentence, you prevent the model from providing overly long or off-topic responses.

Prompt:

Based on the text below, answer the question in one sentence.

Text: 'The Great Barrier Reef is the world's largest coral reef system located in Australia.'

Question: 'Where is the Great Barrier Reef located?'

Answer:
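
A one-sentence constraint can be checked approximately after the fact. The sketch below is a heuristic, not an exact sentence splitter: the hypothetical `is_single_sentence` helper splits on `.`, `!`, or `?` followed by whitespace, so abbreviations such as "e.g." in the middle of an answer would be miscounted.

```python
import re


def is_single_sentence(text: str) -> bool:
    """Heuristic check that a response contains exactly one sentence."""
    stripped = text.strip()
    if not stripped:
        return False
    # Split wherever sentence-ending punctuation is followed by whitespace.
    parts = [part for part in re.split(r"(?<=[.!?])\s+", stripped) if part]
    return len(parts) == 1


print(is_single_sentence("The Great Barrier Reef is located in Australia."))
print(is_single_sentence("It is in Australia. It is the largest reef system."))
```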

Example 3: Technical Writing.

Setting an exact word limit challenges the model to be concise and focus on the most important information.

Prompt:

Summarize the following article in exactly 50 words.

[Insert article text]

Summary (50 words):
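
Models do not always hit an exact word count, so it can help to verify the length of the returned summary. The following sketch assumes words are separated by whitespace. The `check_word_limit` helper and its `tolerance` parameter are hypothetical additions for illustration.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def check_word_limit(summary: str, target: int = 50, tolerance: int = 0) -> bool:
    """Return True if the summary is within `tolerance` words of `target`."""
    return abs(word_count(summary) - target) <= tolerance


summary = " ".join(["word"] * 48)  # stand-in for a model response
print(check_word_limit(summary))               # strict: exactly 50 words
print(check_word_limit(summary, tolerance=2))  # relaxed: 48 to 52 words
```

A small tolerance is a design choice: a strict check enforces the prompt literally, while a relaxed one avoids retrying over a one- or two-word difference.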

Why use constrained outputs

Increase precision: Helps the model provide exactly what you need without unnecessary information.
Enhance consistency: Ensures uniformity across multiple outputs, which is crucial for tasks like data entry or form filling.
Simplify parsing: Makes it easier to programmatically process the responses.
Reduce errors: Minimizes the chance of irrelevant or incorrect information creeping into the output.

Need more help?

Additional questions or feedback? Reach out to help@helicone.ai or schedule a call with us.
