Understanding Sampling Parameters in Language Models: A Guide to Fine-Tuning AI Outputs

In the world of AI-powered text generation, the quality, creativity, and predictability of responses depend significantly on a set of adjustable parameters. These parameters — such as temperature, top-p, and penalties — serve as the dials that fine-tune the behavior of a language model, enabling it to produce anything from conservative, predictable text to wildly imaginative prose.

Whether you’re crafting content, designing conversational agents, or performing data-driven experiments, understanding these parameters can empower you to achieve specific outcomes with precision. This guide delves into the key parameters available in OpenRouter-supported models, explaining their purpose and showcasing practical examples to help you harness the full potential of AI text generation.
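
All of the examples below can be reproduced through OpenRouter's OpenAI-compatible HTTP endpoint. Here is a minimal sketch of a chat-completion request showing where these parameters live; the model ID is a placeholder, and the other parameters covered below (top_k, repetition_penalty, min_p, top_a, and so on) go in the same JSON body:

```python
import os
import requests

# Minimal sketch of an OpenRouter chat-completion request.
# The model ID is a placeholder; any OpenRouter model ID can be used.
response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "The garden was quiet and peaceful."}],
        "temperature": 0.7,  # randomness of token selection (section 1)
        "top_p": 0.9,        # nucleus sampling cutoff (section 2)
        "max_tokens": 100,   # hard cap on response length (section 10)
    },
)
print(response.json()["choices"][0]["message"]["content"])
```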

1. Temperature

  • High Temperature (e.g., 1.5)
  • Prompt: “The garden was quiet and peaceful.”
  • Output: “The garden buzzed with strange whispers of ancient trees, weaving tales under a purple sky.”
  • Explanation: The response is imaginative and unexpected.
  • Low Temperature (e.g., 0.2)
  • Prompt: “The garden was quiet and peaceful.”
  • Output: “The garden was calm and serene, with flowers blooming.”
  • Explanation: The output is safe, predictable, and factual.
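
What temperature actually does is divide the model's logits before the softmax: values above 1 flatten the probability distribution (more surprising picks), while values below 1 sharpen it toward the single most likely token. A toy sketch with made-up logits, not real model outputs:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities, rescaled by temperature."""
    scaled = [x / temperature for x in logits]
    exps = [math.exp(x) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy logits for three candidate tokens
print(softmax_with_temperature(logits, 1.5))  # flatter: creative, less predictable
print(softmax_with_temperature(logits, 0.2))  # peaked: near-deterministic
```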

2. Top P (Nucleus Sampling)

  • High Top P (e.g., 0.9)
  • Prompt: “She opened the box and saw…”
  • Output: “a shimmering light that seemed to dance in the air.”
  • Explanation: A broader range of tokens is considered, allowing for more diverse completions.
  • Low Top P (e.g., 0.3)
  • Prompt: “She opened the box and saw…”
  • Output: “a pair of glasses neatly placed inside.”
  • Explanation: Limits choices to highly probable tokens, leading to predictable responses.
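
Mechanically, nucleus sampling sorts tokens by probability and keeps the smallest set whose cumulative probability reaches top_p; everything else is discarded before sampling. A toy sketch:

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of token indices whose cumulative probability >= top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append(index)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.1, 0.07, 0.03]
print(top_p_filter(probs, 0.9))  # [0, 1, 2]: broader candidate pool
print(top_p_filter(probs, 0.3))  # [0]: only the top token survives
```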

3. Top K

  • Top K = 1 (most likely token only)
  • Prompt: “The sky was filled with…”
  • Output: “clouds.”
  • Explanation: The model picks the most probable next word.
  • Top K = 50
  • Prompt: “The sky was filled with…”
  • Output: “vivid colors as the sun set behind the mountains.”
  • Explanation: The model considers a wider range of tokens.
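
Top K is the blunter instrument: only the k highest-probability tokens stay in the running, regardless of how the probability mass is spread among them. A toy sketch:

```python
def top_k_filter(probs, k):
    """Keep only the k highest-probability token indices."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]

probs = [0.05, 0.4, 0.25, 0.2, 0.1]
print(top_k_filter(probs, 1))   # [1]: greedy, always the single best token
print(top_k_filter(probs, 3))   # [1, 2, 3]: a wider range of candidates
```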

4. Frequency Penalty

  • High Frequency Penalty (e.g., 1.5)
  • Prompt: “The river was flowing gently, and the river sparkled in the sunlight.”
  • Output: “The river flowed gently, shimmering in the sunlight.”
  • Explanation: Repeated tokens (“river”) are avoided.
  • Negative Frequency Penalty (e.g., -1.0)
  • Prompt: “The river was flowing gently, and the river sparkled in the sunlight.”
  • Output: “The river was flowing gently, and the river sparkled in the river’s sunlight.”
  • Explanation: The model favors repetition.
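
In the OpenAI-style formulation these examples assume, the frequency penalty subtracts penalty × count from each token's logit, where count is how many times that token has already been generated, so a negative penalty actively rewards repetition. A toy sketch:

```python
def apply_frequency_penalty(logits, counts, penalty):
    """Subtract penalty * (times the token already appeared) from each logit."""
    return [logit - counts[i] * penalty for i, logit in enumerate(logits)]

logits = [3.0, 2.5, 2.0]  # toy candidates: 'river', 'stream', 'water'
counts = [2, 0, 0]        # 'river' has already appeared twice
print(apply_frequency_penalty(logits, counts, 1.5))   # [0.0, 2.5, 2.0]: 'river' demoted
print(apply_frequency_penalty(logits, counts, -1.0))  # [5.0, 2.5, 2.0]: 'river' boosted
```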

5. Presence Penalty

  • High Presence Penalty (e.g., 1.5)
  • Prompt: “The artist painted a…”
  • Output: “magnificent scene filled with vibrant details and unique strokes.”
  • Explanation: The model avoids repeating “artist” or related tokens.
  • Negative Presence Penalty (e.g., -1.0)
  • Prompt: “The artist painted a…”
  • Output: “artistically stunning artwork that showed the artist’s talent.”
  • Explanation: Encourages reuse of words like “artist.”
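
Presence penalty uses the same mechanism but fires only once: a flat amount is subtracted from any token that has appeared at all, no matter how often. A toy sketch:

```python
def apply_presence_penalty(logits, counts, penalty):
    """Subtract a flat penalty from any token that has appeared at least once."""
    return [logit - penalty * (counts[i] > 0) for i, logit in enumerate(logits)]

logits = [3.0, 2.5]  # toy candidates: 'artist', 'scene'
counts = [1, 0]      # 'artist' already appeared; counts beyond 1 are irrelevant
print(apply_presence_penalty(logits, counts, 1.5))   # [1.5, 2.5]: 'artist' demoted
print(apply_presence_penalty(logits, counts, -1.0))  # [4.0, 2.5]: 'artist' boosted
```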

6. Repetition Penalty

  • High Repetition Penalty (e.g., 1.8)
  • Prompt: “The forest was dark and mysterious. The forest was…”
  • Output: “shrouded in secrets, with shadows that seemed alive.”
  • Explanation: Reduces the likelihood of repeating “forest.”
  • Low Repetition Penalty (e.g., 0.5)
  • Prompt: “The forest was dark and mysterious. The forest was…”
  • Output: “dark and mysterious. The forest was dark and mysterious.”
  • Explanation: Repetition is more likely.
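
Repetition penalty (the multiplicative formulation popularized by the CTRL paper and common in open-source runtimes) divides positive logits of already-seen tokens by the penalty and multiplies negative ones, so values above 1 discourage repeats while values below 1 invite them. A toy sketch:

```python
def apply_repetition_penalty(logits, seen, penalty):
    """Divide positive logits (or multiply negative ones) of seen tokens by penalty."""
    out = []
    for i, logit in enumerate(logits):
        if i in seen:
            logit = logit / penalty if logit > 0 else logit * penalty
        out.append(logit)
    return out

logits = [3.0, 1.0]  # toy candidates: 'forest', 'shadows'
seen = {0}           # 'forest' was already generated
print(apply_repetition_penalty(logits, seen, 1.8))  # [~1.67, 1.0]: repeat discouraged
print(apply_repetition_penalty(logits, seen, 0.5))  # [6.0, 1.0]: repeat encouraged
```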

7. Min P

  • Min P = 0.1
  • Prompt: “The experiment was a success, and it proved…”
  • Output: “that complex systems can be simplified under specific conditions.”
  • Explanation: Filters out unlikely tokens, ensuring quality responses.
  • Min P = 0.0
  • Prompt: “The experiment was a success, and it proved…”
  • Output: “various conclusions depending on the angle of interpretation.”
  • Explanation: Allows more token flexibility, potentially reducing coherence.
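
Min P keeps any token whose probability is at least min_p times the top token's probability, so the cutoff adapts to how confident the model is at each step. A toy sketch:

```python
def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p * (top probability)."""
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]

probs = [0.6, 0.25, 0.1, 0.04, 0.01]
print(min_p_filter(probs, 0.1))  # [0, 1, 2]: the unlikely tail is filtered out
print(min_p_filter(probs, 0.0))  # [0, 1, 2, 3, 4]: no filtering at all
```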

8. Top A

  • High Top A (e.g., 0.9)
  • Prompt: “The robot said…”
  • Output: “hello in a monotone voice.”
  • Explanation: A high value sets a strict probability threshold, keeping only tokens close to the most likely one.
  • Low Top A (e.g., 0.2)
  • Prompt: “The robot said…”
  • Output: “a strange phrase about the stars and time.”
  • Explanation: A low value admits a wider selection of tokens, leading to varied outputs.
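
Top A works like Min P but with a quadratic threshold: a token survives only if its probability is at least top_a × (top probability)². The sketch below shows why the higher value is the stricter filter:

```python
def top_a_filter(probs, top_a):
    """Keep tokens whose probability is at least top_a * (top probability) ** 2."""
    threshold = top_a * max(probs) ** 2
    return [i for i, p in enumerate(probs) if p >= threshold]

probs = [0.6, 0.25, 0.1, 0.05]
print(top_a_filter(probs, 0.2))  # [0, 1, 2]: permissive threshold, varied picks
print(top_a_filter(probs, 0.9))  # [0]: strict threshold, near-greedy output
```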

9. Seed

  • With Seed
  • Prompt: “The sky turned dark as…”
  • Output: “a storm approached the distant hills.”
  • Explanation: Running the same input and seed will consistently return this result.
  • Without Seed
  • Prompt: “The sky turned dark as…”
  • Output: Varies with each run, e.g., “the sun dipped below the horizon,” or “clouds gathered ominously.”
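
Seeding does not change the probability distribution; it only fixes the random draws made from it, which is why the same prompt, parameters, and seed reproduce the same completion (on models that support seeding). A toy sampler illustrating the idea:

```python
import random

def sample(probs, seed=None):
    """Toy sampler: the same seed over the same distribution yields the same pick."""
    rng = random.Random(seed)  # seed=None draws fresh entropy, so results vary
    return rng.choices(range(len(probs)), weights=probs)[0]

probs = [0.5, 0.3, 0.2]
print(sample(probs, seed=42), sample(probs, seed=42))  # identical picks
print(sample(probs), sample(probs))                    # may differ run to run
```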

10. Max Tokens

  • Max Tokens = 10
  • Prompt: “The hero entered the cave and…”
  • Output: “found a treasure chest waiting.”
  • Explanation: Generation halts as soon as the 10-token budget is spent, so the output stays short and may be cut off mid-thought.
  • Max Tokens = 100
  • Prompt: “The hero entered the cave and…”
  • Output: “found a treasure chest waiting. It was covered in ancient runes, glowing faintly in the darkness. As the hero approached, a whisper echoed through the chamber, hinting at the chest’s mysterious contents.”
  • Explanation: Longer and more detailed response.
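
It is worth stressing that max_tokens is a hard budget, not a style instruction: the decoding loop simply halts when the cap is hit, even mid-sentence. A toy generation loop:

```python
def generate(next_token, max_tokens):
    """Toy decoding loop: stop at end-of-sequence or when the token cap is hit."""
    tokens = []
    while len(tokens) < max_tokens:
        token = next_token()
        if token == "<eos>":
            break
        tokens.append(token)
    return " ".join(tokens)

words = iter("found a treasure chest waiting covered in ancient runes glowing faintly".split())
print(generate(lambda: next(words, "<eos>"), max_tokens=10))  # cut off after 10 tokens
```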

11. Logit Bias

  • Positive Bias
  • Prompt: “The bird…”
  • Bias: +100 for the token “flew.”
  • Output: “flew across the sky.”
  • Explanation: Forces the model to include “flew.”
  • Negative Bias
  • Prompt: “The bird…”
  • Bias: -100 for the token “flew.”
  • Output: “sang beautifully from the branch.”
  • Explanation: Prevents the model from using “flew.”
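
In the actual API, logit_bias maps tokenizer token IDs (not words) to values roughly in the -100 to 100 range, and those values are added to the raw logits before sampling; at ±100 a token is effectively forced or banned. The sketch below uses toy indices in place of real token IDs:

```python
def apply_logit_bias(logits, bias):
    """Add per-token bias values to the raw logits before sampling."""
    return [logit + bias.get(i, 0.0) for i, logit in enumerate(logits)]

logits = [2.0, 1.5, 1.0]  # toy candidates: 'flew', 'sang', 'sat'
print(apply_logit_bias(logits, {0: 100.0}))   # 'flew' effectively guaranteed
print(apply_logit_bias(logits, {0: -100.0}))  # 'flew' effectively banned
```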

By manipulating these parameters, you can tailor the model’s behavior to suit specific needs — whether it’s encouraging creativity, maintaining consistency, or ensuring brevity.
