Skip to content

refactor: set do_sample=True if a seed is set on HF backends#1149

Open
psschwei wants to merge 4 commits into
generative-computing:mainfrom
psschwei:sample-seed
Open

refactor: set do_sample=True if a seed is set on HF backends#1149
psschwei wants to merge 4 commits into
generative-computing:mainfrom
psschwei:sample-seed

Conversation

@psschwei
Copy link
Copy Markdown
Member

Pull Request

Issue

Fixes #40

Description

HuggingFace backends silently ignored seed (and warned but ignored temperature) because transformers defaults to greedy decoding (do_sample=False). This change forces do_sample=True in _make_backend_specific_and_remove whenever a SEED or non-zero TEMPERATURE is set and the caller hasn't explicitly chosen do_sample, fixing all four HF generate paths in one place.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

  • Component
  • Requirement
  • Sampling Strategy
  • Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
@psschwei psschwei requested a review from a team as a code owner May 22, 2026 22:40
@psschwei psschwei requested review from markstur and planetf1 May 22, 2026 22:41
@github-actions github-actions Bot added the enhancement New feature or request label May 22, 2026
psschwei added 2 commits May 22, 2026 18:48
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Copy link
Copy Markdown
Contributor

@planetf1 planetf1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor note on a pre-existing issue this PR slightly expands: _make_backend_specific_and_remove is also called at two apply_chat_template sites (lines 805 and 1087 in huggingface.py), and its output is splatted directly into the tokeniser call. That means temperature was already quietly reaching the Jinja template context before this change — do_sample=True now joins that set.

It does not cause any visible problems today (the templates ignore unknown kwargs), but it is not the right shape — generate-only options should not be in the template namespace. There is already a _filter_chat_template_only_options that strips template-only keys before generate(); a mirror function for the other direction would clean this up. Filed as #1154; nothing needed here.

Comment thread mellea/backends/huggingface.py
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

For huggingface backends, set do_sample=True if a seed is set

2 participants