refactor: set do_sample=True if a seed is set on HF backends#1149
refactor: set do_sample=True if a seed is set on HF backends#1149psschwei wants to merge 4 commits into
Conversation
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
planetf1
left a comment
There was a problem hiding this comment.
Minor note on a pre-existing issue this PR slightly expands: _make_backend_specific_and_remove is also called at two apply_chat_template sites (lines 805 and 1087 in huggingface.py), and its output is splatted directly into the tokeniser call. That means temperature was already quietly reaching the Jinja template context before this change — do_sample=True now joins that set.
It does not cause any visible problems today (the templates ignore unknown kwargs), but it is not the right shape — generate-only options should not be in the template namespace. There is already a _filter_chat_template_only_options that strips template-only keys before generate(); a mirror function for the other direction would clean this up. Filed as #1154; nothing needed here.
Pull Request
Issue
Fixes #40
Description
HuggingFace backends silently ignored seed (and warned but ignored temperature) because transformers defaults to greedy decoding (
do_sample=False). This change forcesdo_sample=Truein_make_backend_specific_and_removewhenever a SEED or non-zero TEMPERATURE is set and the caller hasn't explicitly chosendo_sample, fixing all four HF generate paths in one place.Testing
Attribution
Adding a new component, requirement, sampling strategy, or tool?
If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.
NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.