fix(modeling): remove tuple builtin shadow and raise on empty dtype in get_parameter_dtype #13963

Open
adhavan18 wants to merge 2 commits into huggingface:main from adhavan18:fix/get-parameter-dtype-dataparallel

adhavan18 commented Jun 15, 2026

Fixes #13789.

Problem

In get_parameter_dtype, the loop variable is named tuple, which shadows Python's built-in tuple type for the whole function body. Also, if gen is exhausted without a floating-point tensor, the function falls off the end and implicitly returns None instead of raising.
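
The scoping effect can be reproduced outside diffusers. The sketch below uses hypothetical function names and is not the library's code. Because the `for` loop assigns `tuple`, Python treats the name as local to the entire function, so any earlier use of the built-in inside that function raises `UnboundLocalError`, the exception class named in the linked issue.

```python
def shadowing_loop(pairs):
    """Hypothetical example: the loop variable reuses the built-in name."""
    # `tuple` is local to this function because the loop below assigns it,
    # so this line cannot see the built-in and raises UnboundLocalError.
    frozen = tuple(pairs)
    for tuple in frozen:
        pass
    return frozen


def renamed_loop(pairs):
    """Same logic with the loop variable renamed to `t`."""
    frozen = tuple(pairs)  # resolves to the built-in as expected
    last_t = None
    for t in frozen:
        last_t = t
    return last_t


def raises_unbound_local(fn, *args):
    """Return True if calling fn(*args) raises UnboundLocalError."""
    try:
        fn(*args)
    except UnboundLocalError:
        return True
    return False
```

Whether this is the exact failure path in get_parameter_dtype depends on where the function body references the built-in, which the description above does not state.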

Fix

  • Renamed loop variable from tuple / last_tuple to t / last_t
  • Added explicit raise ValueError when no floating-point or complex dtype is found

Test

Existing tests pass.

adhavan added 2 commits June 15, 2026 12:44
…-shift (huggingface#13243)

FlowMatchEulerDiscreteScheduler.__init__ computed sigma_min and sigma_max
from the already-shifted sigmas.  When set_timesteps regenerated the sigma
grid from those bounds via _sigma_to_t -> linspace -> /num_train_timesteps,
it recovered the shifted values and then applied the shift formula a second
time, producing a doubly-shifted (and therefore incorrect) schedule.

Fix: record sigma_min and sigma_max from the raw linear sigmas
(timesteps / num_train_timesteps) before the shift formula is applied, so
set_timesteps starts from the correct unshifted bounds and the shift is
applied exactly once.

Regression test: test_set_timesteps_no_double_shift verifies that
set_timesteps(num_inference_steps=1000) reproduces the same sigma grid
that __init__ stored, for a scheduler with shift=3.0.
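
The double shift described above can be illustrated with a small pure-Python sketch. It is not the scheduler's code: the helper names are hypothetical, and the shift formula `shift * s / (1 + (shift - 1) * s)` is an assumption about what "the shift formula" refers to. The sketch regenerates the sigma grid once from shifted bounds and once from raw bounds, then compares both against the grid stored at construction time.

```python
NUM_TRAIN_TIMESTEPS = 1000
SHIFT = 3.0


def apply_shift(sigma, shift=SHIFT):
    """Assumed shift formula: shift * s / (1 + (shift - 1) * s)."""
    return shift * sigma / (1 + (shift - 1) * sigma)


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def regenerate(sigma_min, sigma_max, num_steps):
    """Mimic the described path: sigma -> t, linspace, / N, then shift."""
    timesteps = linspace(
        sigma_max * NUM_TRAIN_TIMESTEPS,
        sigma_min * NUM_TRAIN_TIMESTEPS,
        num_steps,
    )
    return [apply_shift(t / NUM_TRAIN_TIMESTEPS) for t in timesteps]


# What the constructor stores: raw linear sigmas, shifted exactly once.
raw = [
    t / NUM_TRAIN_TIMESTEPS
    for t in linspace(NUM_TRAIN_TIMESTEPS, 1, NUM_TRAIN_TIMESTEPS)
]
stored = [apply_shift(s) for s in raw]

# Before the fix: bounds come from the already-shifted sigmas.
buggy = regenerate(min(stored), max(stored), NUM_TRAIN_TIMESTEPS)
# After the fix: bounds come from the raw, unshifted sigmas.
fixed = regenerate(min(raw), max(raw), NUM_TRAIN_TIMESTEPS)

max_err_buggy = max(abs(a - b) for a, b in zip(buggy, stored))
max_err_fixed = max(abs(a - b) for a, b in zip(fixed, stored))
```

In this sketch `max_err_fixed` stays at floating-point noise, while `max_err_buggy` is clearly nonzero because the smallest sigma has been shifted twice.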
…n get_parameter_dtype

`get_parameter_dtype` had two problems in its DataParallel fallback path:

1. The loop variable was named `tuple`, silently shadowing Python's
   built-in.  Under some linters / runtime inspectors this causes
   unexpected behaviour and masks the real type annotation
   `list[tuple[str, Tensor]]` on `find_tensor_attributes`.

2. When all three search paths (layerwise hooks, named_parameters/buffers,
   and __dict__ tensor inspection) are exhausted without finding any
   tensors, the function falls off the end and returns `None` implicitly.
   Callers like `UNet2DModel.forward` then pass `dtype=None` to
   `tensor.to()`, which raises a cryptic `TypeError` or `UnboundLocalError`
   that is hard to trace back to `get_parameter_dtype`.

Fixes:
- Rename the loop variable `tuple` → `t` (and `last_tuple` → `last_t`)
  to un-shadow the built-in.
- Add an explicit `raise ValueError` with an actionable message when no
  dtype is found, instead of returning `None`.  The message hints at the
  most common cause (model not moved to device before wrapping with
  `nn.DataParallel`).
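
A minimal sketch of the fallback path with both fixes applied, under stated assumptions: `FakeTensor` and `fallback_dtype` are hypothetical stand-ins so the example runs without PyTorch, the "last dtype seen" fallback is inferred from the `last_t` variable name, and the error text is a placeholder rather than the message added in this commit.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only what the sketch needs."""
    dtype: str

    def is_floating_point(self) -> bool:
        return self.dtype.startswith(("float", "bfloat"))


def fallback_dtype(named_tensors):
    """Hypothetical fallback loop over (name, tensor) pairs."""
    last_t = None
    for t in named_tensors:  # `t` instead of `tuple`
        last_t = t
        if t[1].is_floating_point():
            return t[1].dtype
    if last_t is not None:
        # Assumed behaviour: no floating-point tensor, use the last dtype.
        return last_t[1].dtype
    # Nothing was found at all: fail loudly instead of returning None.
    raise ValueError(
        "Could not infer a dtype: no parameters, buffers, or tensor "
        "attributes were found on the module."
    )
```

With an empty iterable the function now raises at the point of failure, so callers never receive `None`.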

Closes huggingface#13789

Development

Successfully merging this pull request may close these issues.

UNet2DModel dtype property fails under nn.DataParallel with UnboundLocalError in get_parameter_dtype
