[modular]support klein #13002

Conversation
```python
def inputs(self) -> List[InputParam]:
    return [
        InputParam(name="prompt_embeds", required=True),
        InputParam(name="latent_ids"),
```
removed because latent_ids are not used in this block I think
```python
def inputs(self) -> List[InputParam]:
    return [
        InputParam("prompt"),
        InputParam("prompt_embeds", type_hint=torch.Tensor, required=False),
```
In modular, we can just pop the text_encoder block out and use it in a standalone manner if we want to compute the embeddings separately. So I'm trying to remove these arguments everywhere to simplify the code a bit (same with image_latents too; it gets really complicated for some inpainting pipelines).
```python
blocks = ....
text_node = blocks.pop("text_encoder").init_pipeline(repo_id)
pipe = blocks.init_pipeline(repo_id)
prompt_embeds = text_node(prompt=...)
out = pipe(prompt_embeds=...)
```
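A stdlib-only toy sketch of the pattern in the snippet above: pop the text-encoder block out, run it standalone, then feed its output to the remaining pipeline. `Block`, the lambdas, and the repo id here are made-up stand-ins for illustration, not the actual diffusers API.

```python
from collections import OrderedDict

# Toy stand-in for a pipeline block; NOT the diffusers API,
# just the control flow the snippet above describes.
class Block:
    def __init__(self, fn):
        self.fn = fn

    def init_pipeline(self, repo_id):
        # in diffusers this would load weights/components from repo_id
        return self.fn

blocks = OrderedDict(
    text_encoder=Block(lambda prompt: f"embeds({prompt})"),
    denoise=Block(lambda prompt_embeds: f"image from {prompt_embeds}"),
)

# pop the text encoder out and run it standalone
text_node = blocks.pop("text_encoder").init_pipeline("some/repo")
prompt_embeds = text_node("a cat")

# the remaining blocks form the pipeline, fed precomputed embeddings
pipe = blocks["denoise"].init_pipeline("some/repo")
out = pipe(prompt_embeds)
print(out)  # image from embeds(a cat)
```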
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
sayakpaul left a comment
Not a requirement but should we add a small test for it as well? 👀
```python
return [
    InputParam(name="prompt_embeds", required=True),
    InputParam(name="latent_ids"),
    InputParam(name="negative_prompt_embeds", required=False),
```
No strong opinions, but WDYT of creating a separate block for Klein altogether? I think this way it will be a bit easier to debug and also separates concerns.
My suggestion mainly comes from the fact that Flux.2-Dev doesn't use negative_prompt_embeds while Flux.2-Klein does. So maybe that warrants creating separate blocks.
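A hypothetical sketch of what the separate-blocks suggestion could look like: two text-input blocks declaring different `InputParam` lists. The class names and the minimal `InputParam` stand-in are illustrative, not the actual diffusers classes; only the fact that Dev skips negative_prompt_embeds while Klein takes it comes from the discussion above.

```python
from dataclasses import dataclass
from typing import List

# minimal stand-in for diffusers' InputParam, just enough to show the idea
@dataclass
class InputParam:
    name: str
    required: bool = False

class Flux2DevTextInputStep:  # hypothetical name
    def inputs(self) -> List[InputParam]:
        # Dev does not use negative prompt embeddings
        return [InputParam("prompt_embeds", required=True)]

class Flux2KleinTextInputStep:  # hypothetical name
    def inputs(self) -> List[InputParam]:
        # Klein additionally accepts negative_prompt_embeds
        return [
            InputParam("prompt_embeds", required=True),
            InputParam("negative_prompt_embeds", required=False),
        ]

dev_names = [p.name for p in Flux2DevTextInputStep().inputs()]
klein_names = [p.name for p in Flux2KleinTextInputStep().inputs()]
print(dev_names)    # ['prompt_embeds']
print(klein_names)  # ['prompt_embeds', 'negative_prompt_embeds']
```

With separate blocks, each model's auto docstring only advertises the inputs it actually supports.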
It's a fair point, but on the other hand, I've personally found that having too many blocks can become overwhelming: each time you need to add something, you still need to go through all of them and understand which ones to use.
I think it makes sense to just add the code in the same blocks here; it is so small and fits in. But this is really a matter of preference, not right or wrong. Maybe we'll know better in the future, after building more pipelines :)
Actually, I changed my mind: I agree it's better to separate them out. Otherwise negative_prompt_embeds will show up as an optional argument in the auto docstring for both Klein and Dev, which is confusing.
Note that in Qwen (https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/qwenimage/inputs.py#L232), I'm experimenting with more composable blocks for situations like this that you can just reuse. But it also makes the blocks more complex, and I'm not sure if I'm over-engineering. So let's keep them simple here and see how it goes.
Thank you!
```python
def get_default_blocks_name(self, config_dict: Optional[Dict[str, Any]]) -> Optional[str]:
    if config_dict is not None and "is_distilled" in config_dict and config_dict["is_distilled"]:
        return "Flux2KleinAutoBlocks"
    else:
        return "Flux2KleinBaseAutoBlocks"
```
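A self-contained stand-in for the selection logic above, written as a plain function so it can be run in isolation (the method body and block names are from the diff; the sample config dicts are made up):

```python
from typing import Any, Dict, Optional

def get_default_blocks_name(config_dict: Optional[Dict[str, Any]]) -> Optional[str]:
    # checkpoints flagged is_distilled get the default auto blocks;
    # everything else falls back to the base variant
    if config_dict is not None and config_dict.get("is_distilled"):
        return "Flux2KleinAutoBlocks"
    return "Flux2KleinBaseAutoBlocks"

print(get_default_blocks_name({"is_distilled": True}))   # Flux2KleinAutoBlocks
print(get_default_blocks_name({"is_distilled": False}))  # Flux2KleinBaseAutoBlocks
print(get_default_blocks_name(None))                     # Flux2KleinBaseAutoBlocks
```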
Maybe naming them as Flux2KleinDistilledBlocks and Flux2KleinBaseAutoBlocks is slightly better?
src/diffusers/modular_pipelines/flux2/modular_blocks_flux2_klein.py (outdated; resolved)
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Álvaro Somoza <[email protected]>
```python
block_state.timesteps = timesteps
block_state.num_inference_steps = num_inference_steps

batch_size = block_state.batch_size * block_state.num_images_per_prompt
```
separated this into a prepare_guidance block
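A hypothetical sketch of what a standalone prepare_guidance block could compute: expand per-prompt settings to the effective batch size before denoising. Only the `batch_size * num_images_per_prompt` arithmetic comes from the diff above; the guidance handling itself is an assumption.

```python
def prepare_guidance(block_state: dict) -> dict:
    # effective batch size, as in the diff above
    batch_size = block_state["batch_size"] * block_state["num_images_per_prompt"]
    # assumed: broadcast a scalar guidance scale to one value per sample
    block_state["guidance"] = [block_state.get("guidance_scale", 1.0)] * batch_size
    return block_state

state = prepare_guidance(
    {"batch_size": 2, "num_images_per_prompt": 3, "guidance_scale": 4.0}
)
print(len(state["guidance"]))  # 6
```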
```python
class Flux2DecodeStep(ModularPipelineBlocks):
class Flux2UnpackLatentsStep(ModularPipelineBlocks):
```
moved the "unpack latents" logic out of the decode step -> this just makes it easier for the decode step to work in a standalone manner, i.e. the user only needs to pass latents (not latent_ids)
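A pure-Python sketch of the split described above: unpacking (the only part that needs latent_ids) becomes its own step, so the decode step consumes plain `latents` and can be run standalone. The state dict and toy values stand in for the real tensor operations.

```python
def unpack_latents_step(state: dict) -> dict:
    # consumes latent_ids here, so downstream steps never see them
    packed, ids = state.pop("packed_latents"), state.pop("latent_ids")
    state["latents"] = [packed[i] for i in ids]
    return state

def decode_step(state: dict) -> dict:
    # standalone-friendly: only needs state["latents"]
    state["image"] = sum(state["latents"])
    return state

state = {"packed_latents": [10, 20, 30], "latent_ids": [2, 0, 1]}
out = decode_step(unpack_latents_step(state))
print(out["image"])  # 60
```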
```python
AUTO_BLOCKS = InsertableDict(
    [
        ("text_encoder", Flux2TextEncoderStep()),
        ("text_input", Flux2TextInputStep()),
```
rearranged a bit so this works with mellon
testing klein
testing klein with modular setting
testing flux2-dev
modular setting