Ideogram4 lora training #13861

Open
apolinario wants to merge 6 commits into main from ideogram4-lora-training

@apolinario
Collaborator

DreamBooth LoRA training script + Ideogram4 LoRA loader mixin.

  • LoRA targets the conditional transformer only (asymmetric CFG: the unconditional branch is the CFG prior).
  • Timestep sampling uses Ideogram 4's resolution-aware logit-normal schedule via the standard --weighting_scheme / --logit_mean / --logit_std args (defaults set to the model's schedule).
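For readers unfamiliar with logit-normal timestep sampling, here is a minimal sketch of the basic idea: draw a Gaussian sample and squash it through a sigmoid so the timestep lands in (0, 1). The function name `sample_logit_normal_timestep` is hypothetical, the default values are placeholders rather than the model's schedule, and the resolution-aware part is not shown because this thread does not spell it out.

```python
import math
import random


def sample_logit_normal_timestep(logit_mean=0.0, logit_std=1.0, rng=random):
    """Draw one timestep in (0, 1) from a logit-normal distribution.

    Hypothetical helper for illustration only; the defaults are placeholders,
    not Ideogram 4's actual schedule.
    """
    # Sample in logit space, then map to (0, 1) with a sigmoid.
    u = rng.gauss(logit_mean, logit_std)
    return 1.0 / (1.0 + math.exp(-u))


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_logit_normal_timestep(0.0, 1.0, rng) for _ in range(5)]
    print(samples)
```

Raising `logit_mean` shifts the mass toward larger timesteps, and raising `logit_std` spreads it out toward both ends.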

Stacked on #13859.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Base automatically changed from add-ideogram-4 to main June 3, 2026 22:03
@bghira
Contributor

bghira commented Jun 4, 2026

Mean loss: 1.3919, min: 0.852, max: 2.25.

is the loss intentionally this high? it's like training the audio branch with LTX2

@linoytsaban linoytsaban force-pushed the ideogram4-lora-training branch from 6f8d6e9 to 0128816 Compare June 4, 2026 09:53
@bghira
Contributor

bghira commented Jun 4, 2026

    def fuse_qkv_projections(self):
        # The attention already uses a single fused `qkv` projection, so there is nothing to fuse.
        raise NotImplementedError(
            "Ideogram4Transformer2DModel already uses a fused QKV projection (`attention.qkv`), "
            "so `fuse_qkv_projections()` is not applicable."
        )

    def unfuse_qkv_projections(self):
        raise NotImplementedError(
            "Ideogram4Transformer2DModel uses a fused QKV projection that cannot be split, "
            "so `unfuse_qkv_projections()` is not applicable."
        )

these were removed, despite qkv now being split. can you re-add?
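As background on what fusing and unfusing mean here, the sketch below shows that stacking separate q/k/v weight matrices row-wise gives one projection whose output can be sliced back into the three parts. It is a toy illustration in plain Python, not the diffusers implementation; `matvec`, `fuse_qkv`, and `split_qkv` are hypothetical names.

```python
def matvec(weight, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def fuse_qkv(wq, wk, wv):
    """Stack the three projection matrices row-wise into one fused matrix."""
    return wq + wk + wv


def split_qkv(fused_out, dim):
    """Slice a fused projection output back into q, k, v of size `dim` each."""
    return fused_out[:dim], fused_out[dim:2 * dim], fused_out[2 * dim:]


if __name__ == "__main__":
    wq = [[1, 0], [0, 1]]
    wk = [[2, 0], [0, 2]]
    wv = [[0, 1], [1, 0]]
    x = [3, 4]
    q, k, v = split_qkv(matvec(fuse_qkv(wq, wk, wv), x), dim=2)
    # The fused path matches the three separate projections.
    assert (q, k, v) == (matvec(wq, x), matvec(wk, x), matvec(wv, x))
```

The fused form does one larger matrix multiply instead of three smaller ones, which is why models expose a way to switch between the two layouts.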

@bghira
Contributor

bghira commented Jun 4, 2026

@joangava did you test the script? it's not working.

@bghira
Contributor

bghira commented Jun 4, 2026

finally identified the issues:

  • this script's loader discards the scale of the fp8 weights, so the quantised weights are never actually loaded properly. this causes the NaN loss and black images
  • the hf accelerate library seems to have a bug: unwrap_model isn't removing the forward wrapper that Accelerate adds during model prepare, which causes collapsed outputs on step 1. disabling autocast is actually the better move for ideogram (that's how simpletuner works)
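To illustrate why a discarded scale matters for the fp8 point above, here is a rough sketch of per-tensor scaled quantisation in plain Python. It is a simplification under stated assumptions: the codes are kept as ordinary floats rather than rounded to real fp8 values, `quantize` and `dequantize` are hypothetical helpers, and 448 is used as the largest finite value of the float8 e4m3fn format. It is not the loader from this PR.

```python
FP8_E4M3_MAX = 448.0  # largest finite magnitude in float8 e4m3fn


def quantize(weights, max_code=FP8_E4M3_MAX):
    """Rescale weights so the largest magnitude maps to max_code.

    Returns (codes, scale). Real fp8 storage would also round each code;
    that step is omitted in this sketch.
    """
    # Fall back to 1.0 for an all-zero tensor to avoid dividing by zero.
    scale = (max(abs(w) for w in weights) / max_code) or 1.0
    return [w / scale for w in weights], scale


def dequantize(codes, scale):
    """Recover approximate weights by multiplying the codes by the scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.02, -0.01, 0.005]
    codes, scale = quantize(weights)
    print("with scale:   ", dequantize(codes, scale))
    # Treating the raw codes as weights is what dropping the scale amounts to.
    print("scale dropped:", codes)
```

In this toy case the raw codes are tens of thousands of times larger than the true weights, which is consistent with activations blowing up to NaN when the scale is not applied.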

@linoytsaban
Collaborator

linoytsaban commented Jun 5, 2026

@bghira
Contributor

bghira commented Jun 5, 2026

  • well, their Fp8Linear sucks anyway: it's not using scaled mm, and it's upcasting to bf16 on every forward pass
  • 1.13.0


5 participants