[Bug] "CLIP: Using CPU backend" and then crash #1210

@AndriiParf

Description

Git commit

61659ef

Operating System & Version

Windows 11 24H2

GGML backends

Vulkan

Command-line arguments used

sd-cli.exe --diffusion-model E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf --vae E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors --llm E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf -r example.png -p "change the background to park" -W 960 -H 960 --cfg-scale 4.0 --steps 4 -v --offload-to-cpu --diffusion-fa -t 12 --sampling-method dpm++2mv2 --clip-on-cpu --vae-tiling --vae-tile-size 64x64

Steps to reproduce

1. Check out commit 61659ef.
2. Run the generation command with Flux 2 Klein 4B.
3. The application initializes the system info, loads the arguments, detects the Vulkan device, and prints weight statistics.
4. Immediately after printing "CLIP: Using CPU backend", the process terminates (or hangs) without any further error message or assertion failure.

What you expected to happen

The application should continue to load the LLM vocab/merges and then proceed to load model tensors, as it did in previous versions.

What actually happened

The execution stops abruptly at initialization.
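
Because the process ends without printing anything, its exit code may help tell a native crash from a clean exit. Below is a minimal sketch of a wrapper that could be used to capture it, assuming Python 3.9+ is available. The helper names `describe_exit_code` and `run_and_report` are hypothetical and not part of stable-diffusion.cpp, and the table lists only a few well-known Windows NTSTATUS values.

```python
import subprocess
import sys

# A few well-known Windows NTSTATUS crash codes (not an exhaustive list).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_code(code: int) -> str:
    """Return a readable description of a process exit code."""
    # Some tools report Windows exit codes as negative 32-bit integers,
    # so normalize to the unsigned form before the lookup.
    unsigned = code & 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(unsigned)
    if name:
        return f"0x{unsigned:08X} ({name})"
    return f"{code} (0x{unsigned:08X})"


def run_and_report(argv: list[str]) -> int:
    """Run a command, let its output pass through, and print its exit code."""
    result = subprocess.run(argv)
    print(f"exit code: {describe_exit_code(result.returncode)}", file=sys.stderr)
    return result.returncode
```

Calling `run_and_report(["sd-cli.exe", ...])` with the same arguments as in the command above would print the exit code after the last log line. This only covers the case where the process terminates; a hang would need a separate check.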

Logs / error messages / stack trace

sd-cli.exe --diffusion-model E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf --vae E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors --llm E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf -r example.png -p "change the background to park" -W 960 -H 960 --cfg-scale 4.0 --steps 4 -v --offload-to-cpu --diffusion-fa -t 12 --sampling-method dpm++2mv2 --clip-on-cpu --vae-tiling --vae-tile-size 64x64
[DEBUG] main.cpp:500 - version: stable-diffusion.cpp version unknown, commit 61659ef
[DEBUG] main.cpp:501 - System Info:
SSE3 = 1 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | VSX = 0 |
[DEBUG] main.cpp:502 - SDCliParams {
mode: img_gen,
output_path: "output.png",
verbose: true,
color: false,
canny_preprocess: false,
convert_name: false,
preview_method: none,
preview_interval: 1,
preview_path: "preview.png",
preview_fps: 16,
taesd_preview: false,
preview_noisy: false
}
[DEBUG] main.cpp:503 - SDContextParams {
n_threads: 12,
model_path: "",
clip_l_path: "",
clip_g_path: "",
clip_vision_path: "",
t5xxl_path: "",
llm_path: "E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf",
llm_vision_path: "",
diffusion_model_path: "E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf",
high_noise_diffusion_model_path: "",
vae_path: "E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors",
taesd_path: "",
esrgan_path: "",
control_net_path: "",
embedding_dir: "",
embeddings: {
}
wtype: NONE,
tensor_type_rules: "",
lora_model_dir: "",
photo_maker_path: "",
rng_type: cuda,
sampler_rng_type: NONE,
flow_shift: INF
offload_params_to_cpu: true,
enable_mmap: false,
control_net_cpu: false,
clip_on_cpu: true,
vae_on_cpu: true,
diffusion_flash_attn: true,
diffusion_conv_direct: false,
vae_conv_direct: false,
circular: false,
circular_x: false,
circular_y: false,
chroma_use_dit_mask: true,
qwen_image_zero_cond_t: false,
chroma_use_t5_mask: false,
chroma_t5_mask_pad: 1,
prediction: NONE,
lora_apply_mode: auto,
vae_tiling_params: { 1, 64, 64, 0.5, 0, 0 },
force_sdxl_vae_conv_scale: false
}
[DEBUG] main.cpp:504 - SDGenerationParams {
loras: "{
}",
high_noise_loras: "{
}",
prompt: "change the background to park",
negative_prompt: "",
clip_skip: -1,
width: 960,
height: 960,
batch_count: 1,
init_image_path: "",
end_image_path: "",
mask_image_path: "",
control_image_path: "",
ref_image_paths: ["example.png"],
control_video_path: "",
auto_resize_ref_image: true,
increase_ref_index: false,
pm_id_images_dir: "",
pm_id_embed_path: "",
pm_style_strength: 20,
skip_layers: [7, 8, 9],
sample_params: (txt_cfg: 4.00, img_cfg: 4.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: dpm++2mv2, sample_steps: 4, eta: 0.00, shifted_timestep: 0),
high_noise_skip_layers: [7, 8, 9],
high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
custom_sigmas: [],
cache_mode: "",
cache_option: "",
cache: disabled (threshold=1, start=0.15, end=0.95),
moe_boundary: 0.875,
video_frames: 1,
fps: 16,
vace_strength: 1,
strength: 0.75,
control_strength: 0.9,
seed: 42,
upscale_repeats: 1,
upscale_tile_size: 128,
}
[DEBUG] stable-diffusion.cpp:171 - Using Vulkan backend
[DEBUG] ggml_extend.hpp:75 - ggml_vulkan: Found 1 Vulkan devices:
[DEBUG] ggml_extend.hpp:75 - ggml_vulkan: 0 = Radeon RX 580 Series (AMD proprietary driver) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 64 | shared memory: 32768 | int dot: 0 | matrix cores: none
[INFO ] stable-diffusion.cpp:192 - Vulkan: Using device 0
[INFO ] stable-diffusion.cpp:257 - loading diffusion model from 'E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf'
[INFO ] model.cpp:370 - load E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf using gguf format
[DEBUG] model.cpp:412 - init from 'E:\GenerateImage\Models\FLUX\Realistic\flux-2-klein-4b-Q8_0.gguf'
[INFO ] stable-diffusion.cpp:304 - loading llm from 'E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf'
[INFO ] model.cpp:370 - load E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf using gguf format
[DEBUG] model.cpp:412 - init from 'E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\Qwen3-8B-Q6_K.gguf'
[INFO ] stable-diffusion.cpp:318 - loading vae from 'E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors'
[INFO ] model.cpp:373 - load E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors using safetensors format
[DEBUG] model.cpp:503 - init from 'E:\GenerateImage\SDXLVaeAndClips\FLUX2Klein\ae.safetensors', prefix = 'vae.'
[INFO ] stable-diffusion.cpp:334 - Version: Flux.2 klein
[INFO ] stable-diffusion.cpp:362 - Weight type stat: f32: 393 | q8_0: 80 | q6_K: 253 | bf16: 69
[INFO ] stable-diffusion.cpp:363 - Conditioner weight type stat: f32: 145 | q6_K: 253
[INFO ] stable-diffusion.cpp:364 - Diffusion model weight type stat: q8_0: 80 | bf16: 69
[INFO ] stable-diffusion.cpp:365 - VAE weight type stat: f32: 248
[DEBUG] stable-diffusion.cpp:367 - ggml tensor size = 400 bytes
[INFO ] stable-diffusion.cpp:426 - CLIP: Using CPU backend

Additional context / environment details

GPU: Radeon RX 580 8GB (Vulkan)
Models: Flux 2 Klein 4b Q8_0, Qwen3-8B-Q6_K
With commit 9565c7f, execution continues past this point, but it then fails with a different error (not this one).

Labels: bug