
[Bug] Moderately slow generation with Vulkan backend #1114

@GrainyTV

Git commit

master-425-bda7fab, commit bda7fab

Operating System & Version

Linux voidlinux 6.6.101_1 #1 SMP PREEMPT_DYNAMIC Sat Aug 2 03:38:07 UTC 2025 x86_64 GNU/Linux

GGML backends

Vulkan

Command-line arguments used

./stable-diffusion.cpp/build/bin/sd-cli -v --diffusion-model ../models/qwen-image-Q2_K.gguf --vae ../models/qwen_image_vae.safetensors --llm ../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf -p 'Pink Hatsune Miku. Studio Shaft style.' --cfg-scale 2.5 --sampling-method euler --diffusion-fa --flow-shift 3 --offload-to-cpu

Steps to reproduce

I built stable-diffusion.cpp from the master branch with Vulkan support enabled, using the following command:

cmake -B build -DCMAKE_BUILD_TYPE=Release -DSD_VULKAN=ON -G Ninja

Then I tried to generate a basic 512x512 image using the command from the qwen-image documentation.

What you expected to happen

I expected generation to be faster than my earlier run with the CPU backend.

What actually happened

The program reaches the image generation phase without problems, but each iteration then takes around 100 seconds (105.11 s/it in the log below). The CPU backend with the same settings took around 60 seconds per iteration.
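To put the two per-iteration figures in perspective, here is a rough back-of-the-envelope sketch. It assumes the 20 sampling steps reported in the verbose log (`sample_steps: 20`) and the approximate timings quoted above; the function names are illustrative only, and model loading and VAE decoding are ignored.

```python
# Rough estimate of total sampling time, using only figures from this report.
STEPS = 20             # sample_steps from the verbose log
VULKAN_S_PER_IT = 100  # approximate seconds per iteration, Vulkan backend
CPU_S_PER_IT = 60      # approximate seconds per iteration, CPU backend


def total_minutes(seconds_per_iteration: float, steps: int = STEPS) -> float:
    """Sampling time in minutes (excludes model loading and VAE decode)."""
    return seconds_per_iteration * steps / 60


def slowdown(slow: float, fast: float) -> float:
    """How many times slower `slow` is compared with `fast`."""
    return slow / fast


if __name__ == "__main__":
    print(f"Vulkan: ~{total_minutes(VULKAN_S_PER_IT):.0f} min")
    print(f"CPU:    ~{total_minutes(CPU_S_PER_IT):.0f} min")
    print(f"Vulkan is ~{slowdown(VULKAN_S_PER_IT, CPU_S_PER_IT):.2f}x slower")
```

In other words, at these rates a full 20-step run would take roughly half an hour with Vulkan versus about 20 minutes on the CPU.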

Logs / error messages / stack trace

This is the output of the command in verbose mode:

[DEBUG] main.cpp:379  - version: stable-diffusion.cpp version master-425-bda7fab, commit bda7fab
[DEBUG] main.cpp:380  - System Info:
    SSE3 = 1 |     AVX = 1 |     AVX2 = 1 |     AVX512 = 1 |     AVX512_VBMI = 1 |     AVX512_VNNI = 1 |     FMA = 1 |     NEON = 0 |     ARM_FMA = 0 |     F16C = 1 |     FP16_VA = 0 |     WASM_SIMD = 0 |     VSX = 0 |
[DEBUG] main.cpp:381  - SDCliParams {
  mode: img_gen,
  output_path: "output.png",
  verbose: true,
  color: false,
  canny_preprocess: false,
  preview_method: none,
  preview_interval: 1,
  preview_path: "preview.png",
  preview_fps: 16,
  taesd_preview: false,
  preview_noisy: false
}
[DEBUG] main.cpp:382  - SDContextParams {
  n_threads: 6,
  model_path: "",
  clip_l_path: "",
  clip_g_path: "",
  clip_vision_path: "",
  t5xxl_path: "",
  llm_path: "../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf",
  llm_vision_path: "",
  diffusion_model_path: "../models/qwen-image-Q2_K.gguf",
  high_noise_diffusion_model_path: "",
  vae_path: "../models/qwen_image_vae.safetensors",
  taesd_path: "",
  esrgan_path: "",
  control_net_path: "",
  embedding_dir: "",
  embeddings: {
  }
  wtype: NONE,
  tensor_type_rules: "",
  lora_model_dir: "",
  photo_maker_path: "",
  rng_type: cuda,
  sampler_rng_type: NONE,
  flow_shift: 3.000000
  offload_params_to_cpu: true,
  control_net_cpu: false,
  clip_on_cpu: false,
  vae_on_cpu: false,
  diffusion_flash_attn: true,
  diffusion_conv_direct: false,
  vae_conv_direct: false,
  chroma_use_dit_mask: true,
  chroma_use_t5_mask: false,
  chroma_t5_mask_pad: 1,
  prediction: NONE,
  lora_apply_mode: auto,
  vae_tiling_params: { 0, 0, 0, 0.5, 0, 0 },
  force_sdxl_vae_conv_scale: false
}
[DEBUG] main.cpp:383  - SDGenerationParams {
  loras: "{
  }",
  high_noise_loras: "{
  }",
  prompt: "Pink Hatsune Miku. Studio Shaft style.",
  negative_prompt: "",
  clip_skip: -1,
  width: 512,
  height: 512,
  batch_count: 1,
  init_image_path: "",
  end_image_path: "",
  mask_image_path: "",
  control_image_path: "",
  ref_image_paths: [],
  control_video_path: "",
  auto_resize_ref_image: true,
  increase_ref_index: false,
  pm_id_images_dir: "",
  pm_id_embed_path: "",
  pm_style_strength: 20,
  skip_layers: [7, 8, 9],
  sample_params: (txt_cfg: 2.50, img_cfg: 2.50, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: euler, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
  high_noise_skip_layers: [7, 8, 9],
  high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
  custom_sigmas: [],
  easycache_option: "",
  easycache: disabled (threshold=0, start=0, end=0),
  moe_boundary: 0.875,
  video_frames: 1,
  fps: 16,
  vace_strength: 1,
  strength: 0.75,
  control_strength: 0.9,
  seed: 42,
  upscale_repeats: 1,
  upscale_tile_size: 128,
}
[DEBUG] stable-diffusion.cpp:167  - Using Vulkan backend
[DEBUG] ggml_extend.hpp:74   - ggml_vulkan: Found 2 Vulkan devices:
[DEBUG] ggml_extend.hpp:74   - ggml_vulkan: 0 = AMD Radeon RX 6750 XT (RADV NAVI22) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
[DEBUG] ggml_extend.hpp:74   - ggml_vulkan: 1 = AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
[INFO ] stable-diffusion.cpp:233  - loading diffusion model from '../models/qwen-image-Q2_K.gguf'
[INFO ] model.cpp:370  - load ../models/qwen-image-Q2_K.gguf using gguf format
[DEBUG] model.cpp:412  - init from '../models/qwen-image-Q2_K.gguf'
[INFO ] stable-diffusion.cpp:280  - loading llm from '../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf'
[INFO ] model.cpp:370  - load ../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf using gguf format
[DEBUG] model.cpp:412  - init from '../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf'
[INFO ] stable-diffusion.cpp:294  - loading vae from '../models/qwen_image_vae.safetensors'
[INFO ] model.cpp:373  - load ../models/qwen_image_vae.safetensors using safetensors format
[DEBUG] model.cpp:503  - init from '../models/qwen_image_vae.safetensors', prefix = 'vae.'
[INFO ] stable-diffusion.cpp:310  - Version: Qwen Image
[INFO ] stable-diffusion.cpp:338  - Weight type stat:                      f32: 1228 |    q2_K: 809  |    q3_K: 172  |    q4_K: 56   |    bf16: 200 
[INFO ] stable-diffusion.cpp:339  - Conditioner weight type stat:          f32: 141  |    q2_K: 113  |    q3_K: 56   |    q4_K: 28
[INFO ] stable-diffusion.cpp:340  - Diffusion model weight type stat:      f32: 1087 |    q2_K: 696  |    q3_K: 116  |    q4_K: 28   |    bf16: 6   
[INFO ] stable-diffusion.cpp:341  - VAE weight type stat:                 bf16: 194
[DEBUG] stable-diffusion.cpp:343  - ggml tensor size = 400 bytes
[DEBUG] llm.hpp:285  - merges size 151387
[DEBUG] llm.hpp:317  - vocab size: 151669
[DEBUG] llm.hpp:1130 - llm: num_layers = 28, vocab_size = 152064, hidden_size = 3584, intermediate_size = 18944
[INFO ] qwen_image.hpp:531  - qwen_image_params.num_layers: 60
[INFO ] stable-diffusion.cpp:538  - Using flash attention in the diffusion model
[DEBUG] ggml_extend.hpp:1881 - qwen2.5vl params backend buffer size =  4352.65 MB(RAM) (338 tensors)
[DEBUG] ggml_extend.hpp:1881 - qwen_image params backend buffer size =  6735.20 MB(RAM) (1933 tensors)
[DEBUG] ggml_extend.hpp:1881 - wan_vae params backend buffer size =  139.84 MB(RAM) (108 tensors)
[DEBUG] stable-diffusion.cpp:701  - loading weights
[DEBUG] model.cpp:1351 - using 6 threads for model loading
[DEBUG] model.cpp:1373 - loading tensors from ../models/qwen-image-Q2_K.gguf
  |=======================================>          | 1933/2465 - 49.24it/s
[DEBUG] model.cpp:1373 - loading tensors from ../models/Qwen2.5-VL-7B-Instruct.Q2_K.gguf
  |==============================================>   | 2271/2465 - 42.39it/s
[DEBUG] model.cpp:1373 - loading tensors from ../models/qwen_image_vae.safetensors
  |==================================================| 2465/2465 - 45.13it/s
[INFO ] model.cpp:1579 - loading tensors completed, taking 54.62s (process: 0.00s, read: 54.13s, memcpy: 0.00s, convert: 0.21s, copy_to_backend: 0.00s)
[DEBUG] stable-diffusion.cpp:733  - finished loaded file
[INFO ] stable-diffusion.cpp:790  - total params memory size = 11227.69MB (VRAM 11227.69MB, RAM 0.00MB): text_encoders 4352.65MB(VRAM), diffusion_model 6735.20MB(VRAM), vae 139.84MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:873  - running in FLOW mode
[DEBUG] stable-diffusion.cpp:3174 - generate_image 512x512
[INFO ] stable-diffusion.cpp:3208 - sampling using Euler method
[INFO ] denoiser.hpp:364  - get_sigmas with discrete scheduler
[INFO ] stable-diffusion.cpp:3331 - TXT2IMG
[DEBUG] conditioner.hpp:1679 - parse '<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
Pink Hatsune Miku. Studio Shaft style.<|im_end|>
<|im_start|>assistant
' to [['<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
', 1], ['Pink Hatsune Miku. Studio Shaft style.', 1], ['<|im_end|>
<|im_start|>assistant
', 1], ]
[DEBUG] llm.hpp:259  - split prompt "<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
" to tokens ["<|im_start|>", "system", "Ċ", "Describe", "Ġthe", "Ġimage", "Ġby", "Ġdetailing", "Ġthe", "Ġcolor", ",", "Ġshape", ",", "Ġsize", ",", "Ġtexture", ",", "Ġquantity", ",", "Ġtext", ",", "Ġspatial", "Ġrelationships", "Ġof", "Ġthe", "Ġobjects", "Ġand", "Ġbackground", ":", "<|im_end|>", "Ċ", "<|im_start|>", "user", "Ċ", ]
[DEBUG] llm.hpp:259  - split prompt "Pink Hatsune Miku. Studio Shaft style." to tokens ["Pink", "ĠHats", "une", "ĠM", "iku", ".", "ĠStudio", "ĠShaft", "Ġstyle", ".", ]
[DEBUG] llm.hpp:259  - split prompt "<|im_end|>
<|im_start|>assistant
" to tokens ["<|im_end|>", "Ċ", "<|im_start|>", "assistant", "Ċ", ]
[INFO ] ggml_extend.hpp:1794 - qwen2.5vl offload params (4352.65 MB, 338 tensors) to runtime backend (Vulkan1), taking 0.87s
[DEBUG] ggml_extend.hpp:1696 - qwen2.5vl compute buffer size: 9.09 MB(VRAM)
[DEBUG] conditioner.hpp:1892 - computing condition graph completed, taking 3091 ms
[DEBUG] conditioner.hpp:1679 - parse '<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
<|im_end|>
<|im_start|>assistant
' to [['<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
', 1], ['<|im_end|>
<|im_start|>assistant
', 1], ]
[DEBUG] llm.hpp:259  - split prompt "<|im_start|>system
Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
<|im_start|>user
" to tokens ["<|im_start|>", "system", "Ċ", "Describe", "Ġthe", "Ġimage", "Ġby", "Ġdetailing", "Ġthe", "Ġcolor", ",", "Ġshape", ",", "Ġsize", ",", "Ġtexture", ",", "Ġquantity", ",", "Ġtext", ",", "Ġspatial", "Ġrelationships", "Ġof", "Ġthe", "Ġobjects", "Ġand", "Ġbackground", ":", "<|im_end|>", "Ċ", "<|im_start|>", "user", "Ċ", ]
[DEBUG] llm.hpp:259  - split prompt "<|im_end|>
<|im_start|>assistant
" to tokens ["<|im_end|>", "Ċ", "<|im_start|>", "assistant", "Ċ", ]
[INFO ] ggml_extend.hpp:1794 - qwen2.5vl offload params (4352.65 MB, 338 tensors) to runtime backend (Vulkan1), taking 0.76s
[DEBUG] ggml_extend.hpp:1696 - qwen2.5vl compute buffer size: 7.24 MB(VRAM)
[DEBUG] conditioner.hpp:1892 - computing condition graph completed, taking 2892 ms
[INFO ] stable-diffusion.cpp:2952 - get_learned_condition completed, taking 5984 ms
[INFO ] stable-diffusion.cpp:3063 - generating image: 1/1 - seed 42
[INFO ] ggml_extend.hpp:1794 - qwen_image offload params (6735.20 MB, 1933 tensors) to runtime backend (Vulkan1), taking 1.27s
[DEBUG] ggml_extend.hpp:1696 - qwen_image compute buffer size: 146.07 MB(VRAM)
  |=======>                                          | 3/20 - 105.11s/it^C

Additional context / environment details

CPU: AMD Ryzen 7600
GPUs: Radeon RX 6750 XT and the integrated Radeon Graphics (the logs show that both are detected)
RAM: 32 GB
