Fix Qwen3.5-0.8B WebGPU optimization config #388
xiaofeihan1 wants to merge 4 commits into microsoft:main
Hi @apsonawane PTAL, thanks!
Pull request overview
This PR updates the Qwen3.5-0.8B “builtin” WebGPU Olive workflow configs to explicitly target a WebGPU-enabled local system, and adjusts the optimization script’s genai_config patching.
Changes:
- Add `systems.local_system` (WebGpuExecutionProvider) and set `"target": "local_system"` in the WebGPU embedding/text/vision Olive JSON configs.
- Update `optimize.py` to use `vision_session_options` when setting the decoder `session_options`.
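For readers unfamiliar with the config change, here is a minimal sketch of what adding a WebGPU system and selecting it as the target might look like. The helper name `add_webgpu_target`, the `"LocalSystem"` type, and the `"gpu"` device value are assumptions for illustration; only `systems.local_system`, `WebGpuExecutionProvider`, and `"target": "local_system"` come from the summary above, and the actual JSON files in this PR may differ.

```python
import json


def add_webgpu_target(config: dict) -> dict:
    """Return a copy of an Olive-style workflow config that targets WebGPU.

    Hypothetical helper: the "type" and "device" values are assumptions,
    not taken from the PR diff.
    """
    updated = dict(config)
    systems = dict(updated.get("systems", {}))
    systems["local_system"] = {
        "type": "LocalSystem",  # assumed system type
        "accelerators": [
            {
                "device": "gpu",  # assumed device value
                "execution_providers": ["WebGpuExecutionProvider"],
            }
        ],
    }
    updated["systems"] = systems
    # Select the system defined above as the workflow target.
    updated["target"] = "local_system"
    return updated


# Placeholder workflow; a real config would also define the model and passes.
workflow = {"passes": {}}
patched = add_webgpu_target(workflow)
print(json.dumps(patched, indent=2))
```

The helper copies the input rather than mutating it, so the same base workflow could be reused for the embedding, text, and vision configs.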
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| Qwen-Qwen3.5-0.8B/builtin/webgpu/vision.json | Adds WebGPU system definition and selects it via target. |
| Qwen-Qwen3.5-0.8B/builtin/webgpu/text.json | Adds WebGPU system definition and selects it via target. |
| Qwen-Qwen3.5-0.8B/builtin/webgpu/embedding.json | Adds WebGPU system definition and selects it via target. |
| Qwen-Qwen3.5-0.8B/builtin/optimize.py | Changes which session options are written into genai_config.json for the decoder. |
    vision_session_options = {"log_id": "onnxruntime-genai", "provider_options": vision_provider_options}

    - config["model"]["decoder"]["session_options"] = session_options
    + config["model"]["decoder"]["session_options"] = vision_session_options
decoder.session_options is being set to vision_session_options. For device == "gpu", vision_provider_options explicitly disables CUDA graph capture, so this change unintentionally applies the vision/embedding workaround to the decoder too (and also hides the fact that session_options is otherwise undefined). Create a separate session_options based on provider_options and assign that to the decoder, keeping vision_session_options only for vision/embedding.
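One possible shape for that suggestion is sketched below. It assumes `provider_options` and `vision_provider_options` are lists of per-provider dicts, and that the vision workaround is expressed as `enable_cuda_graph` set to `"0"`; both the placeholder values and the helper name `make_session_options` are hypothetical and not taken from `optimize.py`.

```python
import copy


def make_session_options(provider_options: list) -> dict:
    """Build a session_options dict from a list of provider options."""
    return {
        "log_id": "onnxruntime-genai",
        # Copy so decoder and vision options never share mutable state.
        "provider_options": copy.deepcopy(provider_options),
    }


# Placeholder values for illustration only (assumed, not from the PR).
provider_options = [{"cuda": {}}]
vision_provider_options = [{"cuda": {"enable_cuda_graph": "0"}}]

# Decoder gets its own options; the workaround stays on vision/embedding.
session_options = make_session_options(provider_options)
vision_session_options = make_session_options(vision_provider_options)

config = {"model": {"decoder": {}, "vision": {}, "embedding": {}}}
config["model"]["decoder"]["session_options"] = session_options
config["model"]["vision"]["session_options"] = vision_session_options
config["model"]["embedding"]["session_options"] = vision_session_options
```

With this split, `session_options` is always defined before it is assigned, and the CUDA graph setting no longer leaks into the decoder.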
Why is this change required only for the Qwen3.5-0.8B model?
Thanks @apsonawane. Applied to the other Qwen3.5 models. I have a follow-up question:
@xiaofeihan1 We tried a few quantization combinations and selected the one that gave us the best accuracy and performance. That's why there is a difference.
No description provided.