OpenWebUI update - new features and gpt as a main model #4102
Conversation
Pull request overview
Updates the OpenWebUI integration demo documentation to use newer default models (including a VLM) and adds guidance for newer “agentic” OpenWebUI features like Web Search, Memory, and Code Interpreter.
Changes:
- Switched the primary chat model example to `OpenVINO/gpt-oss-20b-int4-ov` and standardized the OpenWebUI Model ID to `ovms-model`.
- Replaced the VLM example model with `Junrui2021/Qwen3-VL-8B-Instruct-int4` and added a new screenshot for image upload.
- Added new documentation sections for Web Search, Memory/context, and Code Interpreter configuration in OpenWebUI.
Reviewed changes
Copilot reviewed 1 out of 13 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| demos/integration_with_OpenWebUI/README.md | Updates model pull/config instructions and adds new OpenWebUI feature sections (Web Search, Memory, Code Interpreter). |
| demos/integration_with_OpenWebUI/upload_images.png | Adds/updates a screenshot used by the VLM “upload images” step. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```bash
mkdir models
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_repository_path /models --task text_generation
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_name Godreign/llama-3.2-3b-instruct-openvino-int4-model
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/gpt-oss-20b-int4-ov --model_repository_path /models --task text_generation --tool_parser gptoss --reasoning_parser gptoss
```
There's trailing whitespace at the end of this Docker command line. Trimming it avoids noisy diffs and occasional copy/paste quirks.
Suggested change:
```bash
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/gpt-oss-20b-int4-ov --model_repository_path /models --task text_generation --tool_parser gptoss --reasoning_parser gptoss
```
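Once the server is running with the pulled gpt-oss model, a quick way to sanity-check it is to POST to the OpenAI-compatible chat endpoint. A minimal sketch that only builds and prints the request body (the port and the `/v3/chat/completions` path are assumptions based on a typical OVMS REST setup; the actual HTTP call is left commented out so the snippet runs without a server):

```python
import json

# Assumed endpoint: OVMS exposes an OpenAI-compatible REST API; the port
# depends on how the server container was started (e.g. --rest_port 8000).
OVMS_URL = "http://localhost:8000/v3/chat/completions"

# The model name must match the name the model was pulled/registered under.
payload = {
    "model": "OpenVINO/gpt-oss-20b-int4-ov",
    "messages": [{"role": "user", "content": "Reply with the single word: pong"}],
    "stream": False,
}

body = json.dumps(payload)
print(body)

# To actually send it (requires the server from the command above to be running):
#   import urllib.request
#   req = urllib.request.Request(
#       OVMS_URL, data=body.encode(), headers={"Content-Type": "application/json"}
#   )
#   print(urllib.request.urlopen(req).read().decode())
```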
@@ -73,4 +73,4 @@
> **Important Note**: While using NPU device for acceleration or model gpt-oss-20b with GPU, it is recommended to disable `Follow-Up Auto-Generation` in `Settings > Interface` menu. It will improve response time and avoid queuing requests. For gpt-oss model it will avoid concurrent execution which in version 2026.0 has an accuracy issue.
This workaround is needed only for NPU; gpt-oss is fixed.
@@ -17,4 +17,4 @@
* [Docker Engine](https://docs.docker.com/engine/) installed
* Host with x86_64 architecture
* Linux, macOS, or Windows
* Python 3.11 with pip
While the Open WebUI pip package allows Python >=3.11, <3.13.0a1, the install instructions at https://pypi.org/project/open-webui/ recommend using 3.11.
### Prerequisites

In this demo, OpenVINO Model Server is deployed on Linux with CPU using Docker and Open WebUI is installed via Python pip. Requirements to follow this demo:
Let's make it GPU by default, with an option to switch to CPU.
> **Important Note**: While using NPU device for acceleration or model gpt-oss-20b with GPU, it is recommended to disable `Follow-Up Auto-Generation` in `Settings > Interface` menu. It will improve response time and avoid queuing requests. For gpt-oss model it will avoid concurrent execution which in version 2026.0 has an accuracy issue.

### References
[https://docs.openvino.ai/2026/model-server/ovms_demos_continuous_batching.html](https://docs.openvino.ai/2026/model-server/ovms_demos_continuous_batching.html#model-preparation)
Is this reference still relevant?
Would you prefer to drop it, or replace it, possibly with https://docs.openvino.ai/2026/model-server/ovms_demos_continuous_batching_agent.html#export-llm-model?
Add info about Native Tool Calling.
For gpt-oss it will be `"reasoning_effort": "low"`.
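Both points can be expressed in the request body sent to the model. A hedged sketch of an OpenAI-style chat payload combining a native tool definition with `reasoning_effort` (the `get_weather` function is purely hypothetical, added only to illustrate the schema; OVMS is expected to parse gpt-oss tool calls when started with `--tool_parser gptoss`):

```python
import json

payload = {
    "model": "OpenVINO/gpt-oss-20b-int4-ov",
    "reasoning_effort": "low",  # "low" recommended for gpt-oss, per the comment above
    "messages": [{"role": "user", "content": "What's the weather in Gdansk?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                # Hypothetical tool, for illustration only.
                "name": "get_weather",
                "description": "Get current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(payload, indent=2))
```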
```bash
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/InternVL2-2B-int4-ov --model_repository_path models --model_name OpenVINO/InternVL2-2B-int4-ov --task text_generation
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/InternVL2-2B-int4-ov --model_name OpenVINO/InternVL2-2B-int4-ov
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model Junrui2021/Qwen3-VL-8B-Instruct-int4 --model_repository_path /models --model_name ovms-model-vl --task text_generation --pipeline_type VLM_CB
```
Damian used it in his demos; I assumed this model works better with that.
🛠 Summary
CVS-183785
Changing models used in OpenWebUI, adding sections about new agentic features.
Done [todo]: Update screenshots to use `ovms-model` model name instead of `Godreign/llama-3.2-3b-instruct-openvino-int4-model`.
🧪 Checklist