
OpenWebUI update - new features and gpt as a main model#4102

Open
przepeck wants to merge 17 commits into main from przepeck/openwebui_update

Conversation

@przepeck (Collaborator) commented Mar 31, 2026

🛠 Summary

CVS-183785
Changing models used in OpenWebUI, adding sections about new agentic features.

Done: updated screenshots to use the ovms-model model name instead of Godreign/llama-3.2-3b-instruct-openvino-int4-model.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

Copilot AI (Contributor) left a comment

Pull request overview

Updates the OpenWebUI integration demo documentation to use newer default models (including a VLM) and adds guidance for newer “agentic” OpenWebUI features like Web Search, Memory, and Code Interpreter.

Changes:

  • Switched the primary chat model example to OpenVINO/gpt-oss-20b-int4-ov and standardized the OpenWebUI Model ID to ovms-model.
  • Replaced the VLM example model with Junrui2021/Qwen3-VL-8B-Instruct-int4 and added a new screenshot for image upload.
  • Added new documentation sections for Web Search, Memory/context, and Code Interpreter configuration in OpenWebUI.

Reviewed changes

Copilot reviewed 1 out of 13 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| demos/integration_with_OpenWebUI/README.md | Updates model pull/config instructions and adds new OpenWebUI feature sections (Web Search, Memory, Code Interpreter). |
| demos/integration_with_OpenWebUI/upload_images.png | Adds/updates a screenshot used by the VLM "upload images" step. |

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@przepeck changed the title from "[WIP] OpenWebUI update" to "OpenWebUI update" on Apr 3, 2026
@przepeck changed the title from "OpenWebUI update" to "OpenWebUI update - new features and gpt as a main model" on Apr 3, 2026
@przepeck (Collaborator, Author) commented Apr 7, 2026

@atobiszei atobiszei requested a review from Copilot April 7, 2026 11:44
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 2 out of 23 changed files in this pull request and generated 3 comments.

@atobiszei atobiszei requested a review from Copilot April 7, 2026 12:29
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 2 out of 27 changed files in this pull request and generated 1 comment.

```bash
mkdir models
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_repository_path /models --task text_generation
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path Godreign/llama-3.2-3b-instruct-openvino-int4-model --model_name Godreign/llama-3.2-3b-instruct-openvino-int4-model
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/gpt-oss-20b-int4-ov --model_repository_path /models --task text_generation --tool_parser gptoss --reasoning_parser gptoss
```
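The gpt-oss pull above has no visible registration step. By analogy with the llama-3.2 commands, a sketch of what registering it under the `ovms-model` Model ID could look like (the flag usage mirrors the earlier `--add_to_config` command; the exact model path is an assumption based on the pull destination):

```shell
# Sketch: register the pulled gpt-oss model in the shared config under the
# name that OpenWebUI is configured to use ("ovms-model"); path is assumed
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly \
  --add_to_config --config_path /models/config.json \
  --model_path OpenVINO/gpt-oss-20b-int4-ov --model_name ovms-model
```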
Copilot AI commented Apr 7, 2026

There’s a trailing whitespace at the end of this Docker command line. Trimming it avoids noisy diffs and occasional copy/paste quirks.

Suggested change (same command with the trailing whitespace removed):

```bash
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/gpt-oss-20b-int4-ov --model_repository_path /models --task text_generation --tool_parser gptoss --reasoning_parser gptoss
```

@@ -73,4 +73,4 @@
> **Important Note**: While using NPU device for acceleration or model gpt-oss-20b with GPU, it is recommended to disable `Follow-Up Auto-Generation` in `Settings > Interface` menu. It will improve response time and avoid queuing requests. For gpt-oss model it will avoid concurrent execution which in version 2026.0 has an accuracy issue.
A Collaborator commented:
This workaround is needed only for NPU; the gpt-oss issue is fixed.

@@ -17,4 +17,4 @@
* [Docker Engine](https://docs.docker.com/engine/) installed
* Host with x86_64 architecture
* Linux, macOS, or Windows
A Collaborator commented:

Drop macOS.

* [Docker Engine](https://docs.docker.com/engine/) installed
* Host with x86_64 architecture
* Linux, macOS, or Windows
* Python 3.11 with pip
A Collaborator commented:

Only Python 3.11?

A Collaborator commented:

While the Open WebUI pip package allows Python >=3.11,<3.13.0a1, the install instructions at https://pypi.org/project/open-webui/ recommend using 3.11.
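Following that recommendation, a minimal install sketch (assumes a `python3.11` binary is on PATH; `pip install open-webui` and `open-webui serve` are the steps documented on the PyPI page):

```shell
# Create an isolated Python 3.11 environment and install Open WebUI
python3.11 -m venv owui-env
source owui-env/bin/activate
pip install open-webui
# Start the UI (listens on http://localhost:8080 by default)
open-webui serve
```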


### Prerequisites

In this demo, OpenVINO Model Server is deployed on Linux with CPU using Docker and Open WebUI is installed via Python pip. Requirements to follow this demo:
A Collaborator commented:

Let's make it GPU by default, with an option to switch to CPU.
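A sketch of what a GPU-default deployment could look like. The device passthrough flags are standard Docker/OVMS usage on Linux, but the `weekly-gpu` image tag, the model path under `/models`, and the render group lookup are assumptions that vary per host:

```shell
# Sketch: serve the pulled model on GPU; pass the render device into the
# container and add its group so the server process can access it
docker run -d --rm -p 8000:8000 \
  --device /dev/dri \
  --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
  -v $PWD/models:/models openvino/model_server:weekly-gpu \
  --rest_port 8000 --model_path /models/OpenVINO/gpt-oss-20b-int4-ov \
  --model_name ovms-model --target_device GPU
# To switch to CPU: drop --device/--group-add, use the CPU image,
# and set --target_device CPU
```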

> **Important Note**: While using NPU device for acceleration or model gpt-oss-20b with GPU, it is recommended to disable `Follow-Up Auto-Generation` in `Settings > Interface` menu. It will improve response time and avoid queuing requests. For gpt-oss model it will avoid concurrent execution which in version 2026.0 has an accuracy issue.

### References
[https://docs.openvino.ai/2026/model-server/ovms_demos_continuous_batching.html](https://docs.openvino.ai/2026/model-server/ovms_demos_continuous_batching.html#model-preparation)
A Collaborator commented:

Is this reference still relevant?


A Collaborator commented:

Add info about Native Tool Calling.

A Collaborator commented:

It's in there: (screenshot)

A Collaborator commented:

For gpt-oss it will be `"reasoning_effort": "low"`.
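For illustration, a hedged request sketch showing where that parameter sits in a chat completions call (the `/v3/chat/completions` path is OVMS's OpenAI-compatible endpoint; the port and `ovms-model` name assume the deployment described in this demo):

```shell
# Sketch: low reasoning effort for gpt-oss via the OpenAI-compatible API
curl -s http://localhost:8000/v3/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ovms-model",
        "reasoning_effort": "low",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```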

@przepeck (Collaborator, Author) replied:

Done

```bash
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model OpenVINO/InternVL2-2B-int4-ov --model_repository_path /models --model_name OpenVINO/InternVL2-2B-int4-ov --task text_generation
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --add_to_config --config_path /models/config.json --model_path OpenVINO/InternVL2-2B-int4-ov --model_name OpenVINO/InternVL2-2B-int4-ov
docker run --rm -u $(id -u):$(id -g) -v $PWD/models:/models openvino/model_server:weekly --pull --source_model Junrui2021/Qwen3-VL-8B-Instruct-int4 --model_repository_path /models --model_name ovms-model-vl --task text_generation --pipeline_type VLM_CB
```
A Collaborator commented:

Is `--pipeline_type` needed?

@przepeck (Collaborator, Author) replied:

Damian used it in his demos; I assumed this model works better with it.
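To illustrate the "upload images" step outside the UI, a hedged sketch of an OpenAI-style image request to the VLM (the `ovms-model-vl` name comes from the commands above; the port, the `/v3` path, and the `cat.jpg` filename are assumptions for this example):

```shell
# Sketch: base64-encode a local image (hypothetical cat.jpg) and send it
# to the VLM through the OpenAI-compatible chat completions API
IMG=$(base64 -w0 cat.jpg)
curl -s http://localhost:8000/v3/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ovms-model-vl",
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": "data:image/jpeg;base64,'"$IMG"'"}}
          ]
        }]
      }'
```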
