Releases: modelscope/sirchmunk
v0.0.7post1
Implementation & Logic Updates
- Configurable Working Directory for MCP: Added a `--work-path` CLI argument for the `mcp serve` command.
- Custom Resource Pathing: Allows users to specify a custom directory for storing environment variables, cache files, and history logs.
- CLI Argument Parsing: Simplified the logic for parsing the new `--work-path` argument.
- Dynamic Configuration Generation: Configuration file generation now populates the actual work path instead of using static placeholders.
- Environment Variable Consistency: Ensured the `MODELSCOPE_CACHE` environment variable is consistently set across all transport modes based on the specified work path.
- Docs Update: Updated documentation to reflect the new `--work-path` parameter and its usage across supported commands.
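The work-path behavior described above can be pictured with a small sketch. The function name and the `cache` subdirectory are assumptions for illustration, not the actual sirchmunk implementation; only the `MODELSCOPE_CACHE` derivation comes from the notes.

```python
import os
from pathlib import Path

def apply_work_path(work_path: str) -> dict:
    """Hypothetical sketch: resolve a --work-path value and derive the
    environment the MCP server process would run with."""
    resolved = Path(work_path).expanduser().resolve()
    env = dict(os.environ)
    # Per the notes, MODELSCOPE_CACHE is set consistently from the work
    # path; the "cache" subdirectory name here is an illustrative guess.
    env["MODELSCOPE_CACHE"] = str(resolved / "cache")
    return env

env = apply_work_path("~/sirchmunk-data")
print(env["MODELSCOPE_CACHE"])  # e.g. /home/user/sirchmunk-data/cache
```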
What's Changed
- [Fix] Hotfix mcp work path by @wangxingjun778 in #161
Full Changelog: v0.0.7...v0.0.7post1
v0.0.7
English Version
🚀 New Features
- Multi-Arch Support: Added support for multi-architecture Docker builds (e.g., `amd64`, `arm64`).
- C/S Deployment: Introduced a Client/Server deployment framework to improve scalability and remote access.
- LLM Fallback Strategy: Added an `llm_fallback` argument to allow the model to generate answers even when no retrieval evidence is found.
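A sketch of what an `llm_fallback`-style switch typically does. The function names and signature below are illustrative, not the real sirchmunk API; only the behavior (answer without evidence only when the flag is on) comes from the notes.

```python
def answer_query(query, evidences, generate, llm_fallback=False):
    """If retrieval found evidence, answer from it; otherwise either
    decline or, with llm_fallback=True, let the LLM answer unaided."""
    if evidences:
        return generate(query, context=evidences)
    if llm_fallback:
        return generate(query, context=None)
    return None

# Stub LLM for illustration.
gen = lambda q, context: f"answer({q}, grounded={context is not None})"
print(answer_query("q", [], gen))                     # None: no evidence, no fallback
print(answer_query("q", [], gen, llm_fallback=True))  # ungrounded answer
```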
🛠 Improvements & Fixes
- Docker Optimization:
- Fixed several issues in the Docker builder pipeline.
- Updated Node.js version within the Docker image for better performance and security.
- MCP Enhancement: Improved Model Context Protocol (MCP) support by automatically expanding `~` in path environment variables after loading `.env`.
- Bug Fixes: Resolved feedback issues and refined internal logic from code reviews.
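The `~`-expansion fix can be pictured as a small post-processing step after dotenv loads variables. This is only a sketch; the actual MCP code and the exact set of path-like variables it expands are assumptions.

```python
import os

# Assumed set of path-like variables; the real list is not documented here.
PATH_LIKE = {"MODELSCOPE_CACHE", "SIRCHMUNK_SEARCH_PATHS"}

def expand_tilde_vars(env: dict) -> dict:
    """Expand a leading ~ in path-like environment variables after .env load,
    leaving all other variables untouched."""
    return {
        k: os.path.expanduser(v) if k in PATH_LIKE else v
        for k, v in env.items()
    }

print(expand_tilde_vars({"MODELSCOPE_CACHE": "~/cache", "OTHER": "~keep"}))
```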
📝 Documentation & Others
- Deployment Guide: Updated README with detailed instructions for Docker deployment.
- Community: Updated the project QR code and synchronized version news for v0.0.7.
Chinese Version
🚀 New Features
- Multi-arch Docker Support: Added support for building multi-platform Docker images (e.g., `amd64`, `arm64`).
- C/S Deployment Framework: Introduced a client/server deployment mode to improve scalability and remote invocation.
- LLM Fallback Strategy: Added an `llm_fallback` parameter so the LLM can generate an answer directly when no evidence is retrieved.
🛠 Improvements & Fixes
- Docker Image Optimization:
  - Fixed several known issues in the Docker build process.
  - Upgraded the Node.js version inside the image for better stability.
- MCP Enhancement: The MCP module now automatically expands the tilde (`~`) in path environment variables after loading `.env`.
- Code Fixes: Fixed logic issues found during review and refined details.
📝 Documentation & Others
- Deployment Docs: Updated the Docker deployment instructions in the README.
- Community: Updated the project chat-group QR code and published the v0.0.7 release news.
What's Changed
- [Feat] Support multi-arch docker building by @wangxingjun778 in #146
- [Fix] Fix issues for docker builder by @wangxingjun778 in #147
- update nodejs version for docker image by @wangxingjun778 in #148
- [Documentation] Update readme for docker deployment by @wangxingjun778 in #151
- [Documentation] update QR code by @wangxingjun778 in #152
- [Feat] Add llm_fallback arg to determine llm generated answer without retrieval evidences by @wangxingjun778 in #154
- fix review issue by @wangxingjun778 in #155
- fix(mcp): expand ~ in path env vars after dotenv load by @liusining in #156
- [Feature] Add c/s deployment framework support by @wangxingjun778 in #157
- update 0.0.7 news by @wangxingjun778 in #158
New Contributors
- @liusining made their first contribution in #156
Full Changelog: v0.0.6post3...v0.0.7
v0.0.6post3
Chinese Version
Features
- FAST Mode Search Enhancement: Added multi-file evidence aggregation, a sampling-free fast path for small files (<100KB), and IDF-weighted scoring.
- Knowledge Base Optimization: Added path-scoped reuse of knowledge clusters.
- Scan Enhancement: Added a large-file sampling mechanism and PDF extraction timeout control.
Fixes & Improvements
- Search Logic: Implemented in-file match deduplication; fixed filename pagination and IDF scale consistency.
- Path Handling: Refactored path coverage computation and hardened root-directory handling.
- Stability: Fixed the log level not being correctly restored.
English Version
Features
- FAST Mode Enhancement: Added multi-file evidence aggregation, a "fast path" for small files (<100KB), and IDF-weighted scoring.
- Knowledge Reuse: Implemented path-based scoping for efficient knowledge cluster reuse.
- Scan Optimization: Introduced large-file sampling and PDF extraction timeouts.
Fixes & Improvements
- Search Logic: Added match deduplication; fixed filename pagination and IDF scale consistency.
- Path Handling: Refactored path coverage logic using `pathlib` for better root directory support.
- Stability: Fixed a bug where logger levels were not reverted after timed operations.
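IDF-weighted scoring, in its textbook form. The exact smoothing and aggregation sirchmunk uses are not documented in these notes, so the formula and function names below are only the standard formulation.

```python
import math

def idf(doc_freq: int, n_docs: int) -> float:
    """Smoothed inverse document frequency: keywords that appear in few
    files get a higher weight than ubiquitous ones."""
    return math.log((n_docs + 1) / (doc_freq + 1)) + 1.0

def score(file_keyword_hits: dict, doc_freqs: dict, n_docs: int) -> float:
    """Sum IDF-weighted hit counts for one file."""
    return sum(hits * idf(doc_freqs[kw], n_docs)
               for kw, hits in file_keyword_hits.items())

# A keyword found in 2 of 100 files outweighs one found in 90 of 100.
print(idf(2, 100) > idf(90, 100))  # True
```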
What's Changed
- [Feat] Add keywords merge weights and activate the top_k_files for FAST mode by @wangxingjun778 in #139
- [Feat] Add small file shortcut for FAST mode & fix reuse scope for knowledge by @wangxingjun778 in #140
- [Fix] fix sep for normalised_scopes by @wangxingjun778 in #141
- [feat] enhance scan functions by @suluyana in #131
Full Changelog: v0.0.6post2...v0.0.6post3
v0.0.6post2
Release Notes (English)
🚀 New Features
- OpenClaw Skill: Integrated Sirchmunk as an AI agent skill for natural language local file search. (#118)
- Offline-First Loading: Models now prioritize local cache before attempting online downloads. (#137)
- Smarter Ranking: Switched to random sampling in search ranking to reduce bias and improve recall. (#123)
- Enhanced API: Added SSE streaming for real-time log output and optimized core dependencies. (#118)
🐞 Bug Fixes
- IME Handling: Fixed accidental message sending when pressing "Enter" during Chinese/CJK input. (#128)
- Parsing & Prompts: Improved LLM response parsing and fixed a `KeyError` in prompt formatting. (#130, #133)
- CLI Stability: Fixed a startup crash caused by a missing default parameter in the print function. (#134)
📝 Others
- Docs: Visual polish with emojis and updated integration guides. (#119)
Release Notes (Chinese, abridged)
🚀 New Features
- OpenClaw Skill Integration: AI agents can call Sirchmunk for natural-language search over local files. (#118)
- Offline-First Loading: Embedding models are read from the local cache first, markedly improving stability in network-restricted environments. (#137)
- Search Ranking Optimization: Introduced a random-sampling strategy to reduce position bias and improve recall. (#123)
- API Enhancement: Added SSE streaming log output and trimmed the core runtime dependencies. (#118)
🐞 Fixes
- IME Handling: Fixed messages being sent accidentally when pressing Enter during Chinese input. (#128)
- Prompt Formatting: Fixed a `KeyError` raised during prompt formatting and improved parsing of JSON returned by the LLM. (#130, #133)
- Startup Fix: Fixed a CLI startup crash caused by a missing parameter. (#134)
📝 Others
- Docs: Polished the README visuals and the OpenClaw integration guide. (#119)
What's Changed
- add recipes for openclaw by @wangxingjun778 in #118
- update docs by @wangxingjun778 in #119
- fix: Prevent key handling during composition events by @youngjuning in #128
- Improve LLM rank candidate selection logic by @coffee3699 in #123
- fix prompts by @wangxingjun778 in #130
- add placeholder for keywords prompts by @wangxingjun778 in #133
- fix: Add default value to _print function parameter by @xerrors in #134
- Add offline-first loading for embedding utils by @wangxingjun778 in #137
New Contributors
- @youngjuning made their first contribution in #128
- @coffee3699 made their first contribution in #123
- @xerrors made their first contribution in #134
Full Changelog: v0.0.6post1...v0.0.6post2
v0.0.6post1
Release Notes (English)
🚀 New Features
- SSE Log-Events: Integrated Server-Sent Events (SSE) for real-time log streaming and updated core library requirements for improved environment stability. (#116)
- Thinking Content Logging: Added support for capturing and logging the "thinking" (reasoning) process of LLMs. (#109)
- Embedding Configuration: Introduced flexible configuration options for embedding models. (#115)
- MiniMax Upgrade: Upgraded the default MiniMax model to version M2.7. (#113)
🛠️ Bug Fixes
- Context Retrieval Fix: Resolved a critical issue where the `evidences` list was empty when `return_context=True`. (#111)
- API Routing: Corrected endpoint paths for `api/v1` to ensure consistent routing. (#117)
- Docker Fixes: Fixed `docker run` command errors and improved container execution flows. (#103)
📝 Documentation & Deployment
- Docker Optimization: Comprehensive updates to Docker deployment instructions and the dedicated Docker README. (#102, #104)
- MiniMax Guide: Added detailed configuration examples for the MiniMax LLM provider. (#108)
- README Updates: General improvements and formatting updates to the main README. (#106)
Release Notes (Chinese)
🚀 New Features
- SSE Log Events: Integrated Server-Sent Events (SSE) for real-time log streaming and tidied the core environment dependencies. (#116)
- Thinking Content Logging: Added logging support for the LLM's "thinking" (chain-of-thought) content. (#109)
- Embedding Configuration: Added support for custom embedding model configuration. (#115)
- MiniMax Upgrade: Upgraded the default MiniMax model version to M2.7. (#113)
🛠️ Fixes
- Context Retrieval Fix: Fixed the search evidence (`evidences`) coming back empty when `return_context=True`. (#111)
- API Routing: Corrected path mappings under `api/v1` so endpoints resolve correctly. (#117)
- Docker Fix: Resolved `docker run` failures and improved container startup logic. (#103)
📝 Documentation & Deployment
- Docker Deployment: Comprehensively updated the Docker deployment guide and the dedicated Docker README. (#102, #104)
- MiniMax Guide: Added detailed configuration examples for the MiniMax LLM provider. (#108)
- README Updates: General improvements and content updates to the main README. (#106)
What's Changed
- update docker deployment in readme by @wangxingjun778 in #102
- Fix docker run by @wangxingjun778 in #103
- update docker readme by @wangxingjun778 in #104
- Update readme by @wangxingjun778 in #106
- docs: add MiniMax LLM provider configuration examples by @octo-patch in #108
- add thinking content logging by @wangxingjun778 in #109
- Fix evidences is empty when return_context=True by @wangxingjun778 in #111
- feat: upgrade MiniMax default model to M2.7 by @octo-patch in #113
- add sse log-events and fix core requirements by @wangxingjun778 in #116
- [feat] embedding config by @suluyana in #115
- fix paths for api/v1 by @wangxingjun778 in #117
New Contributors
- @octo-patch made their first contribution in #108
- @suluyana made their first contribution in #115
Full Changelog: v0.0.6...v0.0.6post1
v0.0.6
Chinese Version
New Features
- Multi-turn Conversation Support: Implemented multi-turn context management across all modes with an LLM-based query rewriting mechanism; added `CHAT_HISTORY_MAX_TURNS` and `CHAT_HISTORY_MAX_TOKENS` settings; raised the default token budget for search operations to 128,000.
- Document Summarization & Cross-lingual Retrieval: Added a document summarization pipeline (chunking/merging/reranking), cross-lingual keyword extraction, an optimized directory-scanning strategy, and chat-history relevance filtering.
- Docker Build Configuration Update: Added support for environment variables such as `SIRCHMUNK_SEARCH_PATHS`; `entrypoint.sh` now supports variable injection; added document-processing dependencies such as python-docx, openpyxl, and python-pptx.
Refactors & Fixes
- OpenAI Chat Client Refactor: Introduced `_ProviderProfile` to manage capability differences across providers in one place, auto-detect the provider from `base_url`, unify streaming/non-streaming response handling, and support a new `thinking_content` field.
- OpenAI API Compatibility Fix: Fixed the `enable_thinking` parameter erroring against the standard OpenAI API; provider-aware logic now injects the parameter only for models that support it.
Documentation
- Added a DingTalk group QR code
English Version
New Features
- Multi-turn Conversation Support: Implemented context management across all chat modes with LLM-based query rewriting; added `CHAT_HISTORY_MAX_TURNS` and `CHAT_HISTORY_MAX_TOKENS` configs; increased the default search token budget to 128K.
- Document Summarization & Cross-lingual Retrieval: Added a summarization pipeline (chunking/merging/reranking), cross-lingual keyword extraction, optimized directory scanning with stratified sampling, and chat history relevance filtering.
- Docker Build Configuration Update: Added support for `SIRCHMUNK_SEARCH_PATHS` and other env vars, updated `entrypoint.sh` for variable injection, and added document processing dependencies (python-docx, openpyxl, python-pptx).
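The variable names below come straight from the notes above; the values are purely illustrative defaults, and the path-list separator is an assumption.

```shell
# Hypothetical .env fragment — names from the release notes, values illustrative.
CHAT_HISTORY_MAX_TURNS=10        # how many past turns are kept
CHAT_HISTORY_MAX_TOKENS=4096     # token cap for history sent to the model
SIRCHMUNK_SEARCH_PATHS=/data/docs:/data/notes   # ":"-separated list is a guess
```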
Refactors & Fixes
- OpenAI Chat Client Refactor: Introduced `_ProviderProfile` to manage multi-provider capabilities, auto-detect the provider via `base_url`, unify streaming/non-streaming handling, and add an optional `thinking_content` field.
- OpenAI API Compatibility Fix: Fixed an `enable_thinking` parameter error with the standard OpenAI API; implemented provider-aware logic to inject parameters only for supported backends.
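A rough sketch of the provider-aware idea. The actual `_ProviderProfile` fields, provider list, and detection rules are not shown in these notes, so the substrings and provider names below are assumptions.

```python
def detect_provider(base_url: str) -> str:
    """Guess the backend from base_url; the substrings are assumptions."""
    url = (base_url or "").lower()
    if "dashscope" in url:
        return "dashscope"
    if "minimax" in url:
        return "minimax"
    return "openai"  # default: a standard OpenAI-compatible API

def extra_chat_kwargs(provider: str, enable_thinking: bool) -> dict:
    """Inject enable_thinking only for backends assumed to accept it,
    so a standard OpenAI endpoint never sees the unknown parameter."""
    if enable_thinking and provider == "dashscope":
        return {"enable_thinking": True}
    return {}

print(extra_chat_kwargs(detect_provider("https://api.openai.com/v1"), True))  # {}
```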
Documentation
- Added DingTalk QR code
What's Changed
- [Docs] Add dingtalk QR code by @wangxingjun778 in #91
- [Feature] Add multi-turn conversation support by @wangxingjun778 in #93
- fix: the parameter enable_thinking is unknown in standard OpenAI API … by @VictorZhang2014 in #88
- [Refactor] Refactor openai chat by @wangxingjun778 in #95
- [Feature] Add summary intent and trans-lang retrieval by @wangxingjun778 in #97
- [Feature] Update docker building config by @wangxingjun778 in #98
New Contributors
- @VictorZhang2014 made their first contribution in #88
Full Changelog: v0.0.5post2...v0.0.6
v0.0.5post2
Chinese Version
🚀 Improvements
- LLM Response Quality Gating: Added a quality-evaluation step to the LLM prompting and search logic. The system now checks whether a generated summary meets the bar and caches only high-quality results, preventing low-quality content from reaching the persistence layer.
- Enhanced CLI Parameter Support: Added a `--work-path` parameter to the `sirchmunk serve` and `sirchmunk web serve` commands, letting users customize the working directory.
- Improved CLI Experience: Refined the overall command-line interaction logic for better usability.
🐞 Bug Fixes
- Fixed Chat Intent Recognition: Resolved inaccurate intent recognition during conversations, improving response precision.
English Version
🚀 New Features & Improvements
- LLM Response Quality Gating: Introduced a quality evaluation step for LLM prompts and search logic. The system now determines if a generated summary is substantial enough to be cached, preventing the persistence of low-quality results.
- Enhanced CLI Parameter Support: Added `--work-path` support to both the `sirchmunk serve` and `sirchmunk web serve` commands, allowing users to specify a custom working directory.
- Improved CLI Usage: Refined the command-line interface logic for a more seamless and flexible user experience.
🐞 Bug Fixes
- Fix Chat Intent: Resolved issues with chat intent recognition to ensure more accurate and context-aware responses.
Full Changelog: v0.0.5post1...v0.0.5post2
v0.0.5post1
Chinese Version
🚀 Features & Improvements
- Added `SIRCHMUNK_SEARCH_PATHS` Environment Variable:
  - Introduced a new environment variable, `SIRCHMUNK_SEARCH_PATHS`, that lets users define the search scope via a configured list of paths.
  - This makes the system more flexible with distributed or multi-directory data sources and makes it easy to point searches at a local knowledge base.
- Optimized Knowledge Reuse (Fix reuse):
  - Fixed and refined the reuse logic for search results. The system now identifies and reuses existing knowledge clusters (KnowledgeCluster) more accurately, reducing redundant computation and noticeably improving response time and resource utilization for consecutive queries.
🛠 Fixes & Refinements
- Improved the stability of search path parsing.
- Improved cache handling in the search pipeline to ensure data consistency during reuse.
English Version
🚀 Features & Improvements
- Added `SIRCHMUNK_SEARCH_PATHS` Environment Variable:
  - Introduced a new environment variable, `SIRCHMUNK_SEARCH_PATHS`, that allows users to define search scopes by providing a list of directory paths.
  - This enhancement provides greater flexibility for managing distributed or multi-directory data sources, making it easier to specify local knowledge base locations.
- Optimized Knowledge Reuse (Fix reuse):
  - Refined and fixed the logic for reusing search results. The system can now more accurately identify and leverage existing KnowledgeClusters, reducing redundant computations and significantly improving response times for consecutive queries.
🛠 Bug Fixes & Refinement
- Enhanced the stability of search path parsing.
- Improved cache handling logic within the search pipeline to ensure data consistency during the reuse process.
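How a client might set and parse the variable. This is only a sketch: the separator sirchmunk actually expects is not stated in these notes, so `os.pathsep` (":" on POSIX, ";" on Windows) is a guess, and the parser name is hypothetical.

```python
import os

def parse_search_paths(raw: str) -> list[str]:
    """Split a SIRCHMUNK_SEARCH_PATHS-style value into user-expanded paths,
    skipping empty entries. The os.pathsep separator is an assumption."""
    return [os.path.expanduser(p) for p in raw.split(os.pathsep) if p.strip()]

os.environ["SIRCHMUNK_SEARCH_PATHS"] = os.pathsep.join(["~/docs", "/data/kb"])
print(parse_search_paths(os.environ["SIRCHMUNK_SEARCH_PATHS"]))
```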
Full Changelog: v0.0.5...v0.0.5post1
v0.0.5
Chinese Version
⚠️ Breaking Changes
- [Refactor] Refactored the `search()` entry point return type (#80)
  - Change: `search()` now returns either a `SearchContext` object or a `str` answer, instead of the previous raw tuple or string; the return type is controlled by `return_context` (True or False, default False). The `return_cluster` parameter was removed, and the returned KnowledgeCluster object is now folded into `SearchContext`.
  - Impact: With `return_context=True`, callers must adapt their code to the new `SearchContext` structure to obtain search results and context.
🚀 New Features
- [Feature] Introduced a unified `SearchContext` structure (#80): a standardized context object encapsulating search results, metadata, and references, improving API extensibility.
🐛 Fixes
- [Fix] Improved RAG chat stability (#77): Added a retry mechanism and fine-grained exception handling for RAG conversations, noticeably reducing request failures caused by network jitter.
- [Fix] Fixed MCP startup (#78): Resolved `mcp run` failing to initialize correctly in FastMCP mode.
English Version
⚠️ Breaking Changes
- [Refactor] Unified `search()` entry point return type (#80)
  - Change: The `search()` function now returns either a `SearchContext` object or a plain `str` answer, replacing the previous tuple/string return logic.
  - Return Logic: The return type is now controlled by the `return_context` parameter (defaults to `False`).
  - Deprecation: The `return_cluster` parameter has been removed. The `KnowledgeCluster` object is now integrated directly into the `SearchContext` structure.
  - Impact: If `return_context=True` is used, callers must update their code to handle the `SearchContext` object instead of the legacy return format.
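The new return contract can be illustrated with a stub. The field names on `SearchContext` and the stub body are assumptions; only the `return_context` switch and the cluster being folded into the context come from the notes.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Union

@dataclass
class SearchContext:
    """Stand-in for sirchmunk's SearchContext; field names are guesses."""
    answer: str = ""
    evidences: list = field(default_factory=list)
    cluster: Optional[Any] = None  # the KnowledgeCluster now lives here

def search(query: str, return_context: bool = False) -> Union[str, SearchContext]:
    # Stub body: a real call would run retrieval and generation.
    ctx = SearchContext(answer=f"stub answer for: {query}")
    return ctx if return_context else ctx.answer  # plain str by default

print(type(search("q")).__name__)                       # str
print(type(search("q", return_context=True)).__name__)  # SearchContext
```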
🚀 Features
- [Feature] Unified `SearchContext` structure (#80): Introduced a standardized context object to wrap search results, metadata, and references, improving API extensibility.
🐛 Bug Fixes
- [Fix] Enhanced RAG chat stability (#77): Added retry logic and fine-grained exception handling for RAG-based chats to improve reliability against transient network issues.
- [Fix] Fixed MCP execution (#78): Resolved an issue preventing the `mcp run` command from initializing correctly in FastMCP mode.
Full Changelog: v0.0.4post1...v0.0.5
v0.0.4post1
Chinese Version
🚀 Improvements
- [Refactor] Refactored the search entry function (#71)
  - Introduced a reuse mechanism for FAST mode.
  - Cleaned up the code structure, decoupling `DEEP` mode logic and unifying return types.
🐛 Fixes
- [Fix] Fixed web initialization for PyPI installs (#76)
  - Source files can now be copied from `site-packages` into a writable cache area, fixing the inability to build the frontend after installation.
  - Improved the packaging configuration to ensure web static assets are included in the release package.
English Version
🚀 Improvements
- [Refactor] Refactor search func (#71)
  - Added knowledge reuse for the `FAST` search mode.
  - Decoupled `DEEP` mode logic and unified search return types for better maintainability.
🐛 Bug Fixes
- [Fix] Fix web init for pypi installation (#76)
- Enabled building from a writable cache to support non-editable PyPI installations.
  - Refined packaging configuration to ensure `_web` assets are correctly bundled.