Skip to content

feat(ctl): add GCCatalogCache command for on-demand CN catalog cache GC#24264

Open
aptend wants to merge 2 commits intomatrixorigin:mainfrom
aptend:feat/gc-catalog-cache-cmd
Open

feat(ctl): add GCCatalogCache command for on-demand CN catalog cache GC#24264
aptend wants to merge 2 commits intomatrixorigin:mainfrom
aptend:feat/gc-catalog-cache-cmd

Conversation

@aptend
Copy link
Copy Markdown
Contributor

@aptend aptend commented Apr 30, 2026

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #24263

What this PR does / why we need it:

Adds mo_ctl('cn', 'GCCatalogCache', '<minutes>') to trigger immediate GC of the CN catalog cache. Currently GC only runs on a 20-min ticker with no manual trigger. This helps during debugging, after bulk DDL, or memory pressure investigation.

Default cleans entries older than 20 minutes; parameter sets custom minutes (minimum 1).

Changes:

  • pkg/vm/engine/types.go: new CatalogCacheGCer interface
  • pkg/vm/engine/disttae/engine.go: implement GCCatalogCache
  • pkg/sql/plan/function/ctl/types.go: register method
  • pkg/sql/plan/function/ctl/cmd_gc_catalog.go: handler
  • pkg/sql/plan/function/ctl/cmd_gc_catalog_test.go: unit tests

Add mo_ctl('cn', 'GCCatalogCache', '<minutes>') to trigger immediate
garbage collection of the CN in-memory catalog cache. Default cleans
entries older than 20 minutes; minimum is 1 minute.

Fixes matrixorigin#24263
Copilot AI review requested due to automatic review settings April 30, 2026 09:48
@qodo-code-review
Copy link
Copy Markdown

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new mo_ctl command to trigger on-demand garbage collection of the CN in-memory catalog cache, complementing the existing periodic GC ticker in disttae.

Changes:

  • Introduces an optional engine interface (CatalogCacheGCer) for manual catalog cache GC.
  • Implements GCCatalogCache on disttae.Engine to GC by “entries older than X duration”.
  • Registers GCCatalogCache in mo_ctl and adds a dedicated handler + unit tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/vm/engine/types.go Adds CatalogCacheGCer optional interface for engines supporting on-demand catalog cache GC.
pkg/vm/engine/disttae/engine.go Implements GCCatalogCache on disttae engine by computing a cutoff timestamp and calling catalog.GC().
pkg/sql/plan/function/ctl/types.go Registers the new GCCatalogCache ctl method name and handler.
pkg/sql/plan/function/ctl/cmd_gc_catalog.go Implements the ctl handler: parses minutes parameter, clamps minimum, invokes engine GC.
pkg/sql/plan/function/ctl/cmd_gc_catalog_test.go Adds unit tests for service restriction, default/custom minutes, clamp, invalid param, and unsupported engine.

Comment thread pkg/sql/plan/function/ctl/cmd_gc_catalog.go Outdated
Comment thread pkg/sql/plan/function/ctl/cmd_gc_catalog.go
Comment thread pkg/sql/plan/function/ctl/cmd_gc_catalog_test.go
@Ariznawlll
Copy link
Copy Markdown
Contributor

/auto-test-pr

@Ariznawlll
Copy link
Copy Markdown
Contributor

PR #24264 测试覆盖分析

变更摘要

本 PR 新增 mo_ctl('cn', 'GCCatalogCache', '<minutes>') 命令,实现对 CN catalog cache 的手动垃圾回收(GC)。主要改动包括:

  • 新增 CatalogCacheGCer 接口(pkg/vm/engine/types.go
  • disttae 引擎实现 GCCatalogCache 方法(pkg/vm/engine/disttae/engine.go
  • 注册 ctl 命令处理器(pkg/sql/plan/function/ctl/types.go
  • 新增命令处理逻辑及单元测试(pkg/sql/plan/function/ctl/cmd_gc_catalog.gocmd_gc_catalog_test.go

6 类测试覆盖情况

测试类型 覆盖状态 说明
BVT ⚠️ 新增 ctl 命令,需补充 SQL 层 BVT case 验证 mo_ctl('cn', 'GCCatalogCache', ...) 行为(目前仅有 Go 单元测试)。
稳定性 此功能为调试/手动 GC,无需长时间稳定性测试。
Chaos 不涉及故障注入场景。
大数据 ⚠️ Catalog cache GC 可能影响大规模 DDL 后内存释放,建议补充大数据场景下 bulk DDL 后 GC 命令行为验证。
PITR 与 PITR 备份恢复无关。
Snapshot 与 Snapshot 备份恢复无关。

图例: ✅ 已覆盖 ⚠️ 需补充 ➖ 不相关

建议

BVT 补充建议

新增 SQL 层 BVT case,覆盖以下场景:

  • 默认参数:select mo_ctl('cn', 'GCCatalogCache', '');
  • 自定义参数:select mo_ctl('cn', 'GCCatalogCache', '5');
  • 错误参数:select mo_ctl('cn', 'GCCatalogCache', 'abc');
  • 非 CN 调用:select mo_ctl('tn', 'GCCatalogCache', '');

示例 SQL:

-- 默认 GC
select mo_ctl('cn', 'GCCatalogCache', '');

-- 自定义分钟
select mo_ctl('cn', 'GCCatalogCache', '5');

-- 错误参数
select mo_ctl('cn', 'GCCatalogCache', 'abc');

-- 非 CN 调用
select mo_ctl('tn', 'GCCatalogCache', '');

大数据测试建议

建议在大数据测试场景(如大规模 DDL 后)插入 GC 命令,验证 catalog cache 能及时释放内存。例如:

  • 在 bulk DDL 后执行 select mo_ctl('cn', 'GCCatalogCache', '1');,观察内存变化及命令返回。

示例 SQL:

-- 大数据场景下 bulk DDL 后
CREATE TABLE t1(id INT);
-- ...批量 DDL/数据加载...
select mo_ctl('cn', 'GCCatalogCache', '1');

🤖 自动生成的测试 PR

➖ BVT:已有重复或高度相似测试,跳过新增。生成文件 test/distributed/cases/function/ctl/ctl_gc_catalog_cache_basic.sql 命中已有文件 test/distributed/cases/function/mo_ctl/prefetch_on_subscribed.sql(相似度 1.00)。
✅ 大数据 PR 已提交:https://github.com/Ariznawlll/mo-nightly-regression/pull/8


由 AI Test Analyzer 自动生成 · workflow run

Copy link
Copy Markdown
Contributor

@aunjgr aunjgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Blocking

  1. Data race on CatalogCache.GC()tidwall/btree is not goroutine-safe. The background GC ticker runs from a single goroutine, but this new ctl command can be invoked from any user session concurrently. Two simultaneous calls (or manual + background ticker) will race on btree Scan/Delete. Either add a mutex in CatalogCache.GC(), serialize via channel to the existing GC goroutine, or add a lock in Engine.GCCatalogCache.

  2. Integer overflow on large parametertime.Duration(minutes) * time.Minute overflows int64 when minutes > ~153,722,867. Overflow flips the duration negative, making the timestamp time.Now() - (negative) = future timestamp, which causes GC() to delete all catalog cache entries. Need an upper bound clamp (e.g., if minutes > 525600 { minutes = 525600 }).

Advisory

  • Default 20min is more aggressive than the background GC's 1-hour cutoff (gcPartitionStateTimer). Worth documenting so operators understand implications.
  • GCReport return value from e.catalog.Load().GC() is discarded — returning stats ("GC'd N table entries, M database entries") in the response would make this command useful for debugging.
  • CatalogCacheGCer interface method should return error for forward-compatibility (context cancellation, etc).
  • No test for concurrent invocation or negative input values.

@aptend
Copy link
Copy Markdown
Contributor Author

aptend commented May 6, 2026

@aunjgr Addressed the review in 3f56c5b.

Blocking:

  • Added CatalogCache.gcMu and lock CatalogCache.GC() itself, so manual GC and the background ticker cannot concurrently scan/delete the underlying tidwall/btree.
  • Added an upper bound before time.Duration(minutes) * time.Minute; too-large values now return ErrInvalidInput instead of overflowing.

Advisory:

  • Changed CatalogCacheGCer.GCCatalogCache to return error and propagated it through the ctl handler.
  • Added tests for overflow input and concurrent CatalogCache.GC() invocation.

I kept the existing 20min default behavior unchanged and did not surface GCReport stats in this patch to keep the review fix scoped. Tested with:

CGO_CFLAGS="-I/home/mo/matrixone/thirdparties/install/include" \
CGO_LDFLAGS="-L/home/mo/matrixone/thirdparties/install/lib -Wl,-rpath,/home/mo/matrixone/thirdparties/install/lib" \
go test ./pkg/sql/plan/function/ctl ./pkg/vm/engine/disttae/cache

@aptend aptend requested a review from aunjgr May 6, 2026 06:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/feature size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants