Conversation

/test all
Summary of Changes: Hello @asddongmen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request focuses on enhancing the performance of resolved timestamp calculation in the log puller.
Code Review
This pull request aims to optimize resolved timestamp calculation in the log puller using caching and a min-heap. While logservice/logpuller/regionlock/region_range_lock.go changes are sound, a critical high-severity logic error exists in logservice/logpuller/region_event_handler.go. The min-heap update logic was incorrectly placed within a throttled block, leading to potential heap inconsistency. This can cause the system to report an incorrect resolved timestamp, potentially resulting in silent data loss in CDC systems like TiCDC. It is crucial to ensure the heap is updated consistently whenever a region's resolved timestamp changes to maintain data integrity.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
📝 Walkthrough

Adds a cached minimum resolved timestamp for unlocked ranges, introduces a tableID parameter to RangeLock, implements a hack/testing mode with a lock cap and background stats loop, updates call sites and tests to pass tableID, and adds benchmarks and helpers; no core control-flow changes outside range lock internals.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ❌ 3 failed checks (2 warnings, 1 inconclusive)
Signed-off-by: dongmen <414110582@qq.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
logservice/logpuller/regionlock/region_range_lock.go (1)
307-320: Good optimization, but inconsistent with `GetHeapMinTs()`.

The `ResolvedTs()` method now uses the cached `unlockedRangesMinTs` for O(1) lookup instead of scanning all unlocked ranges. However, `GetHeapMinTs()` (lines 470-472) still calls `l.unlockedRanges.getMinTs()` directly:

```go
// GetHeapMinTs (line 470)
unlockedMinTs := l.unlockedRanges.getMinTs()
```

Consider updating `GetHeapMinTs()` to also use the cached value for consistency, or document why the difference is intentional.
```go
// HACK: for testing table 930, limit locked ranges to 30
hackLimitLock int
```
Remove debug/hack code before merging.
The hackLimitLock field and associated logic (lines 159-164, 491-496, 169-192) are clearly WIP/debug code:
- Hardcoded magic table ID `930`
- Comment says "limit locked ranges to 30" but code sets it to `2`
- Starts a goroutine that logs every 30 seconds
This should either be removed or converted to a proper testing/debugging feature with configuration flags rather than hardcoded values.
🤖 Prompt for AI Agents
In `@logservice/logpuller/regionlock/region_range_lock.go` around lines 138 - 140,
Remove the WIP debug code: delete the hackLimitLock field from the struct and
remove all code paths that reference it (the hardcoded table ID 930 checks, the
logic that force-sets the limit to 2/30, and the goroutine that logs every 30s).
If this behavior must be preserved for testing, replace it with a proper,
documented configuration flag (e.g., DebugLimitLock or LimitLockConfig) passed
into RegionRangeLock via constructor/config instead of hardcoded values;
otherwise just eliminate the field and associated branches/goroutine so no magic
IDs or periodic test logging remain.
```go
// HACK: for testing table 930, limit locked ranges to hackLimitLock
if l.hackLimitLock > 0 && l.lockedRanges.Len() >= l.hackLimitLock {
	ch := make(chan interface{}, 1)
	// Return wait status so it keeps retrying but never succeeds
	return LockRangeResult{Status: LockRangeStatusWait}, []<-chan interface{}{ch}
}
```
Goroutine leak: channel is never signaled.
This hack creates a channel that is never signaled or closed. When callers invoke WaitFn() (see lines 205-221 in LockRange), they will block forever waiting on this channel unless the context is cancelled.
```go
ch := make(chan interface{}, 1)
// ch is never written to or closed
return LockRangeResult{Status: LockRangeStatusWait}, []<-chan interface{}{ch}
```

This causes goroutine leaks for any lock attempt when hackLimitLock is reached. Even for debugging purposes, this should at least close the channel or signal it periodically.
🤖 Prompt for AI Agents
In `@logservice/logpuller/regionlock/region_range_lock.go` around lines 491 - 496,
The created buffered channel (ch) returned when l.hackLimitLock is reached is
never signaled/closed, causing callers of LockRange's WaitFn to block; modify
the hack path in region_range_lock.go (the block that checks l.hackLimitLock and
lockedRanges) to return a channel that is closed (or otherwise signaled) so
waiters unblock—e.g., create ch := make(chan interface{},1); close(ch) before
returning the []<-chan interface{} so LockRangeResult{Status:
LockRangeStatusWait} consumers and the WaitFn (the wait logic used by LockRange)
do not leak goroutines.
Signed-off-by: dongmen <414110582@qq.com>
[FORMAT CHECKER NOTIFICATION] Notice: To remove the

📖 For more info, you can check the "Contribute Code" section in the development guide.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
logservice/logpuller/regionlock/range_ts_map.go (1)
27-34: Preserve hackMode in clones
`rangeTsMap.clone` doesn't copy the new `hackMode` flag, so a clone created after `hackSplitInto` will unexpectedly allow set/unset to mutate entries. Consider copying the flag to avoid surprising behavior in hack mode.

🧩 Proposed fix

```diff
 func (m *rangeTsMap) clone() (res *rangeTsMap) {
 	res = &rangeTsMap{
 		m:     btree.NewG(16, rangeTsEntryLess),
 		start: m.start,
 		end:   m.end,
+		hackMode: m.hackMode,
 	}
 	m.m.Ascend(func(i rangeTsEntry) bool {
 		res.m.ReplaceOrInsert(i)
 		return true
 	})
 	return
 }
```
🧹 Nitpick comments (1)
logservice/logpuller/regionlock/range_ts_map.go (1)
234-250: Guard hackSplitInto against non-positive n

If `n <= 0`, the map is cleared and left empty, which can make later reads return `MaxUint64`. A small guard prevents accidental misuse.

🛡️ Suggested guard

```diff
 func (m *rangeTsMap) hackSplitInto(n int, ts uint64) {
+	if n <= 0 {
+		log.Panic("hackSplitInto requires n > 0")
+	}
 	// Clear existing entries
 	m.m.Clear(false)
```
```go
// unlockedRangesMinTs caches the minimum resolvedTs among unlocked ranges.
// It is protected by `mu`.
unlockedRangesMinTs uint64
```
🧩 Analysis chain
Repository: pingcap/ticdc

🏁 Script executed:

```shell
#!/bin/bash
# Find all occurrences of unlockedRanges with context to understand mutation patterns
rg -n --type=go -B2 -A2 'unlockedRanges' logservice/logpuller/regionlock/region_range_lock.go
```

🏁 Script executed:

```shell
#!/bin/bash
# Also check if unlockedRanges is used in other files in the same package or elsewhere
fd -e go . logservice/logpuller/regionlock/ | xargs rg -l 'unlockedRanges' 2>/dev/null
```

🏁 Script executed:

```shell
#!/bin/bash
# Get the full context of the file to understand the structure
wc -l logservice/logpuller/regionlock/region_range_lock.go
```

🏁 Script executed:

```shell
#!/bin/bash
# Find hackSplitInto implementation to understand if it changes min values
rg -n --type=go -B3 -A10 'func.*hackSplitInto'
```

🏁 Script executed:

```shell
#!/bin/bash
# Check if there are any other methods that mutate unlockedRanges outside of this file
rg -n --type=go 'unlockedRanges\s*=' logservice/logpuller/regionlock/ --type=go
```

🏁 Script executed:

```shell
#!/bin/bash
# Search more broadly for any modifications to unlockedRanges (not just assignments)
rg -n --type=go '\bunlockedRanges\b' logservice/logpuller/regionlock/region_range_lock.go | head -30
```

🏁 Script executed:

```shell
#!/bin/bash
# Get the full hackSplitInto implementation
rg -n --type=go -A30 'func \(m \*rangeTsMap\) hackSplitInto'
```

🏁 Script executed:

```shell
#!/bin/bash
# Check if there's a getMinTs call on unlockedRanges anywhere else or if cache must be manually updated
rg -n --type=go 'getMinTs|getMin' logservice/logpuller/regionlock/range_ts_map.go | head -20
```

🏁 Script executed:

```shell
#!/bin/bash
# Let's verify that hackSplitInto is only called once in constructor
rg -n --type=go 'hackSplitInto' logservice/logpuller/regionlock/
```

🏁 Script executed:

```shell
#!/bin/bash
# Double-check: see the getMinTs implementation to ensure it's working correctly
rg -n --type=go -A15 'func \(m \*rangeTsMap\) getMinTs\(\)'
```
Add cache update after hackSplitInto mutation
The hackSplitInto() call at line 162 mutates unlockedRanges but does not update unlockedRangesMinTs. While the cache value happens to be correct (all entries created have ts=startTs), this violates the synchronization principle: every path that mutates unlockedRanges should update the cache. Add an explicit cache update after hackSplitInto() to maintain consistency with the pattern used in set() and unset().
Also applies to: 289-293, 524-528
🤖 Prompt for AI Agents
In `@logservice/logpuller/regionlock/region_range_lock.go` around lines 128 - 130,
The hackSplitInto(...) call mutates the unlockedRanges map but does not update
the unlockedRangesMinTs cache; add an explicit cache update after every
hackSplitInto invocation to recompute/unify unlockedRangesMinTs (same approach
used in set() and unset()) so the cached minimum resolved ts stays consistent
with unlockedRanges; apply this fix in the three places where hackSplitInto is
called (the block around the first call, and the other two blocks referenced)
ensuring you hold mu while updating unlockedRangesMinTs.
What problem does this PR solve?
Issue Number: close #xxx
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
Chores
Tests