Conversation
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @Leavrth, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new mechanism to retrieve the physical timestamp from PD, ensuring that time-dependent operations, particularly those related to generating date strings for cloud storage, use a consistent and incrementally reliable time source. By switching to PD physical time and UTC, the system enhances the accuracy and predictability of time-based functionalities, addressing potential issues with non-incremental time or timezone discrepancies. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new method CurrentPhysicalTS to the Clock interface to provide a time based on PD's physical timestamp. This is then used in the cloud storage sink to generate date-based file paths. The main goal seems to be to use a non-extrapolated, UTC-based time for this purpose.
My review has identified a few areas for improvement:
- The new
CurrentPhysicalTSmethod in the mainclockimplementation is not strictly incremental, which can be misleading given the PR title. I've provided a suggestion to make it strictly incremental if that's the desired behavior. - The test implementations of
CurrentPhysicalTSinClock4TestandmonotonicClockare inconsistent with the main implementation regarding the timezone (they return local time instead of UTC). I've suggested changes to align them. - A comment in
pkg/sink/cloudstorage/path.gohas become outdated due to the changes and should be removed or updated.
| func (c *clock) CurrentPhysicalTS() time.Time { | ||
| c.mu.RLock() | ||
| defer c.mu.RUnlock() | ||
| return time.UnixMilli(c.mu.physicalTs) | ||
| } |
There was a problem hiding this comment.
The function CurrentPhysicalTS is not strictly incremental as implemented, which contradicts the PR title "make sure time incremental". It returns the last physical timestamp from PD, so two successive calls within the PD update interval can return the same value.
If a strictly incremental UTC time is desired, you should add the elapsed time since the last update.
If a non-extrapolated, "safe" timestamp is the goal, the implementation is fine, but the test in clock_test.go should assert non-decreasing time (!pt1.After(pt2)) instead of strict increment to avoid flakiness.
Assuming strictly incremental time is desired, here is a suggestion:
| func (c *clock) CurrentPhysicalTS() time.Time { | |
| c.mu.RLock() | |
| defer c.mu.RUnlock() | |
| return time.UnixMilli(c.mu.physicalTs) | |
| } | |
| func (c *clock) CurrentPhysicalTS() time.Time { | |
| c.mu.RLock() | |
| defer c.mu.RUnlock() | |
| elapsed := time.Since(c.mu.tsProcessingTime).Milliseconds() | |
| physical := c.mu.physicalTs + elapsed | |
| return time.UnixMilli(physical) | |
| } |
| } | ||
|
|
||
| func (c *Clock4Test) CurrentPhysicalTS() time.Time { | ||
| return c.CurrentTime() |
There was a problem hiding this comment.
The main clock implementation of CurrentPhysicalTS returns a time.Time in UTC. However, this test implementation returns c.CurrentTime(), which returns a time in the local timezone. This inconsistency can lead to subtle issues in tests. To be consistent, this should also return a UTC time.
| return c.CurrentTime() | |
| return c.CurrentTime().UTC() |
| } | ||
|
|
||
| func (c *monotonicClock) CurrentPhysicalTS() time.Time { | ||
| return c.clock.Now() |
There was a problem hiding this comment.
The main clock implementation of CurrentPhysicalTS returns a time.Time in UTC. However, this implementation returns c.clock.Now(), which is in the local timezone. This inconsistency can lead to subtle issues. To be consistent, this should also return a UTC time.
| return c.clock.Now() | |
| return c.clock.Now().UTC() |
|
|
||
| currTime := f.pdClock.CurrentTime() | ||
| currTime := f.pdClock.CurrentPhysicalTS().UTC() | ||
| // Note: `dateStr` is formatted using local TZ. |
What problem does this PR solve?
Issue Number: close #4073
What is changed and how it works?
use PD physical time and UTC location instead.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note