fix(enrichment): preserve memory enrichment table state on reload#25547
fix(enrichment): preserve memory enrichment table state on reload#25547esensar wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6583fd70e4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if diff.contains(&input.component) | ||
| || diff.is_changed(key) | ||
| || inputs_to_add.contains(&input) |
There was a problem hiding this comment.
Replace paused inputs instead of adding duplicates
When a regular sink or transform is changed but keeps the same unchanged upstream input, shutdown_diff pauses the old fanout entry rather than removing it; with this new diff.is_changed(key) branch, reconnecting now sends ControlMessage::Add instead of Replace. The fanout still contains the paused component key, so Fanout::add asserts on a duplicate output id and the topology can panic during a config reload of any changed component whose input did not change.
Useful? React with 👍 / 👎.
| if !self.diff.enrichment_tables.is_added(name) | ||
| && let Some(existing_table) = ENRICHMENT_TABLES.get(&table_name) | ||
| && existing_table.stateful() | ||
| && table.stateful() | ||
| { | ||
| match table.take_state(existing_table) { |
There was a problem hiding this comment.
Preserve state in the memory config cache too
For a changed memory enrichment table, this transfers the old handles only into the boxed table that will be inserted into ENRICHMENT_TABLES; however MemoryConfig::get_or_build_memory has already cached the freshly built empty Memory, and the rebuilt sink/source later clone that cached instance. After such a reload, transforms read the preserved registry table while the new memory sink writes to a different empty table, so post-reload inserts are no longer visible to enrichment lookups.
Useful? React with 👍 / 👎.
|
Thanks @esensar! Per our new policy I will come back to this once codex comments are resolved. |
Summary
While working on #25143, it was brought to my attention that reload was not handled for memory tables (#25143 (comment)) - that is, new components were generated, that were not attached to the tables that were queried. This PR resolves that by making these tables take over the state of previous components.
Vector configuration
How did you test this PR?
Ran vector with the above configuration and the
--watch-configflag. Changed TTL a couple of times and Vector properly reloaded and kept state, observed by seeing cached output data, instead of newly generated.Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changeloglabel to this PR.References
Notes
@vectordotdev/vectorto reach out to us regarding this PR.pre-pushhook, please see this template.make fmtmake check-clippy(if there are failures it's possible some of them can be fixed withmake clippy-fix)make testgit merge origin masterandgit push.Cargo.lock), pleaserun
make build-licensesto regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.