Skip to content

fix(replication): distribute destination omni among the subscribers#968

Merged
meskill merged 2 commits into
push-tmpxoryqppmpfrom
fix/replication-distribute
May 20, 2026
Merged

fix(replication): distribute destination omni among the subscribers#968
meskill merged 2 commits into
push-tmpxoryqppmpfrom
fix/replication-distribute

Conversation

@meskill
Copy link
Copy Markdown
Contributor

@meskill meskill commented May 7, 2026

Attempt to fix issue with test #924 by distributing the destination shards between streams, so they don't overlap and therefore can't deadlock each other. At the cost of consistency in case input omni tables are not equal in the content or the definition of the table changes from source to destination.

@meskill meskill force-pushed the push-tmpxoryqppmp branch from d395b91 to 2cc4e22 Compare May 10, 2026 21:13
@meskill meskill force-pushed the fix/replication-distribute branch from 2e9657c to c2a94ec Compare May 10, 2026 21:14
@blacksmith-sh

This comment has been minimized.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 10, 2026

Codecov Report

❌ Patch coverage is 96.62162% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...nd/replication/logical/publisher/publisher_impl.rs 0.00% 3 Missing ⚠️
...d/replication/logical/subscriber/omni_ownership.rs 98.63% 1 Missing ⚠️
...c/backend/replication/logical/subscriber/stream.rs 87.50% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@meskill meskill marked this pull request as ready for review May 10, 2026 22:52
Copy link
Copy Markdown
Collaborator

@levkk levkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my preferred solution for sure. It doesn't sacrifice performance and fixes the original deadlock issue.

One thing to double check is that we send the WAL acknowledgement back to the source for data received via logical replication for omni tables that the connection "doesn't own"; I think this is not critical since other tables will receive writes too and that will move the WAL forward on the source, but it could be possible to have omni-only sharded clusters, theoretically.

@meskill
Copy link
Copy Markdown
Contributor Author

meskill commented May 19, 2026

One thing to double check is that we send the WAL acknowledgement back to the source for data received via logical replication for omni tables that the connection "doesn't own"; I think this is not critical since other tables will receive writes too and that will move the WAL forward on the source, but it could be possible to have omni-only sharded clusters, theoretically.

Should be ok, since the changes are made on the send level and they do not affect neither how acknowledgement to source is sent nor the lsn status update

@meskill meskill force-pushed the push-tmpxoryqppmp branch from 0361bc0 to 1cd4312 Compare May 19, 2026 18:27
@meskill meskill force-pushed the fix/replication-distribute branch from 20f7425 to c2c68c8 Compare May 19, 2026 21:58
@meskill meskill force-pushed the push-tmpxoryqppmp branch from 1cd4312 to 14ffc63 Compare May 19, 2026 23:01
@meskill meskill force-pushed the fix/replication-distribute branch from c2c68c8 to 3bf9f0b Compare May 19, 2026 23:02
@meskill meskill merged commit 09beb61 into push-tmpxoryqppmp May 20, 2026
24 checks passed
@meskill meskill deleted the fix/replication-distribute branch May 20, 2026 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants