Fix DependencyLayers race condition crash on missing dependencies#3790
Fix DependencyLayers race condition crash on missing dependencies#3790
Conversation
…removed
When a dependency shape's materializer crashes and is removed while a
dependent shape is being registered, add_after_dependencies/3 would crash
with FunctionClauseError due to a missing clause. This cascaded into
ShapeLogCollector termination and OOM failures.
add_dependency/3 now returns {:ok, layers} | {:error, {:missing_dependencies, missing}},
and ShapeLogCollector handles the error gracefully.
Fixes #3788
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
📝 WalkthroughWalkthroughThis patch fixes a race condition in dependency layer management where a shape dependency can be removed before its dependent shape is registered, previously causing a FunctionClauseError crash. The system now returns error tuples for missing dependencies instead of crashing, enabling graceful error handling. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3790 +/- ##
===========================================
+ Coverage 75.75% 87.34% +11.58%
===========================================
Files 11 23 +12
Lines 693 2039 +1346
Branches 172 542 +370
===========================================
+ Hits 525 1781 +1256
- Misses 167 256 +89
- Partials 1 2 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/sync-service/lib/electric/shapes/dependency_layers.ex`:
- Around line 29-36: The guard in the second clause of add_after_dependencies is
using map_size(deps_to_find) but deps_to_find is a MapSet; change the guard to
use MapSet.size/1 so the pattern match works correctly (i.e., replace
map_size(deps_to_find) == 0 with MapSet.size(deps_to_find) == 0 in the
add_after_dependencies/3 clause that handles the empty list case).
🧹 Nitpick comments (1)
packages/sync-service/test/electric/shapes/dependency_layers_test.exs (1)
260-267: Prefer a single structural assert for the ok tuple and layers.This can be expressed as one pattern‑matching assert for clearer intent.
As per coding guidelines: Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements.♻️ Suggested refactor
- assert {:ok, layers} = - DependencyLayers.new() - |> DependencyLayers.add_dependency(shape, `@inner_shape_handle`) - - assert layers == [MapSet.new([`@inner_shape_handle`])] + assert {:ok, [^MapSet.new([`@inner_shape_handle`])]} = + DependencyLayers.new() + |> DependencyLayers.add_dependency(shape, `@inner_shape_handle`)
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.changeset/fix-dependency-layers-race.mdpackages/sync-service/lib/electric/replication/shape_log_collector.expackages/sync-service/lib/electric/shapes/dependency_layers.expackages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧰 Additional context used
📓 Path-based instructions (3)
packages/sync-service/lib/electric/shapes/**/*.{ex,exs}
📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)
Shapes should be defined and managed in
lib/electric/shapes/*for lifecycle management (creation, handles, checkpoints)
Files:
packages/sync-service/lib/electric/shapes/dependency_layers.ex
packages/sync-service/lib/electric/replication/**/*.{ex,exs}
📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)
Replication logic consuming WAL and producing change events should be implemented in
lib/electric/replication/*
Files:
packages/sync-service/lib/electric/replication/shape_log_collector.ex
packages/sync-service/test/**/*.{exs,ex}
📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)
Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements
Files:
packages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shapes/**/*.{ex,exs} : Shapes should be defined and managed in `lib/electric/shapes/*` for lifecycle management (creation, handles, checkpoints)
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shape_cache/**/*.{ex,exs} : Shape cache persistence and snapshot/changelog serving logic should be in `lib/electric/shape_cache/*`
Applied to files:
.changeset/fix-dependency-layers-race.mdpackages/sync-service/lib/electric/shapes/dependency_layers.expackages/sync-service/lib/electric/replication/shape_log_collector.expackages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shapes/**/*.{ex,exs} : Shapes should be defined and managed in `lib/electric/shapes/*` for lifecycle management (creation, handles, checkpoints)
Applied to files:
.changeset/fix-dependency-layers-race.mdpackages/sync-service/lib/electric/shapes/dependency_layers.expackages/sync-service/lib/electric/replication/shape_log_collector.expackages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/application.ex : Changes to `Electric.Application` APIs (`api/1`, `api_plug_opts/1`, `configuration/1`) can break downstream consumers including elixir-client and Phoenix.Sync
Applied to files:
packages/sync-service/lib/electric/shapes/dependency_layers.expackages/sync-service/lib/electric/replication/shape_log_collector.expackages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/replication/**/*.{ex,exs} : Replication logic consuming WAL and producing change events should be implemented in `lib/electric/replication/*`
Applied to files:
packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/plug/**/*.{ex,exs} : HTTP API endpoints should be located in `lib/electric/plug/*` and serve shape requests and health checks
Applied to files:
packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/{lib/electric/postgres/**/*.{ex,exs},lib/pg_interop/**/*.{ex,exs}} : Postgres integration including connections, identifiers, and type handling should be in `lib/electric/postgres/*` and `lib/pg_interop/*`
Applied to files:
packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/test/**/*.{exs,ex} : Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements
Applied to files:
packages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧬 Code graph analysis (2)
packages/sync-service/lib/electric/replication/shape_log_collector.ex (2)
packages/sync-service/lib/electric/shapes/dependency_layers.ex (1)
add_dependency(6-14)packages/sync-service/lib/electric/replication/shape_log_collector/request_batcher.ex (2)
add_shape(58-60)add_shape(63-65)
packages/sync-service/test/electric/shapes/dependency_layers_test.exs (1)
packages/sync-service/lib/electric/shapes/dependency_layers.ex (2)
add_dependency(6-14)remove_dependency(39-42)
🔇 Additional comments (7)
.changeset/fix-dependency-layers-race.md (1)
5-9: Changelog clarity looks good.The changeset concisely captures the race condition and the new error handling contract.
packages/sync-service/lib/electric/shapes/dependency_layers.ex (2)
16-25: Good error propagation through recursion.The new case statement cleanly preserves {:error, ...} while rebuilding layers on success.
6-13: Fix the guard clause on line 41: useMapSet.sizeinstead ofmap_size.All call sites correctly handle the
{:ok, layers}/{:error, ...}return type. However, line 41 contains a bug:map_size(deps_to_find)is called on aMapSet(created on line 11 withMapSet.new(dependency_handles)), which will cause aFunctionClauseErrorat runtime. UseMapSet.size(deps_to_find)instead.Likely an incorrect or invalid review comment.
packages/sync-service/test/electric/shapes/dependency_layers_test.exs (2)
23-26: Helper simplifies test setup.The add_dependency!/3 helper improves readability and keeps assertions consistent.
196-258: Good coverage of missing‑dependency scenarios.These tests exercise the new error paths and removal‑before‑add race.
packages/sync-service/lib/electric/replication/shape_log_collector.ex (2)
254-263: Restore path enforcement aligns with intent.Crashing on missing dependencies during restore keeps persisted state integrity strict.
362-389: Graceful handling of missing dependencies is well‑scoped.Warn‑and‑fail behavior for registration avoids collector crashes while keeping state consistent.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| # Crash if dependencies are missing during restore - all persisted shapes | ||
| # should have their dependencies present since they were valid when saved. | ||
| {:ok, layers} = DependencyLayers.add_dependency(layers, shape, shape_handle) |
There was a problem hiding this comment.
Is this a valid assumption?
There was a problem hiding this comment.
I'd say yes but I'd prefer a different behaviour - there are cases where one shape might not get restored for one reason or another, and it seems like we should be dropping their children subtree, not saying that all persisted shapes have failed
icehaunter
left a comment
There was a problem hiding this comment.
LGTM if you address that comment
When a parent shape fails to restore for some reason, gracefully skip its children rather than crashing. This allows partial restoration to succeed while logging warnings for the skipped shapes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
|
This PR has been released! 🚀 The following packages include changes from this PR:
Thanks for contributing to Electric! |
Summary
Fixes #3788 — when a dependency shape is removed (due to crash/cleanup) before its dependent shape finishes registering,
DependencyLayers.add_after_dependencies/3crashed withFunctionClauseError, taking down theShapeLogCollectorand cascading into OOM failures.This is the companion fix to #3789, which addressed the Materializer duplicate-key crash that triggered the dependency removal in the first place.
The race condition
all_materializers_alive?→ passesremove_shapeadd_shapeRequestBatcherbatches both operations togetherShapeLogCollectorprocesses the batch — removals first, then additions[[]]ECS logs from production (27 Jan 2026)
The 12ms gap between crashes shows the sequence: Materializer crash → Consumer termination →
remove_shapebatched withadd_shape→ removal processed first → crash on addition.Changes
DependencyLayers.add_dependency/3now returns{:ok, layers}or{:error, {:missing_dependencies, missing}}instead of crashingadd_after_dependencies/3gets a new clause for exhausted layers with unfound dependenciesShapeLogCollectorhandles the error gracefully — logs a warning and fails the shape registration instead of crashinghandle_continue(:restore_shapes)pattern-matches{:ok, layers}(crashes on error, which is correct since restored shapes should always have valid dependencies)Test plan
DependencyLayersTesttests for new return type🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.