Skip to content

Fix DependencyLayers race condition crash on missing dependencies#3790

Merged
robacourt merged 3 commits intomainfrom
rob/fix-dependency-layers-race
Jan 28, 2026
Merged

Fix DependencyLayers race condition crash on missing dependencies#3790
robacourt merged 3 commits intomainfrom
rob/fix-dependency-layers-race

Conversation

@robacourt
Copy link
Copy Markdown
Contributor

@robacourt robacourt commented Jan 28, 2026

Summary

Fixes #3788 — when a dependency shape is removed (due to crash/cleanup) before its dependent shape finishes registering, DependencyLayers.add_after_dependencies/3 crashed with FunctionClauseError, taking down the ShapeLogCollector and cascading into OOM failures.

This is the companion fix to #3789, which addressed the Materializer duplicate-key crash that triggered the dependency removal in the first place.

The race condition

  1. Dependency shape A is added to DependencyLayers
  2. Dependent shape B starts its Consumer, checks all_materializers_alive? → passes
  3. Shape A's Materializer crashes (e.g. duplicate key, fixed by Fix Materializer to handle duplicate records idempotently #3789)
  4. Shape A's Consumer terminates and calls remove_shape
  5. Shape B's Consumer calls add_shape
  6. RequestBatcher batches both operations together
  7. ShapeLogCollector processes the batch — removals first, then additions
  8. Shape A is removed from layers → layers becomes [[]]
  9. Shape B tries to add with dependency on A → CRASH

ECS logs from production (27 Jan 2026)

18:10:10.437 [error] GenServer Materializer "97489818-..." terminating
** (RuntimeError) Key "public"."offers"/"d3c8d8a5-..." already exists

18:10:10.449 [error] GenServer ShapeLogCollector terminating
** (FunctionClauseError) no function clause matching in Electric.Shapes.DependencyLayers.add_after_dependencies/3
    (electric) lib/electric/shapes/dependency_layers.ex
    args: [[], "30042127-...", MapSet.new(["97489818-..."])]

The 12ms gap between crashes shows the sequence: Materializer crash → Consumer termination → remove_shape batched with add_shape → removal processed first → crash on addition.

Changes

  • DependencyLayers.add_dependency/3 now returns {:ok, layers} or {:error, {:missing_dependencies, missing}} instead of crashing
  • add_after_dependencies/3 gets a new clause for exhausted layers with unfound dependencies
  • ShapeLogCollector handles the error gracefully — logs a warning and fails the shape registration instead of crashing
  • handle_continue(:restore_shapes) pattern-matches {:ok, layers} (crashes on error, which is correct since restored shapes should always have valid dependencies)

Test plan

  • Updated existing DependencyLayersTest tests for new return type
  • Added error handling tests: missing dependency, removed dependency, partially missing dependencies, ok-tuple for no-dependency shapes
  • All 8 dependency layers tests pass
  • All 24 shape log collector tests pass

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed a race condition that caused crashes when shape dependencies were unavailable during registration or restoration
    • Improved error handling for missing dependencies with graceful recovery instead of system failures

✏️ Tip: You can customize this high-level summary in your review settings.

…removed

When a dependency shape's materializer crashes and is removed while a
dependent shape is being registered, add_after_dependencies/3 would crash
with FunctionClauseError due to a missing clause. This cascaded into
ShapeLogCollector termination and OOM failures.

add_dependency/3 now returns {:ok, layers} | {:error, {:missing_dependencies, missing}},
and ShapeLogCollector handles the error gracefully.

Fixes #3788

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 28, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • 🔍 Trigger a full review
📝 Walkthrough

Walkthrough

This patch fixes a race condition in dependency layer management where a shape dependency can be removed before its dependent shape is registered, previously causing a FunctionClauseError crash. The system now returns error tuples for missing dependencies instead of crashing, enabling graceful error handling.

Changes

Cohort / File(s) Summary
Changeset Documentation
\.changeset/fix-dependency-layers-race\.md
Documents the patch for the dependency layering race condition, clarifying the new success/error return shape and runtime impact.
Dependency Layer Logic
packages/sync-service/lib/electric/shapes/dependency_layers\.ex
Modified add_dependency/3 and add_after_dependencies/3 to return {:ok, layers} or {:error, {:missing_dependencies, missing}} tuples instead of bare lists. Added new function clause to handle missing dependencies explicitly.
Error Handling Integration
packages/sync-service/lib/electric/replication/shape_log_collector\.ex
Updated shape restoration and registration flows to handle dependency resolution errors gracefully. Missing dependencies now log a warning and mark shapes with error status instead of crashing the collector.
Test Coverage
packages/sync-service/test/electric/shapes/dependency_layers_test\.exs
Introduced private helper add_dependency!/3 and new "error handling" test suite covering missing dependencies, removal-before-addition race, and partial dependency scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • icehaunter
  • magnetised

Poem

🐰 A race we've caught, a crash we've fixed,
Dependencies mixed, now error-picked!
No more does the system fall and break,
When shapes are removed for removal's sake. 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately and concisely summarizes the main change: fixing a race condition in DependencyLayers that causes crashes when dependencies are missing.
Linked Issues check ✅ Passed The code changes fully address all requirements from #3788: add_dependency now returns error tuples for missing dependencies, add_after_dependencies handles exhausted layers with missing deps, and ShapeLogCollector gracefully handles missing-dependencies errors.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing the race condition described in #3788. File modifications address dependency layer error handling, shape log collector integration, and comprehensive test coverage for the new behavior.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Jan 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.34%. Comparing base (2f11f91) to head (9595ba6).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3790       +/-   ##
===========================================
+ Coverage   75.75%   87.34%   +11.58%     
===========================================
  Files          11       23       +12     
  Lines         693     2039     +1346     
  Branches      172      542      +370     
===========================================
+ Hits          525     1781     +1256     
- Misses        167      256       +89     
- Partials        1        2        +1     
Flag Coverage Δ
packages/experimental 87.73% <ø> (ø)
packages/react-hooks 86.48% <ø> (ø)
packages/start 82.83% <ø> (ø)
packages/typescript-client 93.31% <ø> (?)
packages/y-electric 56.05% <ø> (ø)
typescript 87.34% <ø> (+11.58%) ⬆️
unit-tests 87.34% <ø> (+11.58%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@robacourt
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/sync-service/lib/electric/shapes/dependency_layers.ex`:
- Around line 29-36: The guard in the second clause of add_after_dependencies is
using map_size(deps_to_find) but deps_to_find is a MapSet; change the guard to
use MapSet.size/1 so the pattern match works correctly (i.e., replace
map_size(deps_to_find) == 0 with MapSet.size(deps_to_find) == 0 in the
add_after_dependencies/3 clause that handles the empty list case).
🧹 Nitpick comments (1)
packages/sync-service/test/electric/shapes/dependency_layers_test.exs (1)

260-267: Prefer a single structural assert for the ok tuple and layers.

This can be expressed as one pattern‑matching assert for clearer intent.

♻️ Suggested refactor
-      assert {:ok, layers} =
-               DependencyLayers.new()
-               |> DependencyLayers.add_dependency(shape, `@inner_shape_handle`)
-
-      assert layers == [MapSet.new([`@inner_shape_handle`])]
+      assert {:ok, [^MapSet.new([`@inner_shape_handle`])]} =
+               DependencyLayers.new()
+               |> DependencyLayers.add_dependency(shape, `@inner_shape_handle`)
As per coding guidelines: Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f11f91 and 8fb1129.

📒 Files selected for processing (4)
  • .changeset/fix-dependency-layers-race.md
  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
  • packages/sync-service/lib/electric/shapes/dependency_layers.ex
  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧰 Additional context used
📓 Path-based instructions (3)
packages/sync-service/lib/electric/shapes/**/*.{ex,exs}

📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)

Shapes should be defined and managed in lib/electric/shapes/* for lifecycle management (creation, handles, checkpoints)

Files:

  • packages/sync-service/lib/electric/shapes/dependency_layers.ex
packages/sync-service/lib/electric/replication/**/*.{ex,exs}

📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)

Replication logic consuming WAL and producing change events should be implemented in lib/electric/replication/*

Files:

  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
packages/sync-service/test/**/*.{exs,ex}

📄 CodeRabbit inference engine (packages/sync-service/AGENTS.md)

Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements

Files:

  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shapes/**/*.{ex,exs} : Shapes should be defined and managed in `lib/electric/shapes/*` for lifecycle management (creation, handles, checkpoints)
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shape_cache/**/*.{ex,exs} : Shape cache persistence and snapshot/changelog serving logic should be in `lib/electric/shape_cache/*`

Applied to files:

  • .changeset/fix-dependency-layers-race.md
  • packages/sync-service/lib/electric/shapes/dependency_layers.ex
  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/shapes/**/*.{ex,exs} : Shapes should be defined and managed in `lib/electric/shapes/*` for lifecycle management (creation, handles, checkpoints)

Applied to files:

  • .changeset/fix-dependency-layers-race.md
  • packages/sync-service/lib/electric/shapes/dependency_layers.ex
  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/application.ex : Changes to `Electric.Application` APIs (`api/1`, `api_plug_opts/1`, `configuration/1`) can break downstream consumers including elixir-client and Phoenix.Sync

Applied to files:

  • packages/sync-service/lib/electric/shapes/dependency_layers.ex
  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/replication/**/*.{ex,exs} : Replication logic consuming WAL and producing change events should be implemented in `lib/electric/replication/*`

Applied to files:

  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/lib/electric/plug/**/*.{ex,exs} : HTTP API endpoints should be located in `lib/electric/plug/*` and serve shape requests and health checks

Applied to files:

  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/{lib/electric/postgres/**/*.{ex,exs},lib/pg_interop/**/*.{ex,exs}} : Postgres integration including connections, identifiers, and type handling should be in `lib/electric/postgres/*` and `lib/pg_interop/*`

Applied to files:

  • packages/sync-service/lib/electric/replication/shape_log_collector.ex
📚 Learning: 2026-01-22T19:39:13.351Z
Learnt from: CR
Repo: electric-sql/electric PR: 0
File: packages/sync-service/AGENTS.md:0-0
Timestamp: 2026-01-22T19:39:13.351Z
Learning: Applies to packages/sync-service/test/**/*.{exs,ex} : Prefer a single structural assert with pattern matching for returned structures in tests rather than multiple asserts on individual elements

Applied to files:

  • packages/sync-service/test/electric/shapes/dependency_layers_test.exs
🧬 Code graph analysis (2)
packages/sync-service/lib/electric/replication/shape_log_collector.ex (2)
packages/sync-service/lib/electric/shapes/dependency_layers.ex (1)
  • add_dependency (6-14)
packages/sync-service/lib/electric/replication/shape_log_collector/request_batcher.ex (2)
  • add_shape (58-60)
  • add_shape (63-65)
packages/sync-service/test/electric/shapes/dependency_layers_test.exs (1)
packages/sync-service/lib/electric/shapes/dependency_layers.ex (2)
  • add_dependency (6-14)
  • remove_dependency (39-42)
🔇 Additional comments (7)
.changeset/fix-dependency-layers-race.md (1)

5-9: Changelog clarity looks good.

The changeset concisely captures the race condition and the new error handling contract.

packages/sync-service/lib/electric/shapes/dependency_layers.ex (2)

16-25: Good error propagation through recursion.

The new case statement cleanly preserves {:error, ...} while rebuilding layers on success.


6-13: Fix the guard clause on line 41: use MapSet.size instead of map_size.

All call sites correctly handle the {:ok, layers}/{:error, ...} return type. However, line 41 contains a bug: map_size(deps_to_find) is called on a MapSet (created on line 11 with MapSet.new(dependency_handles)), which will cause a FunctionClauseError at runtime. Use MapSet.size(deps_to_find) instead.

Likely an incorrect or invalid review comment.

packages/sync-service/test/electric/shapes/dependency_layers_test.exs (2)

23-26: Helper simplifies test setup.

The add_dependency!/3 helper improves readability and keeps assertions consistent.


196-258: Good coverage of missing‑dependency scenarios.

These tests exercise the new error paths and removal‑before‑add race.

packages/sync-service/lib/electric/replication/shape_log_collector.ex (2)

254-263: Restore path enforcement aligns with intent.

Crashing on missing dependencies during restore keeps persisted state integrity strict.


362-389: Graceful handling of missing dependencies is well‑scoped.

Warn‑and‑fail behavior for registration avoids collector crashes while keeping state consistent.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +256 to +258
# Crash if dependencies are missing during restore - all persisted shapes
# should have their dependencies present since they were valid when saved.
{:ok, layers} = DependencyLayers.add_dependency(layers, shape, shape_handle)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a valid assumption?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say yes but I'd prefer a different behaviour - there are cases where one shape might not get restored for one reason or another, and it seems like we should be dropping their children subtree, not saying that all persisted shapes have failed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea @icehaunter . done

@robacourt robacourt requested a review from magnetised January 28, 2026 15:19
Copy link
Copy Markdown
Contributor

@icehaunter icehaunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if you address that comment

robacourt and others added 2 commits January 28, 2026 15:31
When a parent shape fails to restore for some reason, gracefully skip
its children rather than crashing. This allows partial restoration to
succeed while logging warnings for the skipped shapes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@blacksmith-sh

This comment has been minimized.

@robacourt robacourt merged commit 2a0902e into main Jan 28, 2026
50 of 52 checks passed
@robacourt robacourt deleted the rob/fix-dependency-layers-race branch January 28, 2026 16:42
@github-actions
Copy link
Copy Markdown
Contributor

This PR has been released! 🚀

The following packages include changes from this PR:

  • @core/sync-service@1.4.2

Thanks for contributing to Electric!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DependencyLayers crashes when dependency shape is removed before dependent shape is added

2 participants