Skip to content

Fix NullPointerException on concurrent GenericContainer.stop() calls#11914

Open
nikita-kibitkin wants to merge 1 commit into
testcontainers:mainfrom
nikita-kibitkin:fix-concurrent-stop-npe-11492
Open

Fix NullPointerException on concurrent GenericContainer.stop() calls#11914
nikita-kibitkin wants to merge 1 commit into
testcontainers:mainfrom
nikita-kibitkin:fix-concurrent-stop-npe-11492

Conversation

@nikita-kibitkin

Copy link
Copy Markdown

Fixes #11492

stop() reads the mutable containerId field twice: in the null guard and again when calling ResourceReaper.stopAndRemoveContainer(). The only null write is stop()'s own finally block, so a concurrent stop() that completes between the two reads (the Trino CI case: cleanup racing after a failed startup), or a containerIsStopping() hook that reenters stop(), makes the second read pass null to the reaper and crash with the reported NullPointerException in ConcurrentHashMap.remove(). The fix reads the container state into locals once, so null can never reach the reaper from stop().

This is deliberately fixed at the caller rather than with a null guard inside ResourceReaper.stopAndRemoveContainer() (suggested earlier in the issue): the reaper methods are deprecated public API with a non-null contract, and a guard there would hide caller bugs instead of fixing this one. The duplicate removal attempt by the losing thread is already tolerated - removeContainer() swallows NotFoundException. stop() intentionally stays non-atomic: exactly-once semantics would require holding a lock across Docker calls and user hooks, and the duplicate path is benign.

Both race variants have deterministic tests: a latch-based concurrent test and a reentrant-hook test. Both fail on main with the exact NPE from the issue and pass with the fix; full GenericContainerTest is green; spotless applied.

stop() re-read the mutable containerId field after the null guard, so a
concurrent stop() that completes in between, or a containerIsStopping()
hook that reenters stop(), made the second read pass null to
ResourceReaper.stopAndRemoveContainer, crashing with NullPointerException
in ConcurrentHashMap.remove. Read the container state into locals once
instead; the duplicate removal attempt is already tolerated by
ResourceReaper (NotFoundException is swallowed).

Fixes testcontainers#11492
@nikita-kibitkin nikita-kibitkin requested a review from a team as a code owner July 4, 2026 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: NullPointerException from ResourceReaper in GenericContainer.stop

1 participant