RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating

Albert Mingkun Yang ayang at openjdk.org
Mon May 15 14:50:47 UTC 2023


On Mon, 15 May 2023 14:26:42 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

> The TestCSLocker.java test spawns a thread that grabs the GC locker and then waits for the first thread to run some Java code and signal it to release the GC locker. All of this happens while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires a GC in order to make progress, we end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen: the test performs a println, whose internal implementation allocates, and that allocation requires a GC. I think the test can spuriously fail this way with any GC, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it no longer fails with generational ZGC.

Marked as reviewed by ayang (Reviewer).

test/hotspot/jtreg/gc/cslocker/TestCSLocker.java line 54:

> 52:         // check timeout to success deadlocking
> 53:         while(System.currentTimeMillis() < startTime + timeout) {
> 54:             System.out.println("sleeping...");

I think a comment here would be nice, noting that one cannot run any GC-triggering code (e.g. println) in this loop. It's super tempting to add some innocent debug prints before suspending the current thread when extending or fixing this test case in the future.
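
To make the suggestion concrete, here is a minimal sketch of what such a commented wait loop might look like. It is not the actual patch: the class name `AllocationFreeWait`, the method `sleepUntil`, and the `pollMillis` parameter are hypothetical names introduced only for illustration, and the sketch assumes that `Thread.sleep` on a platform thread does not itself need a Java heap allocation in the normal (uninterrupted) case.

```java
// Hypothetical illustration only; names are not taken from TestCSLocker.java.
final class AllocationFreeWait {
    private AllocationFreeWait() {}

    /**
     * Sleeps in short intervals until the given deadline has passed.
     * Returns true if the deadline was reached, false if interrupted.
     */
    static boolean sleepUntil(long deadlineMillis, long pollMillis) {
        // NOTE: In the real test another thread holds the GC locker while
        // this loop runs. Do not add anything here that may allocate
        // (println, string concatenation, boxing, ...): an allocation can
        // require a GC, the GC is blocked by the locker, and the locker is
        // only released after this loop finishes, so the test deadlocks.
        while (System.currentTimeMillis() < deadlineMillis) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long deadline = System.currentTimeMillis() + 100;
        boolean reached = sleepUntil(deadline, 10);
        // Printing is fine at this point: no GC locker is held in this sketch.
        System.out.println("deadline reached: " + reached);
    }
}
```

Keeping the warning directly above the loop body, rather than in the class header, puts it where a future debug print would most likely be added.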

-------------

PR Review: https://git.openjdk.org/jdk/pull/13989#pullrequestreview-1426741151
PR Review Comment: https://git.openjdk.org/jdk/pull/13989#discussion_r1193953862


More information about the hotspot-gc-dev mailing list