RFR: 8348907: Stress times out when is executed with ZGC
Markus Grönlund
mgronlun at openjdk.org
Mon Mar 24 22:33:23 UTC 2025
Greetings,
Here is a suggested solution to the intricate deadlock issues involving virtual threads, ZGC load barriers, and JFR.
A JFR event can be allocated and committed in specific sensitive contexts, such as inside mutex-protected load barriers. If the thread is a virtual thread, JFR determines its thread name by loading the oop from the thread (jt->vthread()) as part of the event commit.
This operation again triggers the load barrier, which contains a non-reentrant lock, effectively deadlocking the thread with itself.
So, for specific sensitive event sites, JFR mustn't recurse or reenter into the same event site as part of the event commit.
After a few iterations and prototypes, which failed because they eventually ended up touching some oop, I came up with the following.
From a user perspective, an event (site) can now be marked as "non-reentrant" by wrapping it in a helper class.
The wrapper guarantees that JFR will not reenter the site as part of event.commit().
The tradeoff is that we cannot write the virtual thread name for these sensitive event sites; we will instead report "" as the virtual thread name, which is the default virtual thread name in Java. All other information about the thread, such as the thread ID and whether it is a virtual thread, will still be reported.
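To make the idea concrete, here is a minimal single-threaded sketch of the pattern, not the code in the patch. Every name in it (NonReentrantScope, in_sensitive_site, commit_event, load_name_through_barrier, barrier_lock_held) is hypothetical and chosen only for illustration. A scoped helper marks the current thread as being inside a sensitive site; while the mark is set, the commit path skips the name lookup that would reenter the barrier and records "" instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// All identifiers below are hypothetical; they are not the names used in the patch.

struct Event {
    long thread_id;
    std::string thread_name;
};

static std::vector<Event> recorded;

// True while the current thread is inside a sensitive (non-reentrant) event site.
static thread_local bool in_sensitive_site = false;

// Models the non-reentrant lock inside the load barrier (single-threaded model).
static bool barrier_lock_held = false;

// Scoped helper: marks the enclosed event site as non-reentrant and
// restores the previous state when the scope ends.
class NonReentrantScope {
    bool _previous;
public:
    NonReentrantScope() : _previous(in_sensitive_site) { in_sensitive_site = true; }
    ~NonReentrantScope() { in_sensitive_site = _previous; }
    NonReentrantScope(const NonReentrantScope&) = delete;
    NonReentrantScope& operator=(const NonReentrantScope&) = delete;
};

std::string load_name_through_barrier();

// Commit path: resolving the thread name goes through the barrier,
// so it is skipped when the caller is already inside a sensitive site.
void commit_event(long thread_id) {
    Event e{thread_id, ""};  // "" is the fallback name
    if (!in_sensitive_site) {
        e.thread_name = load_name_through_barrier();
    }
    recorded.push_back(e);
}

// Stand-in for loading the name oop: takes the barrier lock and emits
// an event of its own from inside the locked region.
std::string load_name_through_barrier() {
    assert(!barrier_lock_held && "barrier lock re-acquired by its owner");
    barrier_lock_held = true;
    {
        NonReentrantScope scope;  // without this, commit_event would reenter the barrier
        commit_event(42);
    }
    barrier_lock_held = false;
    return "worker";
}
```

Calling commit_event(7) from ordinary code records two events: the one emitted inside the barrier carries the empty name, and the outer one carries the resolved name. Removing the NonReentrantScope line makes the inner commit call back into the barrier while the lock is held, which is the self-deadlock described above (the sketch reports it through the assertion rather than hanging).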
I believe it is a reasonable tradeoff and a general solution for sensitive JFR event sites, which are rare in practice, with minimal impact on event programming.
Testing: jdk_jfr, stress testing
Let me know what you think.
Thanks
Markus
-------------
Commit messages:
- 8348907
Changes: https://git.openjdk.org/jdk/pull/24209/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24209&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8348907
Stats: 160 lines in 10 files changed: 139 ins; 12 del; 9 mod
Patch: https://git.openjdk.org/jdk/pull/24209.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/24209/head:pull/24209
PR: https://git.openjdk.org/jdk/pull/24209