RFR: 8273790: Potential cyclic dependencies between Gregorian and CalendarSystem
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790? As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case: - Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock. The linked JBS issue has a thread dump which shows this in action. The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls) A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore. ------------- Commit messages: - 8273790: Potential cyclic dependencies between Gregorian and CalendarSystem Changes: https://git.openjdk.java.net/jdk/pull/5683/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=5683&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8273790 Stats: 185 lines in 2 files changed: 181 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5683.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5683/head:pull/5683 PR: https://git.openjdk.java.net/jdk/pull/5683
On Fri, 24 Sep 2021 14:36:07 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Thanks, Jaikiran. Looks good. Some minor comments. src/java.base/share/classes/sun/util/calendar/CalendarSystem.java line 123:
121: */ 122: public static Gregorian getGregorianCalendar() { 123: var gCal = GREGORIAN_INSTANCE;
Do we need the local variable `gCal`? test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 43:
41: * @run main/othervm CalendarSystemDeadLockTest 42: * @run main/othervm CalendarSystemDeadLockTest 43: * @run main/othervm CalendarSystemDeadLockTest
Just curious, before the fix, how many test instances caused the deadlock? I'd think these 5 runs are arbitrary numbers, Just wanted to have those 5 runs are appropriate. test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 75:
73: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); 74: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); 75: final ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
Asserting `tasks.size() == numTasks` may help here. test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 171:
169: } 170: } 171: }
Need a new line at the EOF. ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
On Fri, 24 Sep 2021 17:53:03 GMT, Naoto Sato <naoto@openjdk.org> wrote:
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
src/java.base/share/classes/sun/util/calendar/CalendarSystem.java line 123:
121: */ 122: public static Gregorian getGregorianCalendar() { 123: var gCal = GREGORIAN_INSTANCE;
Do we need the local variable `gCal`?
This was there to avoid additional volatile reads in that method. A performance optimization. However, with the change Roger suggested, this is no longer relevant.
test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 43:
41: * @run main/othervm CalendarSystemDeadLockTest 42: * @run main/othervm CalendarSystemDeadLockTest 43: * @run main/othervm CalendarSystemDeadLockTest
Just curious, before the fix, how many test instances caused the deadlock? I'd think these 5 runs are arbitrary numbers, Just wanted to have those 5 runs are appropriate.
Hello @naotoj, On my setup, without the fix, the test deadlocks almost always right on the first run. There have been cases where it did pass the first time, but running it a second time has always reproduced the failure. The 5 runs that I have in this test is indeed an arbitrary number. Given how quickly this test completes, I decided to use a slightly higher number of 5 instead of maybe 2 or 3. Do you think, we should change the run count to something else?
test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 75:
73: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); 74: tasks.add(new GetGregorianCalTask(taskTriggerLatch)); 75: final ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
Asserting `tasks.size() == numTasks` may help here.
Yes, that makes sense. I've updated the test to add this check.
test/jdk/sun/util/calendar/CalendarSystemDeadLockTest.java line 171:
169: } 170: } 171: }
Need a new line at the EOF.
Done. I've updated this in the latest version of the PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
On Fri, 24 Sep 2021 14:36:07 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
As an alternative, can the Gregorian Instance be moved to a nested (static) class. That will delay initialization until it is needed. This "holder" pattern is used elsewhere to defer initialization and break cycles. ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision: - Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/5683/files - new: https://git.openjdk.java.net/jdk/pull/5683/files/8855f910..8b276d4d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=5683&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=5683&range=00-01 Stats: 22 lines in 2 files changed: 8 ins; 10 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/5683.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5683/head:pull/5683 PR: https://git.openjdk.java.net/jdk/pull/5683
On Sat, 25 Sep 2021 03:38:24 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
Hello Roger,
As an alternative, can the Gregorian Instance be moved to a nested (static) class. That will delay initialization until it is needed. This "holder" pattern is used elsewhere to defer initialization and break cycles.
I did indeed have that in mind when I started work on this. That was something Chris Hegarty had suggested and we have used in a different (but similar) issue a while back[1]. I was however unsure if that's a common enough technique, so had started off with the volatile approach. I've now updated the PR to use the holder technique instead. [1] https://github.com/openjdk/jdk/pull/2893#issuecomment-797539503 ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
On Sat, 25 Sep 2021 03:38:24 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
@jaikiran Thanks for fixing this. Delaying instance creation via a static holder class seems reasonable to me. ------------- Marked as reviewed by yyang (Committer). PR: https://git.openjdk.java.net/jdk/pull/5683
On Sat, 25 Sep 2021 03:38:24 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
Looks good. Thank you for the fix! ------------- Marked as reviewed by naoto (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5683
On Sat, 25 Sep 2021 03:38:24 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
👍 ------------- Marked as reviewed by rriggs (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5683
On Sat, 25 Sep 2021 03:38:24 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
Jaikiran Pai has updated the pull request incrementally with two additional commits since the last revision:
- Minor test adjustments as suggested by Naoto - use a holder instead of volatile, as suggested by Roger
Thank you Roger, Naoto and Yi Yang for the reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
On Fri, 24 Sep 2021 14:36:07 GMT, Jaikiran Pai <jpai@openjdk.org> wrote:
Can I please get a review for this change which proposes to fix the issue reported in https://bugs.openjdk.java.net/browse/JDK-8273790?
As noted in that issue, trying to class load `sun.util.calendar.CalendarSystem` and `sun.util.calendar.Gregorian` concurrently in separate threads can lead to a deadlock because of the cyclic nature of the code in their static initialization. More specifically, consider this case:
- Thread T1 initiates a classload on the `sun.util.calendar.CalendarSystem`. - This gives T1 the implicit class init lock on `CalendarSystem`. - Consider thread T2 which at the same time initiates a classload on `sun.util.calendar.Gregorian` class. - This gives T2 a implicit class init lock on `Gregorian`. - T1, still holding a lock on `CalendarSystem` attempts to load `Gregorian` since it wants to create a (singleton) instance of `Gregorian` and assign it to the `static final GREGORIAN_INSTANCE` member. Since T2 is holding a class init lock on `Gregorian`, T1 ends up waiting - T2 on the other hand is still loading the `Gregorian` class. `Gregorian` itself "is a" `CalendarSystem`, so during this loading of `Gregorian` class, T2 starts travelling up the class hierarchy and asks for a lock on `CalendarSystem`. However T1 is holding this lock and as a result T2 ends up waiting on T1 which is waiting on T2. That triggers this deadlock.
The linked JBS issue has a thread dump which shows this in action.
The commit here delays the instance creation of `Gregorian` by moving that instance creation logic from the static initialization of the `CalendarSystem` class, to the first call to `CalendarSystem#getGregorianCalendar()`. This prevents the `CalendarSystem` from needing a lock on `Gregorian` during its static init (of course, unless some code in this static init flow calls `CalendarSystem#getGregorianCalendar()`, in which case it is back to square one. I have verified, both manually and through the jtreg test, that the code in question doesn't have such calls)
A new jtreg test has been introduced to reproduce the issue and verify the fix. The test in addition to loading these 2 classes in question, also additionally loads a few other classes concurrently. These classes have specific static initialization which leads the calls to `CalendarSystem#getGregorianCalendar()` or `CalendarSystem#forName()`. Including these classes in the tests ensures that this deadlock hasn't "moved" to a different location. I have run multiple runs (approximately 25) of this test with the fix and I haven't seen it deadlock anymore.
This pull request has now been integrated. Changeset: ddc26274 Author: Jaikiran Pai <jpai@openjdk.org> URL: https://git.openjdk.java.net/jdk/commit/ddc262746aea99050b9a6484f51c7ddb8f4b... Stats: 183 lines in 2 files changed: 179 ins; 0 del; 4 mod 8273790: Potential cyclic dependencies between Gregorian and CalendarSystem Reviewed-by: naoto, yyang, rriggs ------------- PR: https://git.openjdk.java.net/jdk/pull/5683
participants (4)
-
Jaikiran Pai
-
Naoto Sato
-
Roger Riggs
-
Yi Yang