RFR: 8256382: Use try_lock for hs_err EventLog printing
Stefan Karlsson
stefank at openjdk.java.net
Tue Nov 24 14:48:57 UTC 2020
On Tue, 24 Nov 2020 13:31:28 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> The EventLog locks are taken when the hs_err files are generated. Since crashes and asserts can occur when other locks are held, this can cause lock reordering problems if the held locks also are low-rank locks. There's no way to solve this if blocking locks are taken.
>>
>> I hit this problem when investigating making the GCLogPrecious lock use the lowest lock rank (same as EventLog). See JDK-8254877.
>>
>> Both GCLogPrecious and EventLog are considered "leaf" locks. No other locks should be taken when those locks are taken. However, if we crash in either of these sub-systems, there will be a lock-reordering error message in the hs_err file, and the rest of the logged info is skipped in the currently logged section.
>>
>> The proposal is to use try_lock_without_range_check and only log information if the lock could be acquired without blocking. This relies on the new try_lock_without_range_check function from JDK-8255678.
>>
>> I've tested this by injecting crashes while not holding locks in both GCLogPrecious, while holding locks during EventLog logging, and when not holding the locks, and verified that we get the expected behavior.
>>
>> Example output while crashing during 'Internal exceptions' logging:
>> Classes redefined (0 events):
>> No events
>>
>> Internal exceptions (5 events):
>> No events printed - crash while holding lock
>>
>> Events (20 events):
>> Event: 1,437 loading class java/util/HashMap$KeyIterator
>> Event: 1,438 loading class java/util/HashMap$KeyIterator done
>> Event: 1,438 loading class java/lang/module/ModuleDescriptor$Exports
>
> Hi Stefan,
>
> can't we just print out unconditionally if VMError::is_error_reported()? Whats the worst that can happen, we crash? Then we would continue with the next reporting step. But we still have a chance to see the event buffer contents even if the event logger itself crashed or asserted.
>
> Side note, some time ago I rewrote the whole event system: https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-March/033150.html to a much simpler implementation and I think that one would also be pretty safe to get printed even if unlocked since it works on a static pre-allocated buffer. But that work somehow got bogged down in review and I never got around to drive it upstream. If there is enough interest I may take up the work again.
Hi @tstuefe
With your proposal, if we have a section that looks like this:
* event log instance 0
* event log instance 1
* event log instance 2
* other type of printing
and we crash at "event log instance 1", then we completely skip printing "event log instance 2" and "other type of printing". My thinking was that that was unfortunate, and I wanted to localize the problem. We often have this problem that one hs_err crash hides important information that was supposed to be written later, because a lot of the code in the hs_err printing isn't hardened and the hs_err sections are on a too high level (IMHO).
With that said, I don't mind skipping the trying to take the lock if we are printing the hs_err file. Do others agree with Thomas?
-------------
PR: https://git.openjdk.java.net/jdk/pull/1408
More information about the hotspot-dev
mailing list