RFR: 8250637: UseOSErrorReporting times out (on Mac and Linux) [v2]

Thomas Stuefe stuefe at openjdk.java.net
Wed Oct 28 07:02:17 UTC 2020


On Tue, 27 Oct 2020 16:05:44 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:

> I think our differences of opinion all hinges on what happens when code returns from its signal handler:
> 
> #1 Does it resume and actually redoes the exact same instruction? (which this time may succeed?)
> #2 Does it resume and raise the exact same signal? (exhibits the exact same behavior as original?)
> #3 Does it resume past the instruction that originally caused the exception?
> 
> You and Thomas seem to believe that it's #3 (or is that #1 ?), I thought (based on https://developer.apple.com/forums/thread/113742 ) that it was more like #2.
> 

No, not #3. 

#2 is an interesting thought, but I don't think so. If it were so, our polling page mechanism would not work: we trigger a SEGV by accessing a poisoned page; signal handling unpoisons the page and returns; the same load is then re-executed, but since the page is now unpoisoned, no fault happens. This, btw, is an excellent example of a case where returning from a signal handler does _not_ re-raise the same signal. It is on purpose in this case, but our point is that the same thing may happen accidentally.
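
To make the polling page argument concrete, here is a minimal sketch in C. It is not HotSpot code: the names `poll_page`, `on_fault` and `run_poll_page_demo` are made up for illustration, and it assumes a POSIX system (Linux or macOS) where `mprotect` can be called from a signal handler in practice, even though POSIX does not list it as async-signal-safe. The handler unprotects the page and simply returns; the faulting load is then re-executed and succeeds, so the handler should run exactly once.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS and sigaction on glibc */
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-ins for illustration; not HotSpot's data structures. */
static volatile sig_atomic_t fault_count = 0;
static void *poll_page = NULL;
static size_t page_size = 0;

/* Fault handler: "unpoison" the page and return normally. */
static void on_fault(int sig, siginfo_t *info, void *ucontext) {
    (void)sig;
    (void)ucontext;
    char *addr = (char *)info->si_addr;
    char *base = (char *)poll_page;
    if (addr < base || addr >= base + page_size) {
        _exit(2);                /* unrelated fault: give up */
    }
    fault_count++;
    /* Assumption: mprotect works from a handler on this platform. */
    if (mprotect(poll_page, page_size, PROT_READ) != 0) {
        _exit(3);
    }
    /* Returning reinstates the saved context; the load is retried. */
}

/* Returns how often the handler ran, or -1 on setup failure. */
int run_poll_page_demo(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    poll_page = mmap(NULL, page_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (poll_page == MAP_FAILED) {
        return -1;
    }

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_fault;
    /* Some systems report protection faults as SIGBUS, so catch both. */
    if (sigaction(SIGSEGV, &sa, &old_segv) != 0 ||
        sigaction(SIGBUS, &sa, &old_bus) != 0) {
        munmap(poll_page, page_size);
        return -1;
    }

    fault_count = 0;
    volatile char *p = (volatile char *)poll_page;
    char value = *p;             /* faults once, then succeeds on retry */

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(poll_page, page_size);

    /* A fresh anonymous mapping reads as zero. */
    return value == 0 ? (int)fault_count : -1;
}
```

Calling `run_poll_page_demo()` from a small `main` should return 1 under these assumptions: one fault, one handler invocation, and no second signal after the handler returns.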

I think what happens is that the register contents (the crash context) that were active when the thread took the first fault get reinstated after the signal handler returns, and we resume processing with the same state. So all registers are the same, including the pc. We would attempt to reload the instruction from the same address and re-execute it. But the underlying memory could have changed in the meantime. For example, the address the pc points to had been invalid and is now valid (e.g. because of a bug in the JIT), or the instruction was a mov/store whose destination had been invalid and is now valid, and so on. So there are conceivable scenarios where we may not crash a second time.

> I will continue this investigation in JDK-8237727
> 
> Here I will not be as ambitious and I will simply fix the problem at hand: i.e. address the 2 minutes hang by disabling the option for macOS and Linux.

This is reasonable, thank you.

-------------

PR: https://git.openjdk.java.net/jdk/pull/813

