RFR(s) (JDK10): 8166944: Hanging Error Reporting steps may lead to torn error logs.
David Holmes
david.holmes at oracle.com
Fri Feb 3 02:01:18 UTC 2017
Hi Thomas,
Thanks for picking this back up. Overall this looks okay. I have a few
specific comments/suggestions below to add to Chris's.
David
-----
src/os/windows/vm/vmError_windows.cpp
Typo: canceleation
---
src/share/vm/utilities/vmError.cpp
As previously discussed Atomic::load/store for jlong do not depend on
supports_cx8. The commentary in atomic.hpp should really refer to
'atomic read-modify-write operations on jlongs' - as every platform must
support atomic 64-bit load/store to support Java volatile long/double.
So set_to_now and get_timestamp can be simplified, or even removed
completely.
Does the timestamp really need to be the system time in millis or do you
just want to record relative elapsed time? If the latter then I suggest
using javaTimeNanos()/100000 to avoid potential glitches with the system
time changing.
382 // TestUnresponsiveErrorHandler: three times to trigger first a
step timeout, then
383 // the global error reporting timeout.
I'm a little confused as to why three steps are needed to test two things??
1217 reporting_started();
1218 set_to_now(&reporting_start_time);
I think the setting of reporting_start_time should be internalized in
reporting_started.
Nit: Lets -> Let's (it is a contraction of 'let us')
1272 st->print_cr("------ Timout during error reporting after "
INT64_FORMAT "ms. ------",
Typo: timout
Also I gather we are potentially racing with the WatcherThread here, so
no guarantee these final print statements will be seen?
1502 VMError::interrupt_reporting_thread();
Nit: you're inside a VMError function so shouldn't need the VMError prefix.
---
src/share/vm/utilities/vmError.hpp
Typo: dependend
+ // Called by WatcherThread to check if the currently running error
reporting did timeout.
+ // Returns true if reporting did hit the global ErrorLogTimeout.
This doesn't quite read right to me. How about:
// Called by the WatcherThread to check if error reporting has timed-out.
// Returns true if error reporting has not completed within the
ErrorLogTimeout limit
---
test/runtime/ErrorHandling/TimeoutInErrorHandlingTest.java
I think using @requires and runtime Platform checks is redundant. I
prefer @requires for both debug build testing and not-windows.
On 2/02/2017 9:59 PM, Thomas Stüfe wrote:
> Dear all,
>
> please review (or re-review) this enhancement. It was already reviewed
> extensively for JDK9 in Oct 16, but pushed back to JDK10. Now that JDK10
> repo is open, I would like to close this issue.
>
> In short, this enhancement lets the hotspot cancel hanging error
> reporting steps (when writing the hs-err file), in order to allow
> subsequent error reporting steps to run. In our JVM variant, this has
> been active for a long time and has proven to make error reporting more
> stable in the face of deadlocks or file system stalls.
>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8166944
>
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8166944-Hanging-Error-Reporting/jdk10-webrev.02/webrev/
>
> Old Diskussion for JDK9:
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-October/021582.html
>
> The webrev is unchanged from version 2 discussed in october, just got
> rebased atop of JDK10/hs.
>
> Thank you,
>
> Kind Regards, Thomas
More information about the hotspot-runtime-dev
mailing list