RFR(XS): 8188872: runtime/ErrorHandling/TimeoutInErrorHandlingTest.java fails intermittently

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Jun 3 19:45:51 UTC 2019


Thomas,

Thanks for the review!

Dan


On 6/3/19 3:44 PM, Thomas Stüfe wrote:
> Hi Dan,
>
> I am fine with your patch, consider it reviewed from my side. As I 
> wrote off-list, I am a bit unhappy with the delay-the-timer-start-part 
> but I see your point and have no better idea. Maybe we can think of 
> something better later. For now, I do not wish to drag this out, you 
> suffered enough :)
>
> Thanks for your perseverance!
>
> Cheers, Thomas
>
> On Fri, May 31, 2019 at 11:37 PM Daniel D. Daugherty 
> <daniel.daugherty at oracle.com> wrote:
>
>     David H has reviewed this. I still need a second reviewer...
>
>     Dan
>
>
>     On 5/29/19 8:42 PM, Daniel D. Daugherty wrote:
>     > Ping! Anyone out there? :-)
>     >
>     > Dan
>     >
>     > On 5/28/19 8:12 PM, Daniel D. Daugherty wrote:
>     >> Greetings,
>     >>
>     >> I have a fix for the following longstanding bug:
>     >>
>     >>     JDK-8188872 runtime/ErrorHandling/TimeoutInErrorHandlingTest.java
>     >>     fails intermittently
>     >> https://bugs.openjdk.java.net/browse/JDK-8188872
>     >>
>     >> I've included Thomas Stüfe directly since I'm modifying his code...
>     >>
>     >> This fix includes changes to the error handling code, the VM parts
>     >> of the test (-XX:+TestUnresponsiveErrorHandler) and the test itself.
>     >> The changes themselves are small, but the reasons are complicated
>     >> so a detailed explanation is required.
>     >>
>     >> Summary of the changes:
>     >>
>     >> - src/hotspot/share/utilities/vmError.cpp
>     >>   - add VMError::clear_step_start_time() and call it from the
>     >>     error reporting END macro.
>     >>     - VMError::report() is called twice: first, to generate a summary
>     >>       for stdout and second, to generate hs_err_pid output.
>     >>     - Adding clear_step_start_time() prevents interrupt_reporting_thread()
>     >>       from interrupting the error reporting thread between the two calls to
>     >>       VMError::report().
>     >>     - This solves the problem where hs_err_pid file creation gets interrupted
>     >>       and the hs_err_pid file ends up being created in /tmp/hs_err_pid...
>     >>   - add a STEP in VMError::report() for setting up the 'start time' for
>     >>     the TestUnresponsiveErrorHandler test
>     >>     - There is a corresponding change in VMError::report_and_die() that
>     >>       skips the call to record_reporting_start_time() when we are
>     >>       executing TestUnresponsiveErrorHandler.
>     >>     - This solves the problem where the error reporting thread is exposed
>     >>       to interrupt_reporting_thread() calls before it has reached the
>     >>       first STEP in VMError::report().
>     >>   - change VMError::check_timeout() to only call interrupt_reporting_thread()
>     >>     once per timeout detection for either a total reporting timeout or a
>     >>     step timeout:
>     >>     - check_timeout() is called by the WatcherThread once per second once
>     >>       it determines that error reporting has started. This change solves
>     >>       the problem where a timeout is detected, the error reporting thread
>     >>       takes longer than a second to do its work so the WatcherThread calls
>     >>       check_timeout() (and interrupt_reporting_thread()) again which
>     >>       restarts the STEP we were on from the beginning.
>     >> - src/hotspot/share/utilities/vmError.hpp
>     >>   - add clear_step_start_time()
>     >> - test/hotspot/jtreg/runtime/ErrorHandling/TimeoutInErrorHandlingTest.java
>     >>   - add support for '-Dverbose=true' to get more verbose test output
>     >>   - Default ERROR_LOG_TIMEOUT is 16 seconds; Solaris sets it to 3X.
>     >>   - dump the cmd output if we can't find the 'hs_err_pid' file
>     >>   - dump the cmd output if we can't open the 'hs_err_pid' file
>     >>   - dump the hs_err_pid file if we fail to match the patterns
>     >>
>     >> Webrev URL:
>     >> http://cr.openjdk.java.net/~dcubed/8188872-webrev/0-for-jdk-jdk13/
>     >>
>     >> Testing: Mach5 Tier[1-5]
>     >>          Included the fix in my latest round of 8153224 testing
>     >>          on Solaris-X64 where this bug reproduces quite a bit.
>     >>
>     >> Thanks, in advance, for any comments, suggestions, or questions.
>     >>
>     >> Dan
>     >>
>     >
>     >
>
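
For readers following the check_timeout() change quoted above, the
once-per-timeout behavior can be modeled with a small standalone C++
sketch. This is not the code from vmError.cpp: TimeoutModel, its fields
and its methods are hypothetical names chosen for illustration, and the
budgets used in the note below are made-up values. It assumes
whole-second timestamps greater than zero, with 0 meaning "not set".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a watcher-side timeout check.
// Names and structure are illustrative; they are not taken from vmError.cpp.
struct TimeoutModel {
  int64_t total_timeout_s;   // assumed budget for the whole report
  int64_t step_timeout_s;    // assumed budget for a single step
  int64_t reporting_start;   // 0 means "reporting not started"
  int64_t step_start;        // 0 means "no step is being timed"
  bool reporting_timed_out;
  bool step_timed_out;
  int interrupts_sent;       // stands in for interrupting the reporter

  TimeoutModel(int64_t total_s, int64_t step_s)
    : total_timeout_s(total_s), step_timeout_s(step_s),
      reporting_start(0), step_start(0),
      reporting_timed_out(false), step_timed_out(false),
      interrupts_sent(0) {}

  void begin_reporting(int64_t now) {
    assert(now > 0);
    reporting_start = now;
  }

  // Start timing a step and re-arm the per-step interrupt.
  void begin_step(int64_t now) {
    assert(now > 0);
    step_start = now;
    step_timed_out = false;
  }

  // Clear the step start time so the gap between steps (or between two
  // reporting passes) is never treated as a hanging step.
  void end_step() { step_start = 0; }

  // Called by the watcher once per tick. Returns true only for a total timeout.
  bool check(int64_t now) {
    if (reporting_start > 0 && !reporting_timed_out &&
        now - reporting_start >= total_timeout_s) {
      reporting_timed_out = true;  // remember, so later ticks do not re-interrupt
      ++interrupts_sent;
      return true;
    }
    if (step_start > 0 && !step_timed_out &&
        now - step_start >= step_timeout_s) {
      step_timed_out = true;       // one interrupt per timed-out step
      ++interrupts_sent;
    }
    return false;
  }
};
```

With TimeoutModel m(16, 4), begin_reporting(1) and begin_step(1),
calling check() once per second for t = 2..10 sends a single interrupt
(at t = 5) rather than one per tick. After end_step(), later ticks send
nothing more until the total budget is reached.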


