RFR(XXS): 8077392 and 8131715 - fix for hang in Contended Locking "fast enter" bucket
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Mar 23 16:14:09 UTC 2016
On 3/23/16 4:55 AM, Paul Sandoz wrote:
> Hi Dan,
>
> Well done on finding the root cause and fix for this “bug from hell”!
Thanks! It has been an "interesting" bug hunt!
> Do you have any wisdom to share on the techniques and approach taken to debug and find the cause? These kinds of bugs are not easy, so it’s often useful to share such advice.
In addition to the fix itself, this bug hunt has also resulted in
creation of a wiki called "An Introduction to Java Monitors". The
sections are not yet complete, but so far I've written (and
internally presented):
Section 1 - Stack Locks
Section 2 - Inflated Locks
Section 3 - Slow Java Monitor Enter
There will likely be two more sections, but I haven't written
those yet. Once we've reviewed and modified the content internally,
we'll have to see how to get this kind of info out into OpenJDK.
Another result of this bug hunt will be some debug/trace kits. I
created separate debug/tracing for the different parts of Java
Monitors (enter, wait, notify, exit) and I should be able to
extract this stuff into "kit" form for other folks to use. No
one else should have to figure out how to add tracing to the
MacroAssembler code!
> Amy and I spent a bit of time just getting a test case that could reproduce in a “reasonable” period of time with enough confidence it was something in HotSpot (bisecting builds), and then we threw it over the fence to you :-)
You guys gave me a reproducible test case and for that I'm very
grateful! So many of the weird bugs that I hunt don't have an
initial reproducible test case so I have often spent a lot of
time just trying to reproduce the rare, strange, and unusual...
Dan
>
> Paul.
>
>> On 22 Mar 2016, at 21:41, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>>
>> Greetings,
>>
>> I have fixes for the following two bugs:
>>
>> JDK-8077392 Stream fork/join tasks occasionally fail to complete
>> https://bugs.openjdk.java.net/browse/JDK-8077392
>>
>> JDK-8131715 backout the fix for JDK-8079359 when JDK-8077392 is fixed
>> https://bugs.openjdk.java.net/browse/JDK-8131715
>>
>> Both fixes are very, very small and will be bundled together in the
>> same changeset for obvious reasons.
>>
>> Here is the webrev URL:
>>
>> http://cr.openjdk.java.net/~dcubed/8077392_8131715-webrev/0-jdk9-hs-rt/
>>
>> While the fix for JDK-8077392 is a simple 1-liner, the explanation of
>> the race is much, much longer. I've attached the detailed evaluation
>> to this RFR; it is a copy of the same note that I added to
>> https://bugs.openjdk.java.net/browse/JDK-8077392, but the attached
>> copy has all the indentation white space intact. I don't know why
>> JBS likes to reformat the notes, but it does... :-(
>>
>> Testing:
>>
>> - the original failing test is running in a parallel stress config
>> on my Solaris X64 server; just under 23 hours and just under
>> 3000 iterations without a failure in either instance; I'm planning
>> to let the stress run go for at least 72 hours.
>> - RT/SVC nightly equivalent (in progress)
>>
>> As always, comments, suggestions and/or questions are welcome.
>>
>> Dan
>> <eval_note5.txt>
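
For readers who want to try the kind of stress run described under "Testing" above (repeat a test many times and watch for an iteration that never completes), here is a minimal sketch. It is not the original failing test from JDK-8077392. The class name HangStress, the workload, the iteration count, and the timeout are all hypothetical choices made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.LongStream;

// Hypothetical stress harness; not the test used for JDK-8077392.
final class HangStress {
    // Assumed workload: a parallel stream reduction, which runs on the
    // common fork/join pool.
    static long workload() {
        return LongStream.rangeClosed(1, 100_000).parallel().sum();
    }

    // Runs the workload repeatedly. Returns the index of the first
    // iteration that did not finish within the timeout, or -1 if every
    // iteration completed.
    static int firstHungIteration(int iterations, long timeoutSeconds) {
        for (int i = 0; i < iterations; i++) {
            CompletableFuture<Long> f =
                CompletableFuture.supplyAsync(HangStress::workload);
            try {
                f.get(timeoutSeconds, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                return i; // possible hang: the task never completed in time
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int hung = firstHungIteration(1000, 60);
        System.out.println(hung < 0
            ? "no hang observed"
            : "possible hang at iteration " + hung);
    }
}
```

A timeout only signals a suspected hang. Confirming one still needs a thread dump (for example via jstack) taken while the iteration is stuck.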