RFR (S): 8028280 ParkEvent leak when running modified runThese which only loads classes

Wed Mar 5 17:37:50 UTC 2014

For the record, we're backporting this fix from JDK9 -> JDK8u-hs-dev.

Both Frederic Parain and David Holmes (the original reviewers) have
reviewed the backport. Karen Kinnear (acorn) and I have also reviewed
the backport.

There are no content changes between the versions; only line number
and context diff anchor changes.

Dan

On 1/21/14 8:59 AM, David Simms wrote:
>
> Thanks Frederic !
>
> On 21/01/2014 10:52 a.m., Frederic Parain wrote:
>> Looks good to me.
>>
>> Fred
>>
>> On 01/15/2014 10:37 AM, David Simms wrote:
>>>
>>> Still need a 2nd reviewer (doesn't have to be capital 'R') for this 
>>> fix.
>>>
>>> Cheers
>>> /David Simms
>>>
>>> On 11/29/2013 08:30 AM, David Simms wrote:
>>>>
>>>> Thanks for the review !
>>>>
>>>> Inline...
>>>>
>>>> On 29/11/2013 2:06 a.m., David Holmes wrote:
>>>>> Hi David,
>>>>>
>>>>> Apologies for the delay in getting back to this, and even more so
>>>>> that I won't be able to respond further as I'll be on vacation the
>>>>> next two weeks. Fortunately no major issues that would need further
>>>>> discussion :)
>>>>>
>>>> It's okay, we took the bug of the critical list, it's actually hard to
>>>> cause an actual OOM.
>>>>> src/os/*/vm/os_*.cpp
>>>>>
>>>>> So ... os::naked_sleep is now morphed into os::naked_short_sleep and
>>>>> is always really "naked" ie it uses an OS call. Ok.
>>>>>
>>>>> Why does the windows version not have:
>>>>>
>>>>>  assert(ms < 1000, "Long sleep will break Thread.interrupt()
>>>>> semantics");
>>>>>
>>>>> ?
>>>> Will fix.
>>>>>
>>>>> What is your concern with Thread.interrupt here? These sleeps aren't
>>>>> involved in Thread.sleep; and the parking logic only looks for
>>>>> interrupts at specific points (the existing SpinAcquire code will not
>>>>> return if the thread is interrupted). The main issue with long
>>>>> blocking is that the thread is not undergoing a state transition and
>>>>> so a safepoint will not be reachable until the sleep is over (and we
>>>>> reach a point where we check for a safepoint). I'd be worried that a
>>>>> 1 second sleep might cause a problem with safepoints. Of course this
>>>>> depends on what the caller does before using naked_short_sleep.
>>>>>
>>>> My concern is someone else looking to use naked_short_sleep() outside
>>>> the current context, must be aware that this only for short periods.
>>>> I.e. it is in no way interruptible. I can change the comment to be
>>>> more generic, i.e. what I just said.
>>>>
>>>>> ---
>>>>>
>>>>> src/share/vm/runtime/thread.hpp
>>>>>
>>>>>     // Low-level leaf-lock primitives used to implement 
>>>>> synchronization
>>>>>     // and native monitor-mutex infrastructure.
>>>>>     // Not for general synchronization use.
>>>>> +   // Direct naked OS primitives, safe to use standalone in lock 
>>>>> impl
>>>>>
>>>>> Your comment really only applies to SpinAcquire, not the set of
>>>>> methods that the original comment block refers to.
>>>>
>>>> Whoops, thought that comment referred to SpinAcquire and SpinRelease,
>>>> can leave it in for mux acquire and release of course.
>>>>>
>>>>> ---
>>>>>
>>>>> src/share/vm/runtime/park.cpp
>>>>>
>>>>>  91   // Note that if we didn't have the TSM/immortal constraint, 
>>>>> then
>>>>>  92   // when reattaching, above, we could trim the list.
>>>>>
>>>>> What is the "reattaching, above" you are referring to? This comment
>>>>> seem more directed to the old code than the new ??
>>>> Ah, old comment, thanks for spotting that. Although it's still true
>>>> ",/above,/" needs to go.
>>>>>
>>>>> ---
>>>>>
>>>>> Overall this seems okay but I do wonder about the change in
>>>>> performance profile. The scenario that caused the leak indicates a
>>>>> high-level of contention. Have any of the synchronization benchmarks
>>>>> been run as a sanity check?
>>>>
>>>> I can talk with the local performance engineer and check for
>>>> regressions. Shouldn't be a problem semantically, as I'm modifying
>>>> slow-path...of fairly critical code - will test.
>>>>
>>>> The scenario was a targeted stress test, multi-threaded class-loading
>>>> from the same jar. Not quite "real-world", but since the test exist,
>>>> will re-run with performance in mind.
>>>>
>>>> Thanks again !
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 26/11/2013 8:17 PM, David Simms wrote:
>>>>>>
>>>>>> Updated patch to avoid Solaris RT link requirement...
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8028280
>>>>>> Webrev: http://cr.openjdk.java.net/~dsimms/8028280/
>>>>>> <http://cr.openjdk.java.net/%7Edsimms/8028280/>
>>>>>>
>>>>>> Rerun testing including vm.quick-pcl.testlist on all platforms.
>>>>>>
>>>>>>
>>>>>> On 11/21/2013 04:36 PM, David Simms wrote:
>>>>>>>
>>>>>>> Sorry,
>>>>>>>
>>>>>>> Some test failures due to Solaris ld not finding nanosleep...
>>>>>>>
>>>>>>>
>>>>>>> On 11/21/2013 04:03 PM, David Simms wrote:
>>>>>>>>
>>>>>>>> Small fix for resource leakage with concurrent free list handling
>>>>>>>> down in park.cpp
>>>>>>>>
>>>>>>>> I did modify Thread::SpinAcquire() to be independent of park event
>>>>>>>> (naked OS functions only), since this method doesn't need to 
>>>>>>>> follow
>>>>>>>> Thread.interrupt() semantics (park doesn't make sense here).
>>>>>>>>
>>>>>>>> This allowed a simple spin lock around the resources without 
>>>>>>>> worrying
>>>>>>>> about the fact that park is used in runtime locks.
>>>>>>>>
>>>>>>>> Perhaps SpinAcquire/Release could be spit out to a separate
>>>>>>>> stand-alone util class, a some point in the future.
>>>>>>>>
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8028280
>>>>>>>> Webrev: http://cr.openjdk.java.net/~dsimms/8028280/
>>>>>>>> <http://cr.openjdk.java.net/%7Edsimms/8028280/>
>>>>>>>>
>>>>>>>> Testing: vm testbase, BigApps (30 mins), jtreg (all 64bit 
>>>>>>>> platforms)
>>>>>>>> (...almost completed)
>>>>>>>>
>>>>>>>>
>>>>>>>> /David Simms
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
>
>
>