RFR (S): 8028280 ParkEvent leak when running modified runThese which only loads classes

David Simms david.simms at oracle.com
Tue Jan 21 07:59:02 PST 2014


Thanks Frederic !

On 21/01/2014 10:52 a.m., Frederic Parain wrote:
> Looks good to me.
>
> Fred
>
> On 01/15/2014 10:37 AM, David Simms wrote:
>>
>> Still need a 2nd reviewer (doesn't have to be capital 'R') for this fix.
>>
>> Cheers
>> /David Simms
>>
>> On 11/29/2013 08:30 AM, David Simms wrote:
>>>
>>> Thanks for the review !
>>>
>>> Inline...
>>>
>>> On 29/11/2013 2:06 a.m., David Holmes wrote:
>>>> Hi David,
>>>>
>>>> Apologies for the delay in getting back to this, and even more so
>>>> that I won't be able to respond further as I'll be on vacation the
>>>> next two weeks. Fortunately no major issues that would need further
>>>> discussion :)
>>>>
>>> It's okay, we took the bug of the critical list, it's actually hard to
>>> cause an actual OOM.
>>>> src/os/*/vm/os_*.cpp
>>>>
>>>> So ... os::naked_sleep is now morphed into os::naked_short_sleep and
>>>> is always really "naked" ie it uses an OS call. Ok.
>>>>
>>>> Why does the windows version not have:
>>>>
>>>>  assert(ms < 1000, "Long sleep will break Thread.interrupt()
>>>> semantics");
>>>>
>>>> ?
>>> Will fix.
>>>>
>>>> What is your concern with Thread.interrupt here? These sleeps aren't
>>>> involved in Thread.sleep; and the parking logic only looks for
>>>> interrupts at specific points (the existing SpinAcquire code will not
>>>> return if the thread is interrupted). The main issue with long
>>>> blocking is that the thread is not undergoing a state transition and
>>>> so a safepoint will not be reachable until the sleep is over (and we
>>>> reach a point where we check for a safepoint). I'd be worried that a
>>>> 1 second sleep might cause a problem with safepoints. Of course this
>>>> depends on what the caller does before using naked_short_sleep.
>>>>
>>> My concern is someone else looking to use naked_short_sleep() outside
>>> the current context, must be aware that this only for short periods.
>>> I.e. it is in no way interruptible. I can change the comment to be
>>> more generic, i.e. what I just said.
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/thread.hpp
>>>>
>>>>     // Low-level leaf-lock primitives used to implement 
>>>> synchronization
>>>>     // and native monitor-mutex infrastructure.
>>>>     // Not for general synchronization use.
>>>> +   // Direct naked OS primitives, safe to use standalone in lock impl
>>>>
>>>> Your comment really only applies to SpinAcquire, not the set of
>>>> methods that the original comment block refers to.
>>>
>>> Whoops, thought that comment referred to SpinAcquire and SpinRelease,
>>> can leave it in for mux acquire and release of course.
>>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/park.cpp
>>>>
>>>>  91   // Note that if we didn't have the TSM/immortal constraint, then
>>>>  92   // when reattaching, above, we could trim the list.
>>>>
>>>> What is the "reattaching, above" you are referring to? This comment
>>>> seem more directed to the old code than the new ??
>>> Ah, old comment, thanks for spotting that. Although it's still true
>>> ",/above,/" needs to go.
>>>>
>>>> ---
>>>>
>>>> Overall this seems okay but I do wonder about the change in
>>>> performance profile. The scenario that caused the leak indicates a
>>>> high-level of contention. Have any of the synchronization benchmarks
>>>> been run as a sanity check?
>>>
>>> I can talk with the local performance engineer and check for
>>> regressions. Shouldn't be a problem semantically, as I'm modifying
>>> slow-path...of fairly critical code - will test.
>>>
>>> The scenario was a targeted stress test, multi-threaded class-loading
>>> from the same jar. Not quite "real-world", but since the test exist,
>>> will re-run with performance in mind.
>>>
>>> Thanks again !
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 26/11/2013 8:17 PM, David Simms wrote:
>>>>>
>>>>> Updated patch to avoid Solaris RT link requirement...
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8028280
>>>>> Webrev: http://cr.openjdk.java.net/~dsimms/8028280/
>>>>> <http://cr.openjdk.java.net/%7Edsimms/8028280/>
>>>>>
>>>>> Rerun testing including vm.quick-pcl.testlist on all platforms.
>>>>>
>>>>>
>>>>> On 11/21/2013 04:36 PM, David Simms wrote:
>>>>>>
>>>>>> Sorry,
>>>>>>
>>>>>> Some test failures due to Solaris ld not finding nanosleep...
>>>>>>
>>>>>>
>>>>>> On 11/21/2013 04:03 PM, David Simms wrote:
>>>>>>>
>>>>>>> Small fix for resource leakage with concurrent free list handling
>>>>>>> down in park.cpp
>>>>>>>
>>>>>>> I did modify Thread::SpinAcquire() to be independent of park event
>>>>>>> (naked OS functions only), since this method doesn't need to follow
>>>>>>> Thread.interrupt() semantics (park doesn't make sense here).
>>>>>>>
>>>>>>> This allowed a simple spin lock around the resources without 
>>>>>>> worrying
>>>>>>> about the fact that park is used in runtime locks.
>>>>>>>
>>>>>>> Perhaps SpinAcquire/Release could be spit out to a separate
>>>>>>> stand-alone util class, a some point in the future.
>>>>>>>
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8028280
>>>>>>> Webrev: http://cr.openjdk.java.net/~dsimms/8028280/
>>>>>>> <http://cr.openjdk.java.net/%7Edsimms/8028280/>
>>>>>>>
>>>>>>> Testing: vm testbase, BigApps (30 mins), jtreg (all 64bit 
>>>>>>> platforms)
>>>>>>> (...almost completed)
>>>>>>>
>>>>>>>
>>>>>>> /David Simms
>>>>>>
>>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list