RFR (S): 8228857: Refactor PlatformMonitor into PlatformMutex and PlatformMonitor

David Holmes david.holmes at oracle.com
Tue Aug 6 11:06:30 UTC 2019


Hi Per,

On 6/08/2019 7:12 pm, Per Liden wrote:
> Hi David,
> 
> On 8/6/19 10:06 AM, David Holmes wrote:
>> Hi Per,
>>
>> On 6/08/2019 5:42 pm, Per Liden wrote:
>>> Hi David,
>>>
>>> On 8/2/19 1:03 AM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8228857
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8228857/webrev/
>>>
>>> src/hotspot/os/posix/os_posix.hpp
>>> ---------------------------------
>>> As far as I recall, the macos bug is only related to the lifecycle of 
>>> condvars (not mutexes). Is that correct?
>>
>> Not quite. The issue is the allocation of a pthread_mutex_t at an 
>> address that previously held a pthread_condvar_t.
>>
>>> If so, PlatformMutex could just always own a plain pthread_mutex_t, 
>>> and the PLATFORM_MONITOR_IMPL_INDIRECT stuff could be moved to 
>>> PlatformMonitor to and only deal with the lifecycle of the 
>>> pthread_cond_t. That way the PlatformMutex would not carry the 
>>> burden/cost for working around the condvar bug, which would make 
>>> PlatformMutex even more attractive as a replacement/backing for ZLocks.
>>> How does that sound?
>>
>> I deliberately took the simplest path to just leave the macOS 
>> workaround as is, knowing it means we still pay for the condvar slot 
>> on that platform. :)
>>
>> Because the bug involves both types I don't think I can constrain the 
>> workaround to only the PlatformMonitor class, as I'd need to control 
>> the allocation in the PlatformMutex as well. I could do that with 
>> defining and instantiating a different Impl type in PlatformMonitor to 
>> be installed by PlatformMutex, but the overall complexity increase 
>> doesn't seem like a good trade off to me. Perhaps I've missed a simple 
>> way to deal with this?
> 
> Ok. So we can't have a mutex_t at an address where a cond_t has lived in 
> the past. We guarantee that this can not happen by never actually 
> deallocating any cond_t and instead have our own freelist, essentially 
> protecting those cond_t addresses from ever being reused for any other 
> type. Even if this bug involves both types, just constraining cond_t 
> should be enough to solve the problem, right?
> 
> Just to illustrate what I'm thinking, I adjusted your patch:
> http://cr.openjdk.java.net/~pliden/8228857/webrev.0
> 
> And here's just the diff against your patch:
> http://cr.openjdk.java.net/~pliden/8228857/webrev.0-diff
> 
> This keeps the code pretty similar, but puts the cost on PlatforMonitor. 
> Can you spot a problem with this?

First thanks for taking the time to do this - much appreciated.

I had considered doing something like this, but then was concerned as to 
whether the pthread_mutex_t could by chance end up in a location that 
had once been a pthread_cond_t and so still hit the bug. But I think I 
was overthinking this as the same could be true for the allocation of 
the Impl instances.

I'll take your proposed changes for a test run and in particular see if 
I can re-run the tests that demonstrated the kernel bug.

Thanks again,
David
-----

> cheers,
> Per
> 
>>
>> Thanks,
>> David
>>
>>> cheers,
>>> Per
>>>
>>>>
>>>> As suggested by Per Liden when PlatformMonitor was introduced by 
>>>> JDK-8210832 we should refactor PlatformMonitor into a simpler 
>>>> PlatformMutex extended by PlatformMonitor. That would potentially 
>>>> allow other synchronization objects e.g. Zlocks, to be replaced by a 
>>>> suitable PlatformX class, and allows us to redefine JVM_RawMonitor 
>>>> support to use it (forthcoming change).
>>>>
>>>> The refactoring would be obvious and simple if not for the macOS 
>>>> PThread allocation bug workaround. I had to change the Posix variant 
>>>> so that we only use the Impl helper class on macOS, so that we can 
>>>> separate the allocation of the pthread_mutex_t and pthread_cond_t. 
>>>> For macOS both get allocated in the PlatformMutex.
>>>>
>>>> There are not actual usage changes in this RFE.
>>>>
>>>> Testing: mach5 tiers 1-3
>>>>
>>>> Thanks,
>>>> David
>>>> -----


More information about the hotspot-dev mailing list