RFR (S): 8228857: Refactor PlatformMonitor into PlatformMutex and PlatformMonitor
Per Liden
per.liden at oracle.com
Wed Aug 7 13:50:01 UTC 2019
Hi Kim/David,
On 8/7/19 12:10 AM, David Holmes wrote:
[...]
>> It is true that only a pthread_cond_t that has been allocated and
>> later destroyed can cause problems. However, we (the VM) might not be
>> the only ones doing that. Native code, library code, etc. might also
>> do that. Even the current mechanism doesn't fully protect us from
>> that. This was something that Patricio and I recognized when we did
>> the MacOSX workaround; it's not completely bulletproof.
>>
>> However, with the current mechanism of allocating pthread_mutex/cond_t
>> pairs and freelisting them, we highwater pretty quickly. We don't have
>> many places that actually release them. (I don't remember whether
>> there's anything other than the task terminator mechanism.) And once
>> we've highwatered, the VM is safe from any further condvar
>> destructions by non-VM code. So the only risk is from native code
>> using pthread_cond_t's before that highwater mark is reached.
>>
>> If the VM only freelists its pthread_cond_t's and not its
>> pthread_mutex_t's, instead allocating and deleting the latter normally
>> (as Per suggested and I had originally intended to suggest), then the
>> VM's pthread_mutex_t's are subject to getting scrambled by ongoing
>> non-VM pthread_cond_t usage.
>
> Yes, you are right. I was only thinking about VM usage, but if the VM is
> hosted then mutexes/condvars could be used a lot, and not using the
> freelist for the mutex would greatly increase the risk of reintroducing
> the bug.
>
> As you say, the freelist approach itself is not bulletproof, and I'm now
> worried about what complex hosting applications may encounter! :( I
> think we may need to try to engage with Apple to get the full details
> on this bug so that we know what our exposure may be. I hope they fixed
> it with an interim patch and not only with a full release update!
>
> So in terms of this RFE I leave things as they stand, with the
> mutex/cond freelist being used in PlatformMutex. I hope the slight space
> wastage is not significant enough to mean that Zlocks will not consider
> using PlatformMutex.
It would be really interesting to know if this has been fixed by Apple,
and hopefully also backported to the macOS versions we support. On
systems that have this bug, we could see all kinds of random VM memory
corruption (caused by e.g. native code/libraries using condvars), which we
can't fix. Right?
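(For context, my understanding of the freelist mechanism discussed above is
roughly the following; this is just a sketch of the idea, not the actual
HotSpot code, and all names here are made up. Mutex/cond pairs are never
returned to the C heap, only parked for reuse, so their memory is never
recycled by the allocator:)

  #include <pthread.h>

  struct MutexCondPair {
    pthread_mutex_t _mutex;
    pthread_cond_t  _cond;
    MutexCondPair*  _next;   // freelist link
  };

  static MutexCondPair*  _freelist = NULL;
  static pthread_mutex_t _freelist_lock = PTHREAD_MUTEX_INITIALIZER;

  static MutexCondPair* allocate_pair() {
    pthread_mutex_lock(&_freelist_lock);
    MutexCondPair* pair = _freelist;
    if (pair != NULL) {
      _freelist = pair->_next;          // reuse a parked pair
    }
    pthread_mutex_unlock(&_freelist_lock);
    if (pair == NULL) {
      pair = new MutexCondPair();       // first use: allocate and initialize
      pthread_mutex_init(&pair->_mutex, NULL);
      pthread_cond_init(&pair->_cond, NULL);
    }
    return pair;
  }

  static void release_pair(MutexCondPair* pair) {
    // Note: no pthread_cond_destroy()/delete here; the pair is only parked,
    // so its memory is never handed back to the allocator.
    pthread_mutex_lock(&_freelist_lock);
    pair->_next = _freelist;
    _freelist = pair;
    pthread_mutex_unlock(&_freelist_lock);
  }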
May I propose a third alternative, which keeps the best of both worlds:
don't mix the two types, i.e. don't let PlatformMonitor inherit from
PlatformMutex. We'd keep exactly what we have today and just add a
PlatformMutex type, which for POSIX would simply be:
class PlatformMutex : public CHeapObj<mtSynchronizer> {
 protected:
  pthread_mutex_t _mutex;

 public:
  PlatformMutex();
  ~PlatformMutex();
  void lock();
  void unlock();
  bool try_lock();
};

os::PlatformMutex::PlatformMutex() {
  int status = pthread_mutex_init(&_mutex, _mutexAttr);
  assert_status(status == 0, status, "mutex_init");
}

os::PlatformMutex::~PlatformMutex() {
  int status = pthread_mutex_destroy(&_mutex);
  assert_status(status == 0, status, "mutex_destroy");
}

inline void os::PlatformMutex::lock() {
  int status = pthread_mutex_lock(&_mutex);
  assert_status(status == 0, status, "mutex_lock");
}

inline void os::PlatformMutex::unlock() {
  int status = pthread_mutex_unlock(&_mutex);
  assert_status(status == 0, status, "mutex_unlock");
}

inline bool os::PlatformMutex::try_lock() {
  int status = pthread_mutex_trylock(&_mutex);
  assert_status(status == 0 || status == EBUSY, status, "mutex_trylock");
  return status == 0;
}
cheers,
Per