RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Coleen Phillimore coleen.phillimore at oracle.com
Tue Oct 11 17:35:11 UTC 2016


I am fine with this change.  Maybe a one line comment here, something like:

// Using Padded subclasses to prevent false sharing of these global monitors and mutexes.
172 void mutex_init() {
  173   def(tty_lock                     , PaddedMutex  , event,       true,  Monitor::_safepoint_check_never);      // allow to lock in VM



On 10/11/16 12:26 PM, Doerr, Martin wrote:
> Hi all,
>
> I came to the same conclusion regarding inheritance from PaddingEnd.
> Unfortunately, you're also right, Claes, that we should better not use 0 as minimal padding length because some compilers may have trouble with 0 length arrays. I hope 1 is ok as minimal padding length because the new operator does not allocate cache line aligned at the moment. So I don't see any benefit in more padding. (Padding length of 1 byte has the advantage that it may not enlarge the object size if the previous field leaves some space due to its type.)
>
> I believe 2 _LockWord fields on one cache line was basically the problem we wanted to avoid.
>
> Here's a new webrev:
> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.03/
>
> It also enables changing the _name[] field to a pointer or a smaller array. I guess this should better be done in a separate change (jdk10?).
>
> Please take a look.
>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Claes Redestad [mailto:claes.redestad at oracle.com]
> Sent: Dienstag, 11. Oktober 2016 12:05
> To: Coleen Phillimore <coleen.phillimore at oracle.com>; Doerr, Martin <martin.doerr at sap.com>; daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>
> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>
> Hi,
>
> On 2016-10-11 02:03, Coleen Phillimore wrote:
>> Hi,
>>
>> Was the linear allocation in mutex.cpp the cause of the false sharing
>> that you observed?  I think I like this change better than the
>> original, because I've wondered myself why the name string was so
>> long.  So with this, we could make Monitor's smaller if they're
>> embedded in metadata or other structures.
> Music to my ears!
>
> I even think most embedded uses would see improvements if _name was
> removed entirely (or "simply" turned into a const char * so that it's
> not copied and embedded into the Monitor/Mutex)
>
>> Thanks,
>> Coleen
>>
>> On 10/10/16 2:00 PM, Doerr, Martin wrote:
>>> Hi Claes,
>>>
>>> thank you very much for your explanations.
>>>
>>> I agree with you that it would be better to pad where the Monitors
>>> are used. It would still fulfill the purpose of this RFE without
>>> disturbing other usages.
>>>
>>> So I could introduce:
>>> class PaddedMonitor : public Monitor {
>>>     enum {
>>>       CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE -
>>> (int)sizeof(Monitor),
>>>       PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>>>     };
>>>     char _padding[PADDING_LEN];
>>> };
>>> and similarly PaddedMutex and replace all of the ones which get
>>> allocated in a linear fashion (mutexLocker.cpp mutex_init()).
> Sure!
>
> Some compilers may take issue with cases where PADDING_LEN == 0 (since
> char _padding[0] is technically illegal C++, but works on gcc etc) so
> maybe that special case will have to be (somewhat excessively):
>
> PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING :
> DEFAULT_CACHE_LINE_SIZE
>
> We took a look at if it'd be feasible to express class PaddedMonitor :
> public PaddedEnd<Monitor>, but it appears that'd require variadic
> template arguments (C++11) to get right (since we'd need PaddedEnd to
> transitively publish constructors of Monitor).
>
> Thanks!
>
> /Claes
>
>>> Would you agree with this change?
>>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>> Sent: Freitag, 7. Oktober 2016 12:35
>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>> <coleen.phillimore at oracle.com>
>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>> DEFAULT_CACHE_LINE_SIZE
>>>
>>> Hi,
>>>
>>> I'm concerned that this might be an easy-but-wrong fix to a complex
>>> problem, and acknowledging that there are already use cases where the
>>> _name field is contra-productive. This change adds complexity that
>>> makes it even less likely such uses will be optimized for in the
>>> future.
>>>
>>> There are Padded* types put in place to deal with these concerns
>>> explicitly rather than implicitly *where it matters*, which allows us
>>> the choice of applying padding or not on a per use-case basis (which
>>> means we can also remove the _name field for those use cases that don't
>>> care about either, which might be most outside of the global lists).
>>>
>>> I am very concerned about false sharing, but I have no data to support
>>> that this change has any measurable benefit in practice: I even did an
>>> experiment years ago now where I turned _name into a pointer to not pad
>>> at all and saw nothing exceeding noise levels on any benchmark.
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>> On 2016-10-07 12:18, Doerr, Martin wrote:
>>>> Hi Claes,
>>>>
>>>> what the change basically does is that the _name[] field gets
>>>> enlarged by 8 bytes on platforms with 128 byte
>>>> DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed
>>>> by the C++ compiler.
>>>> What exactly is your concern about the footprint overhead?
>>>> Are you not concerned about the risk of false sharing?
>>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>> -----Original Message-----
>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>> Sent: Freitag, 7. Oktober 2016 12:00
>>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>>>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>> <coleen.phillimore at oracle.com>
>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>> DEFAULT_CACHE_LINE_SIZE
>>>>
>>>> Hi,
>>>>
>>>> after due consideration I strongly consider this change unacceptable
>>>> since it adds footprint overhead to performance critcial compiler and
>>>> GC code with little to no data to support this won't cause regressions.
>>>>
>>>> Changes to Monitor/Mutex needs to be done with more surgical precision
>>>> than this.
>>>>
>>>> If I do have a veto on the matter, here it is.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>>
>>>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>>>> Hi Dan,
>>>>>
>>>>> thank you very much for reviewing and for investigating the history.
>>>>>
>>>>> It was not intended to make the functions you mentioned public.
>>>>> I've fixed that.
>>>>> I also updated the copyright information.
>>>>>
>>>>> New webrev is here:
>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>>>
>>>>> @Coleen: Please use this one. I have also added reviewer attribution.
>>>>>
>>>>> Thanks and best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>
>>>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> the current implementation of Monitor padding (mutex.cpp) assumes
>>>>>> that cache lines are 64 Bytes. There's a platform dependent define
>>>>>> "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of
>>>>>> padding is to avoid false sharing.
>>>>>>
>>>>>> My proposed change is here:
>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>>>> src/share/vm/runtime/mutex.hpp
>>>>>          Please update the copyright year before pushing.
>>>>>
>>>>>          L172:   // The default length of monitor name is chosen to
>>>>> avoid
>>>>> false sharing.
>>>>>          L173:   enum {
>>>>>          L174:     CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>>>> sizeof(MonitorBase),
>>>>>          L175:     MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>>>> CACHE_LINE_PADDING : 64
>>>>>          L176:   };
>>>>>          L177:   char _name[MONITOR_NAME_LEN];          // Name of
>>>>> mutex
>>>>>
>>>>>              I have to say that I'm not fond of the fact that
>>>>> MONITOR_NAME_LEN
>>>>>              can vary between platforms; I like that it is a minimum
>>>>> of 64 bytes
>>>>>              and is still a constant.
>>>>>
>>>>>              I'm also not happy that the resulting sizeof(Monitor)
>>>>> may not
>>>>> be a multiple
>>>>>              of the DEFAULT_CACHE_LINE_SIZE. However, I have to
>>>>> mitigate
>>>>> that unhappiness
>>>>>              with the fact that sizeof(Monitor) hasn't been a
>>>>> multiple of
>>>>> the cache line
>>>>>              size since at least 2008 and no one complained (that I
>>>>> know of).
>>>>>
>>>>>              So if I was making this change, I would make
>>>>> MONITOR_NAME_LEN
>>>>> 64 bytes
>>>>>              (like it was) and add a pad field that would bring up
>>>>> sizeof(Monitor)
>>>>>              to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of course,
>>>>> Claes
>>>>> would be
>>>>>              unhappy with me and anyone embedding a Monitor into
>>>>> another data
>>>>>              structure would be unhappy with me, but I'm used to
>>>>> that :-)
>>>>>
>>>>>              So what you have is fine, especially for JDK9.
>>>>>
>>>>>          L180:  public:
>>>>>          L181: #ifndef PRODUCT
>>>>>          L182:   debug_only(static bool contains(Monitor * locks,
>>>>> Monitor *
>>>>> lock);)
>>>>>          L183:   debug_only(static Monitor *
>>>>> get_least_ranked_lock(Monitor *
>>>>> locks);)
>>>>>          L184:   debug_only(Monitor *
>>>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>>>          L185: #endif
>>>>>          L186:
>>>>>          L187:   void set_owner_implementation(Thread*
>>>>> owner)                        PRODUCT_RETURN;
>>>>>          L188:   void check_prelock_state     (Thread*
>>>>> thread)                       PRODUCT_RETURN;
>>>>>          L189:   void check_block_state       (Thread* thread)
>>>>>
>>>>>              These were all "protected" before. Now they are "public".
>>>>>              Any particular reason?
>>>>>
>>>>> Thumbs up on the mechanics of this change. I'm interested in the
>>>>> answer to the "protected" versus "public" question, but don't
>>>>> considered that query to be a blocker.
>>>>>
>>>>>
>>>>> The rest of this isn't code review, but some of this caught
>>>>> my attention.
>>>>>
>>>>> src/share/vm/runtime/mutex.hpp
>>>>>
>>>>>          old L84: // The default length of monitor name is chosen to
>>>>> be 64
>>>>> to avoid false sharing.
>>>>>          old L85: static const int MONITOR_NAME_LEN = 64;
>>>>>
>>>>> I had to look up the history of this comment:
>>>>>
>>>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>>>> changeset:   55:2a8eb116ebbe
>>>>> user:        xlu
>>>>> date:        Tue Feb 05 23:21:57 2008 -0800
>>>>> summary:     6610420: Debug VM crashes during monitor lock rank
>>>>> checking
>>>>>
>>>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>>>> --- a/src/share/vm/runtime/mutex.hpp    Thu Jan 31 14:56:50 2008 -0500
>>>>> +++ b/src/share/vm/runtime/mutex.hpp    Tue Feb 05 23:21:57 2008 -0800
>>>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>>>       // *in that order*.  If their implementations change such that
>>>>> these
>>>>>       // assumptions are violated, a whole lot of code will break.
>>>>>
>>>>> +// The default length of monitor name is choosen to be 64 to avoid
>>>>> false sharing.
>>>>> +static const int MONITOR_NAME_LEN = 64;
>>>>> +
>>>>>       class Monitor : public CHeapObj {
>>>>>
>>>>>        public:
>>>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>>>         volatile intptr_t _WaitLock [1] ;      // Protects _WaitSet
>>>>>         ParkEvent * volatile  _WaitSet ;       // LL of ParkEvents
>>>>>         volatile bool     _snuck;              // Used for sneaky
>>>>> locking
>>>>> (evil).
>>>>> -  const char * _name;                    // Name of mutex
>>>>>         int NotifyCount ;                      // diagnostic assist
>>>>> -  double pad [8] ;                       // avoid false sharing
>>>>> +  char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>>>
>>>>>         // Debugging fields for naming, deadlock detection, etc.
>>>>> (some only
>>>>> used in debug mode)
>>>>>       #ifndef PRODUCT
>>>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>>>          int  ILocked () ;
>>>>>
>>>>>        protected:
>>>>> -   static void ClearMonitor (Monitor * m) ;
>>>>> +   static void ClearMonitor (Monitor * m, const char* name = NULL) ;
>>>>>          Monitor() ;
>>>>>
>>>>> So the original code had an 8-double pad for avoiding false sharing.
>>>>> Sounds very much like the old ObjectMonitor padding. I'm sure at the
>>>>> time that Dice determined that 8-double value, the result was to pad
>>>>> the size of Monitor to an even multiple of a particular cache line
>>>>> size.
>>>>>
>>>>> Xiobin changed the 'name' field to be an array so that the name
>>>>> chars could serve double duty as the cache line pad... pun intended.
>>>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>>>> size is a multiple of the cache line size.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>> Please review. If will also need a sponsor.
>>>>>>
>>>>>> Thanks and best regards,
>>>>>> Martin
>>>>>>



More information about the hotspot-runtime-dev mailing list