RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Doerr, Martin martin.doerr at sap.com
Wed Oct 12 08:53:01 UTC 2016


Thanks everbody for reviewing.

The webrev with additional comments is here:
http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.04/

I have added a TODO to check if the _name[] array should better get replaced by a const char*.
Would you like me to open a new bug for jdk 10 so we have a reminder?

Thank you very much for sponsoring, Coleen.
 
Best regards,
Martin

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com] 
Sent: Mittwoch, 12. Oktober 2016 04:22
To: Doerr, Martin <martin.doerr at sap.com>; Claes Redestad <claes.redestad at oracle.com>; Coleen Phillimore <coleen.phillimore at oracle.com>; daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Looks good to me too! Only comment is do we want to change this comment:

  84 // The default length of monitor name is chosen to be 64 to avoid 
false sharing.
   85 static const int MONITOR_NAME_LEN = 64;

and do we even want to change the value here?

Thanks,
David

On 12/10/2016 2:26 AM, Doerr, Martin wrote:
> Hi all,
>
> I came to the same conclusion regarding inheritance from PaddingEnd.
> Unfortunately, you're also right, Claes, that we should better not use 0 as minimal padding length because some compilers may have trouble with 0 length arrays. I hope 1 is ok as minimal padding length because the new operator does not allocate cache line aligned at the moment. So I don't see any benefit in more padding. (Padding length of 1 byte has the advantage that it may not enlarge the object size if the previous field leaves some space due to its type.)
>
> I believe 2 _LockWord fields on one cache line was basically the problem we wanted to avoid.
>
> Here's a new webrev:
> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.03/
>
> It also enables changing the _name[] field to a pointer or a smaller array. I guess this should better be done in a separate change (jdk10?).
>
> Please take a look.
>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Claes Redestad [mailto:claes.redestad at oracle.com]
> Sent: Dienstag, 11. Oktober 2016 12:05
> To: Coleen Phillimore <coleen.phillimore at oracle.com>; Doerr, Martin <martin.doerr at sap.com>; daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>
> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>
> Hi,
>
> On 2016-10-11 02:03, Coleen Phillimore wrote:
>>
>> Hi,
>>
>> Was the linear allocation in mutex.cpp the cause of the false sharing
>> that you observed?  I think I like this change better than the
>> original, because I've wondered myself why the name string was so
>> long.  So with this, we could make Monitor's smaller if they're
>> embedded in metadata or other structures.
>
> Music to my ears!
>
> I even think most embedded uses would see improvements if _name was
> removed entirely (or "simply" turned into a const char * so that it's
> not copied and embedded into the Monitor/Mutex)
>
>>
>> Thanks,
>> Coleen
>>
>> On 10/10/16 2:00 PM, Doerr, Martin wrote:
>>> Hi Claes,
>>>
>>> thank you very much for your explanations.
>>>
>>> I agree with you that it would be better to pad where the Monitors
>>> are used. It would still fulfill the purpose of this RFE without
>>> disturbing other usages.
>>>
>>> So I could introduce:
>>> class PaddedMonitor : public Monitor {
>>>    enum {
>>>      CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE -
>>> (int)sizeof(Monitor),
>>>      PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>>>    };
>>>    char _padding[PADDING_LEN];
>>> };
>>> and similarly PaddedMutex and replace all of the ones which get
>>> allocated in a linear fashion (mutexLocker.cpp mutex_init()).
>
> Sure!
>
> Some compilers may take issue with cases where PADDING_LEN == 0 (since
> char _padding[0] is technically illegal C++, but works on gcc etc) so
> maybe that special case will have to be (somewhat excessively):
>
> PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING :
> DEFAULT_CACHE_LINE_SIZE
>
> We took a look at if it'd be feasible to express class PaddedMonitor :
> public PaddedEnd<Monitor>, but it appears that'd require variadic
> template arguments (C++11) to get right (since we'd need PaddedEnd to
> transitively publish constructors of Monitor).
>
> Thanks!
>
> /Claes
>
>>>
>>> Would you agree with this change?
>>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>> Sent: Freitag, 7. Oktober 2016 12:35
>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>> <coleen.phillimore at oracle.com>
>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>> DEFAULT_CACHE_LINE_SIZE
>>>
>>> Hi,
>>>
>>> I'm concerned that this might be an easy-but-wrong fix to a complex
>>> problem, and acknowledging that there are already use cases where the
>>> _name field is contra-productive. This change adds complexity that
>>> makes it even less likely such uses will be optimized for in the
>>> future.
>>>
>>> There are Padded* types put in place to deal with these concerns
>>> explicitly rather than implicitly *where it matters*, which allows us
>>> the choice of applying padding or not on a per use-case basis (which
>>> means we can also remove the _name field for those use cases that don't
>>> care about either, which might be most outside of the global lists).
>>>
>>> I am very concerned about false sharing, but I have no data to support
>>> that this change has any measurable benefit in practice: I even did an
>>> experiment years ago now where I turned _name into a pointer to not pad
>>> at all and saw nothing exceeding noise levels on any benchmark.
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>> On 2016-10-07 12:18, Doerr, Martin wrote:
>>>> Hi Claes,
>>>>
>>>> what the change basically does is that the _name[] field gets
>>>> enlarged by 8 bytes on platforms with 128 byte
>>>> DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed
>>>> by the C++ compiler.
>>>> What exactly is your concern about the footprint overhead?
>>>> Are you not concerned about the risk of false sharing?
>>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>> -----Original Message-----
>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>> Sent: Freitag, 7. Oktober 2016 12:00
>>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>>>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>> <coleen.phillimore at oracle.com>
>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>> DEFAULT_CACHE_LINE_SIZE
>>>>
>>>> Hi,
>>>>
>>>> after due consideration I strongly consider this change unacceptable
>>>> since it adds footprint overhead to performance critcial compiler and
>>>> GC code with little to no data to support this won't cause regressions.
>>>>
>>>> Changes to Monitor/Mutex needs to be done with more surgical precision
>>>> than this.
>>>>
>>>> If I do have a veto on the matter, here it is.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>>
>>>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>>>> Hi Dan,
>>>>>
>>>>> thank you very much for reviewing and for investigating the history.
>>>>>
>>>>> It was not intended to make the functions you mentioned public.
>>>>> I've fixed that.
>>>>> I also updated the copyright information.
>>>>>
>>>>> New webrev is here:
>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>>>
>>>>> @Coleen: Please use this one. I have also added reviewer attribution.
>>>>>
>>>>> Thanks and best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>
>>>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> the current implementation of Monitor padding (mutex.cpp) assumes
>>>>>> that cache lines are 64 Bytes. There's a platform dependent define
>>>>>> "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of
>>>>>> padding is to avoid false sharing.
>>>>>>
>>>>>> My proposed change is here:
>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>>>> src/share/vm/runtime/mutex.hpp
>>>>>         Please update the copyright year before pushing.
>>>>>
>>>>>         L172:   // The default length of monitor name is chosen to
>>>>> avoid
>>>>> false sharing.
>>>>>         L173:   enum {
>>>>>         L174:     CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>>>> sizeof(MonitorBase),
>>>>>         L175:     MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>>>> CACHE_LINE_PADDING : 64
>>>>>         L176:   };
>>>>>         L177:   char _name[MONITOR_NAME_LEN];          // Name of
>>>>> mutex
>>>>>
>>>>>             I have to say that I'm not fond of the fact that
>>>>> MONITOR_NAME_LEN
>>>>>             can vary between platforms; I like that it is a minimum
>>>>> of 64 bytes
>>>>>             and is still a constant.
>>>>>
>>>>>             I'm also not happy that the resulting sizeof(Monitor)
>>>>> may not
>>>>> be a multiple
>>>>>             of the DEFAULT_CACHE_LINE_SIZE. However, I have to
>>>>> mitigate
>>>>> that unhappiness
>>>>>             with the fact that sizeof(Monitor) hasn't been a
>>>>> multiple of
>>>>> the cache line
>>>>>             size since at least 2008 and no one complained (that I
>>>>> know of).
>>>>>
>>>>>             So if I was making this change, I would make
>>>>> MONITOR_NAME_LEN
>>>>> 64 bytes
>>>>>             (like it was) and add a pad field that would bring up
>>>>> sizeof(Monitor)
>>>>>             to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of course,
>>>>> Claes
>>>>> would be
>>>>>             unhappy with me and anyone embedding a Monitor into
>>>>> another data
>>>>>             structure would be unhappy with me, but I'm used to
>>>>> that :-)
>>>>>
>>>>>             So what you have is fine, especially for JDK9.
>>>>>
>>>>>         L180:  public:
>>>>>         L181: #ifndef PRODUCT
>>>>>         L182:   debug_only(static bool contains(Monitor * locks,
>>>>> Monitor *
>>>>> lock);)
>>>>>         L183:   debug_only(static Monitor *
>>>>> get_least_ranked_lock(Monitor *
>>>>> locks);)
>>>>>         L184:   debug_only(Monitor *
>>>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>>>         L185: #endif
>>>>>         L186:
>>>>>         L187:   void set_owner_implementation(Thread*
>>>>> owner)                        PRODUCT_RETURN;
>>>>>         L188:   void check_prelock_state     (Thread*
>>>>> thread)                       PRODUCT_RETURN;
>>>>>         L189:   void check_block_state       (Thread* thread)
>>>>>
>>>>>             These were all "protected" before. Now they are "public".
>>>>>             Any particular reason?
>>>>>
>>>>> Thumbs up on the mechanics of this change. I'm interested in the
>>>>> answer to the "protected" versus "public" question, but don't
>>>>> considered that query to be a blocker.
>>>>>
>>>>>
>>>>> The rest of this isn't code review, but some of this caught
>>>>> my attention.
>>>>>
>>>>> src/share/vm/runtime/mutex.hpp
>>>>>
>>>>>         old L84: // The default length of monitor name is chosen to
>>>>> be 64
>>>>> to avoid false sharing.
>>>>>         old L85: static const int MONITOR_NAME_LEN = 64;
>>>>>
>>>>> I had to look up the history of this comment:
>>>>>
>>>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>>>> changeset:   55:2a8eb116ebbe
>>>>> user:        xlu
>>>>> date:        Tue Feb 05 23:21:57 2008 -0800
>>>>> summary:     6610420: Debug VM crashes during monitor lock rank
>>>>> checking
>>>>>
>>>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>>>> --- a/src/share/vm/runtime/mutex.hpp    Thu Jan 31 14:56:50 2008 -0500
>>>>> +++ b/src/share/vm/runtime/mutex.hpp    Tue Feb 05 23:21:57 2008 -0800
>>>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>>>      // *in that order*.  If their implementations change such that
>>>>> these
>>>>>      // assumptions are violated, a whole lot of code will break.
>>>>>
>>>>> +// The default length of monitor name is choosen to be 64 to avoid
>>>>> false sharing.
>>>>> +static const int MONITOR_NAME_LEN = 64;
>>>>> +
>>>>>      class Monitor : public CHeapObj {
>>>>>
>>>>>       public:
>>>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>>>        volatile intptr_t _WaitLock [1] ;      // Protects _WaitSet
>>>>>        ParkEvent * volatile  _WaitSet ;       // LL of ParkEvents
>>>>>        volatile bool     _snuck;              // Used for sneaky
>>>>> locking
>>>>> (evil).
>>>>> -  const char * _name;                    // Name of mutex
>>>>>        int NotifyCount ;                      // diagnostic assist
>>>>> -  double pad [8] ;                       // avoid false sharing
>>>>> +  char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>>>
>>>>>        // Debugging fields for naming, deadlock detection, etc.
>>>>> (some only
>>>>> used in debug mode)
>>>>>      #ifndef PRODUCT
>>>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>>>         int  ILocked () ;
>>>>>
>>>>>       protected:
>>>>> -   static void ClearMonitor (Monitor * m) ;
>>>>> +   static void ClearMonitor (Monitor * m, const char* name = NULL) ;
>>>>>         Monitor() ;
>>>>>
>>>>> So the original code had an 8-double pad for avoiding false sharing.
>>>>> Sounds very much like the old ObjectMonitor padding. I'm sure at the
>>>>> time that Dice determined that 8-double value, the result was to pad
>>>>> the size of Monitor to an even multiple of a particular cache line
>>>>> size.
>>>>>
>>>>> Xiobin changed the 'name' field to be an array so that the name
>>>>> chars could serve double duty as the cache line pad... pun intended.
>>>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>>>> size is a multiple of the cache line size.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>> Please review. If will also need a sponsor.
>>>>>>
>>>>>> Thanks and best regards,
>>>>>> Martin
>>>>>>
>>
>


More information about the hotspot-runtime-dev mailing list