RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Oct 11 17:51:10 UTC 2016


Thumbs up on this version!

Dan



On 10/11/16 11:44 AM, Claes Redestad wrote:
> I am also happy with this, thanks!
>
> /Claes
>
> On 2016-10-11 19:35, Coleen Phillimore wrote:
>>
>> I am fine with this change.  Maybe a one line comment here, something 
>> like:
>>
>> // Using Padded subclasses to prevent false sharing of these global 
>> monitors and mutexes.
>> 172 void mutex_init() {
>>   173   def(tty_lock                     , PaddedMutex  , 
>> event,       true,  Monitor::_safepoint_check_never);      // allow 
>> to lock in VM
>>
>>
>>
>> On 10/11/16 12:26 PM, Doerr, Martin wrote:
>>> Hi all,
>>>
>>> I came to the same conclusion regarding inheritance from PaddingEnd.
>>> Unfortunately, you're also right, Claes, that we should better not 
>>> use 0 as minimal padding length because some compilers may have 
>>> trouble with 0 length arrays. I hope 1 is ok as minimal padding 
>>> length because the new operator does not allocate cache line aligned 
>>> at the moment. So I don't see any benefit in more padding. (Padding 
>>> length of 1 byte has the advantage that it may not enlarge the 
>>> object size if the previous field leaves some space due to its type.)
>>>
>>> I believe 2 _LockWord fields on one cache line was basically the 
>>> problem we wanted to avoid.
>>>
>>> Here's a new webrev:
>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.03/
>>>
>>> It also enables changing the _name[] field to a pointer or a smaller 
>>> array. I guess this should better be done in a separate change 
>>> (jdk10?).
>>>
>>> Please take a look.
>>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>> Sent: Dienstag, 11. Oktober 2016 12:05
>>> To: Coleen Phillimore<coleen.phillimore at oracle.com>; Doerr, 
>>> Martin<martin.doerr at sap.com>;daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net; 
>>> David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>
>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to 
>>> DEFAULT_CACHE_LINE_SIZE
>>>
>>> Hi,
>>>
>>> On 2016-10-11 02:03, Coleen Phillimore wrote:
>>>> Hi,
>>>>
>>>> Was the linear allocation in mutex.cpp the cause of the false sharing
>>>> that you observed?  I think I like this change better than the
>>>> original, because I've wondered myself why the name string was so
>>>> long.  So with this, we could make Monitor's smaller if they're
>>>> embedded in metadata or other structures.
>>> Music to my ears!
>>>
>>> I even think most embedded uses would see improvements if _name was
>>> removed entirely (or "simply" turned into a const char * so that it's
>>> not copied and embedded into the Monitor/Mutex)
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 10/10/16 2:00 PM, Doerr, Martin wrote:
>>>>> Hi Claes,
>>>>>
>>>>> thank you very much for your explanations.
>>>>>
>>>>> I agree with you that it would be better to pad where the Monitors
>>>>> are used. It would still fulfill the purpose of this RFE without
>>>>> disturbing other usages.
>>>>>
>>>>> So I could introduce:
>>>>> class PaddedMonitor : public Monitor {
>>>>>     enum {
>>>>>       CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE -
>>>>> (int)sizeof(Monitor),
>>>>>       PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>>>>>     };
>>>>>     char _padding[PADDING_LEN];
>>>>> };
>>>>> and similarly PaddedMutex and replace all of the ones which get
>>>>> allocated in a linear fashion (mutexLocker.cpp mutex_init()).
>>> Sure!
>>>
>>> Some compilers may take issue with cases where PADDING_LEN == 0 (since
>>> char _padding[0] is technically illegal C++, but works on gcc etc) so
>>> maybe that special case will have to be (somewhat excessively):
>>>
>>> PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING :
>>> DEFAULT_CACHE_LINE_SIZE
>>>
>>> We took a look at if it'd be feasible to express class PaddedMonitor :
>>> public PaddedEnd<Monitor>, but it appears that'd require variadic
>>> template arguments (C++11) to get right (since we'd need PaddedEnd to
>>> transitively publish constructors of Monitor).
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>>>> Would you agree with this change?
>>>>>
>>>>> Thanks and best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>>> Sent: Freitag, 7. Oktober 2016 12:35
>>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>>> daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net;
>>>>> David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>;
>>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>>> <coleen.phillimore at oracle.com>
>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm concerned that this might be an easy-but-wrong fix to a complex
>>>>> problem, and acknowledging that there are already use cases where the
>>>>> _name field is contra-productive. This change adds complexity that
>>>>> makes it even less likely such uses will be optimized for in the
>>>>> future.
>>>>>
>>>>> There are Padded* types put in place to deal with these concerns
>>>>> explicitly rather than implicitly *where it matters*, which allows us
>>>>> the choice of applying padding or not on a per use-case basis (which
>>>>> means we can also remove the _name field for those use cases that 
>>>>> don't
>>>>> care about either, which might be most outside of the global lists).
>>>>>
>>>>> I am very concerned about false sharing, but I have no data to 
>>>>> support
>>>>> that this change has any measurable benefit in practice: I even 
>>>>> did an
>>>>> experiment years ago now where I turned _name into a pointer to 
>>>>> not pad
>>>>> at all and saw nothing exceeding noise levels on any benchmark.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> /Claes
>>>>>
>>>>> On 2016-10-07 12:18, Doerr, Martin wrote:
>>>>>> Hi Claes,
>>>>>>
>>>>>> what the change basically does is that the _name[] field gets
>>>>>> enlarged by 8 bytes on platforms with 128 byte
>>>>>> DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed
>>>>>> by the C++ compiler.
>>>>>> What exactly is your concern about the footprint overhead?
>>>>>> Are you not concerned about the risk of false sharing?
>>>>>>
>>>>>> Best regards,
>>>>>> Martin
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>>>> Sent: Freitag, 7. Oktober 2016 12:00
>>>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>>>> daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net;
>>>>>> David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>;
>>>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>>>> <coleen.phillimore at oracle.com>
>>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> after due consideration I strongly consider this change unacceptable
>>>>>> since it adds footprint overhead to performance critcial compiler 
>>>>>> and
>>>>>> GC code with little to no data to support this won't cause 
>>>>>> regressions.
>>>>>>
>>>>>> Changes to Monitor/Mutex needs to be done with more surgical 
>>>>>> precision
>>>>>> than this.
>>>>>>
>>>>>> If I do have a veto on the matter, here it is.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> /Claes
>>>>>>
>>>>>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> thank you very much for reviewing and for investigating the 
>>>>>>> history.
>>>>>>>
>>>>>>> It was not intended to make the functions you mentioned public.
>>>>>>> I've fixed that.
>>>>>>> I also updated the copyright information.
>>>>>>>
>>>>>>> New webrev is here:
>>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>>>>>
>>>>>>> @Coleen: Please use this one. I have also added reviewer 
>>>>>>> attribution.
>>>>>>>
>>>>>>> Thanks and best regards,
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>>>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>>>
>>>>>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> the current implementation of Monitor padding (mutex.cpp) assumes
>>>>>>>> that cache lines are 64 Bytes. There's a platform dependent define
>>>>>>>> "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of
>>>>>>>> padding is to avoid false sharing.
>>>>>>>>
>>>>>>>> My proposed change is here:
>>>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/ 
>>>>>>>>
>>>>>>> src/share/vm/runtime/mutex.hpp
>>>>>>>          Please update the copyright year before pushing.
>>>>>>>
>>>>>>>          L172:   // The default length of monitor name is chosen to
>>>>>>> avoid
>>>>>>> false sharing.
>>>>>>>          L173:   enum {
>>>>>>>          L174:     CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>>>>>> sizeof(MonitorBase),
>>>>>>>          L175:     MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>>>>>> CACHE_LINE_PADDING : 64
>>>>>>>          L176:   };
>>>>>>>          L177:   char _name[MONITOR_NAME_LEN];          // Name of
>>>>>>> mutex
>>>>>>>
>>>>>>>              I have to say that I'm not fond of the fact that
>>>>>>> MONITOR_NAME_LEN
>>>>>>>              can vary between platforms; I like that it is a 
>>>>>>> minimum
>>>>>>> of 64 bytes
>>>>>>>              and is still a constant.
>>>>>>>
>>>>>>>              I'm also not happy that the resulting sizeof(Monitor)
>>>>>>> may not
>>>>>>> be a multiple
>>>>>>>              of the DEFAULT_CACHE_LINE_SIZE. However, I have to
>>>>>>> mitigate
>>>>>>> that unhappiness
>>>>>>>              with the fact that sizeof(Monitor) hasn't been a
>>>>>>> multiple of
>>>>>>> the cache line
>>>>>>>              size since at least 2008 and no one complained (that I
>>>>>>> know of).
>>>>>>>
>>>>>>>              So if I was making this change, I would make
>>>>>>> MONITOR_NAME_LEN
>>>>>>> 64 bytes
>>>>>>>              (like it was) and add a pad field that would bring up
>>>>>>> sizeof(Monitor)
>>>>>>>              to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of 
>>>>>>> course,
>>>>>>> Claes
>>>>>>> would be
>>>>>>>              unhappy with me and anyone embedding a Monitor into
>>>>>>> another data
>>>>>>>              structure would be unhappy with me, but I'm used to
>>>>>>> that :-)
>>>>>>>
>>>>>>>              So what you have is fine, especially for JDK9.
>>>>>>>
>>>>>>>          L180:  public:
>>>>>>>          L181: #ifndef PRODUCT
>>>>>>>          L182:   debug_only(static bool contains(Monitor * locks,
>>>>>>> Monitor *
>>>>>>> lock);)
>>>>>>>          L183:   debug_only(static Monitor *
>>>>>>> get_least_ranked_lock(Monitor *
>>>>>>> locks);)
>>>>>>>          L184:   debug_only(Monitor *
>>>>>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>>>>>          L185: #endif
>>>>>>>          L186:
>>>>>>>          L187:   void set_owner_implementation(Thread*
>>>>>>> owner)                        PRODUCT_RETURN;
>>>>>>>          L188:   void check_prelock_state     (Thread*
>>>>>>> thread)                       PRODUCT_RETURN;
>>>>>>>          L189:   void check_block_state       (Thread* thread)
>>>>>>>
>>>>>>>              These were all "protected" before. Now they are 
>>>>>>> "public".
>>>>>>>              Any particular reason?
>>>>>>>
>>>>>>> Thumbs up on the mechanics of this change. I'm interested in the
>>>>>>> answer to the "protected" versus "public" question, but don't
>>>>>>> considered that query to be a blocker.
>>>>>>>
>>>>>>>
>>>>>>> The rest of this isn't code review, but some of this caught
>>>>>>> my attention.
>>>>>>>
>>>>>>> src/share/vm/runtime/mutex.hpp
>>>>>>>
>>>>>>>          old L84: // The default length of monitor name is 
>>>>>>> chosen to
>>>>>>> be 64
>>>>>>> to avoid false sharing.
>>>>>>>          old L85: static const int MONITOR_NAME_LEN = 64;
>>>>>>>
>>>>>>> I had to look up the history of this comment:
>>>>>>>
>>>>>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>>>>>> changeset:   55:2a8eb116ebbe
>>>>>>> user:        xlu
>>>>>>> date:        Tue Feb 05 23:21:57 2008 -0800
>>>>>>> summary:     6610420: Debug VM crashes during monitor lock rank
>>>>>>> checking
>>>>>>>
>>>>>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>>>>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>>>>>> --- a/src/share/vm/runtime/mutex.hpp    Thu Jan 31 14:56:50 2008 
>>>>>>> -0500
>>>>>>> +++ b/src/share/vm/runtime/mutex.hpp    Tue Feb 05 23:21:57 2008 
>>>>>>> -0800
>>>>>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>>>>>       // *in that order*.  If their implementations change such 
>>>>>>> that
>>>>>>> these
>>>>>>>       // assumptions are violated, a whole lot of code will break.
>>>>>>>
>>>>>>> +// The default length of monitor name is choosen to be 64 to avoid
>>>>>>> false sharing.
>>>>>>> +static const int MONITOR_NAME_LEN = 64;
>>>>>>> +
>>>>>>>       class Monitor : public CHeapObj {
>>>>>>>
>>>>>>>        public:
>>>>>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>>>>>         volatile intptr_t _WaitLock [1] ;      // Protects _WaitSet
>>>>>>>         ParkEvent * volatile  _WaitSet ;       // LL of ParkEvents
>>>>>>>         volatile bool     _snuck;              // Used for sneaky
>>>>>>> locking
>>>>>>> (evil).
>>>>>>> -  const char * _name;                    // Name of mutex
>>>>>>>         int NotifyCount ;                      // diagnostic assist
>>>>>>> -  double pad [8] ;                       // avoid false sharing
>>>>>>> +  char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>>>>>
>>>>>>>         // Debugging fields for naming, deadlock detection, etc.
>>>>>>> (some only
>>>>>>> used in debug mode)
>>>>>>>       #ifndef PRODUCT
>>>>>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>>>>>          int  ILocked () ;
>>>>>>>
>>>>>>>        protected:
>>>>>>> -   static void ClearMonitor (Monitor * m) ;
>>>>>>> +   static void ClearMonitor (Monitor * m, const char* name = 
>>>>>>> NULL) ;
>>>>>>>          Monitor() ;
>>>>>>>
>>>>>>> So the original code had an 8-double pad for avoiding false 
>>>>>>> sharing.
>>>>>>> Sounds very much like the old ObjectMonitor padding. I'm sure at 
>>>>>>> the
>>>>>>> time that Dice determined that 8-double value, the result was to 
>>>>>>> pad
>>>>>>> the size of Monitor to an even multiple of a particular cache line
>>>>>>> size.
>>>>>>>
>>>>>>> Xiobin changed the 'name' field to be an array so that the name
>>>>>>> chars could serve double duty as the cache line pad... pun 
>>>>>>> intended.
>>>>>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>>>>>> size is a multiple of the cache line size.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>> Please review. If will also need a sponsor.
>>>>>>>>
>>>>>>>> Thanks and best regards,
>>>>>>>> Martin
>>>>>>>>
>>



More information about the hotspot-runtime-dev mailing list