RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Coleen Phillimore coleen.phillimore at oracle.com
Tue Oct 11 00:03:20 UTC 2016


Hi,

Was the linear allocation in mutex.cpp the cause of the false sharing 
that you observed?  I think I like this change better than the original, 
because I've wondered myself why the name string was so long.  So with 
this, we could make Monitor's smaller if they're embedded in metadata or 
other structures.

Thanks,
Coleen

On 10/10/16 2:00 PM, Doerr, Martin wrote:
> Hi Claes,
>
> thank you very much for your explanations.
>
> I agree with you that it would be better to pad where the Monitors are used. It would still fulfill the purpose of this RFE without disturbing other usages.
>
> So I could introduce:
> class PaddedMonitor : public Monitor {
>    enum {
>      CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE - (int)sizeof(Monitor),
>      PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>    };
>    char _padding[PADDING_LEN];
> };
> and similarly PaddedMutex and replace all of the ones which get allocated in a linear fashion (mutexLocker.cpp mutex_init()).
>
> Would you agree with this change?
>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Claes Redestad [mailto:claes.redestad at oracle.com]
> Sent: Freitag, 7. Oktober 2016 12:35
> To: Doerr, Martin <martin.doerr at sap.com>; daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>; Coleen Phillimore (coleen.phillimore at oracle.com) <coleen.phillimore at oracle.com>
> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>
> Hi,
>
> I'm concerned that this might be an easy-but-wrong fix to a complex
> problem, and acknowledging that there are already use cases where the
> _name field is contra-productive. This change adds complexity that
> makes it even less likely such uses will be optimized for in the
> future.
>
> There are Padded* types put in place to deal with these concerns
> explicitly rather than implicitly *where it matters*, which allows us
> the choice of applying padding or not on a per use-case basis (which
> means we can also remove the _name field for those use cases that don't
> care about either, which might be most outside of the global lists).
>
> I am very concerned about false sharing, but I have no data to support
> that this change has any measurable benefit in practice: I even did an
> experiment years ago now where I turned _name into a pointer to not pad
> at all and saw nothing exceeding noise levels on any benchmark.
>
> Thanks!
>
> /Claes
>
> On 2016-10-07 12:18, Doerr, Martin wrote:
>> Hi Claes,
>>
>> what the change basically does is that the _name[] field gets enlarged by 8 bytes on platforms with 128 byte DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed by the C++ compiler.
>> What exactly is your concern about the footprint overhead?
>> Are you not concerned about the risk of false sharing?
>>
>> Best regards,
>> Martin
>>
>> -----Original Message-----
>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>> Sent: Freitag, 7. Oktober 2016 12:00
>> To: Doerr, Martin <martin.doerr at sap.com>; daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net; David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>; Coleen Phillimore (coleen.phillimore at oracle.com) <coleen.phillimore at oracle.com>
>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>>
>> Hi,
>>
>> after due consideration I strongly consider this change unacceptable
>> since it adds footprint overhead to performance critcial compiler and
>> GC code with little to no data to support this won't cause regressions.
>>
>> Changes to Monitor/Mutex needs to be done with more surgical precision
>> than this.
>>
>> If I do have a veto on the matter, here it is.
>>
>> Thanks!
>>
>> /Claes
>>
>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>> Hi Dan,
>>>
>>> thank you very much for reviewing and for investigating the history.
>>>
>>> It was not intended to make the functions you mentioned public. I've fixed that.
>>> I also updated the copyright information.
>>>
>>> New webrev is here:
>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>
>>> @Coleen: Please use this one. I have also added reviewer attribution.
>>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>>>
>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>> Hi,
>>>>
>>>> the current implementation of Monitor padding (mutex.cpp) assumes that cache lines are 64 Bytes. There's a platform dependent define "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of padding is to avoid false sharing.
>>>>
>>>> My proposed change is here:
>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>> src/share/vm/runtime/mutex.hpp
>>>         Please update the copyright year before pushing.
>>>
>>>         L172:   // The default length of monitor name is chosen to avoid
>>> false sharing.
>>>         L173:   enum {
>>>         L174:     CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>> sizeof(MonitorBase),
>>>         L175:     MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>> CACHE_LINE_PADDING : 64
>>>         L176:   };
>>>         L177:   char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>
>>>             I have to say that I'm not fond of the fact that MONITOR_NAME_LEN
>>>             can vary between platforms; I like that it is a minimum of 64 bytes
>>>             and is still a constant.
>>>
>>>             I'm also not happy that the resulting sizeof(Monitor) may not
>>> be a multiple
>>>             of the DEFAULT_CACHE_LINE_SIZE. However, I have to mitigate
>>> that unhappiness
>>>             with the fact that sizeof(Monitor) hasn't been a multiple of
>>> the cache line
>>>             size since at least 2008 and no one complained (that I know of).
>>>
>>>             So if I was making this change, I would make MONITOR_NAME_LEN
>>> 64 bytes
>>>             (like it was) and add a pad field that would bring up
>>> sizeof(Monitor)
>>>             to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of course, Claes
>>> would be
>>>             unhappy with me and anyone embedding a Monitor into another data
>>>             structure would be unhappy with me, but I'm used to that :-)
>>>
>>>             So what you have is fine, especially for JDK9.
>>>
>>>         L180:  public:
>>>         L181: #ifndef PRODUCT
>>>         L182:   debug_only(static bool contains(Monitor * locks, Monitor *
>>> lock);)
>>>         L183:   debug_only(static Monitor * get_least_ranked_lock(Monitor *
>>> locks);)
>>>         L184:   debug_only(Monitor *
>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>         L185: #endif
>>>         L186:
>>>         L187:   void set_owner_implementation(Thread*
>>> owner)                        PRODUCT_RETURN;
>>>         L188:   void check_prelock_state     (Thread*
>>> thread)                       PRODUCT_RETURN;
>>>         L189:   void check_block_state       (Thread* thread)
>>>
>>>             These were all "protected" before. Now they are "public".
>>>             Any particular reason?
>>>
>>> Thumbs up on the mechanics of this change. I'm interested in the
>>> answer to the "protected" versus "public" question, but don't
>>> considered that query to be a blocker.
>>>
>>>
>>> The rest of this isn't code review, but some of this caught
>>> my attention.
>>>
>>> src/share/vm/runtime/mutex.hpp
>>>
>>>         old L84: // The default length of monitor name is chosen to be 64
>>> to avoid false sharing.
>>>         old L85: static const int MONITOR_NAME_LEN = 64;
>>>
>>> I had to look up the history of this comment:
>>>
>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>> changeset:   55:2a8eb116ebbe
>>> user:        xlu
>>> date:        Tue Feb 05 23:21:57 2008 -0800
>>> summary:     6610420: Debug VM crashes during monitor lock rank checking
>>>
>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>> --- a/src/share/vm/runtime/mutex.hpp    Thu Jan 31 14:56:50 2008 -0500
>>> +++ b/src/share/vm/runtime/mutex.hpp    Tue Feb 05 23:21:57 2008 -0800
>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>      // *in that order*.  If their implementations change such that these
>>>      // assumptions are violated, a whole lot of code will break.
>>>
>>> +// The default length of monitor name is choosen to be 64 to avoid
>>> false sharing.
>>> +static const int MONITOR_NAME_LEN = 64;
>>> +
>>>      class Monitor : public CHeapObj {
>>>
>>>       public:
>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>        volatile intptr_t _WaitLock [1] ;      // Protects _WaitSet
>>>        ParkEvent * volatile  _WaitSet ;       // LL of ParkEvents
>>>        volatile bool     _snuck;              // Used for sneaky locking
>>> (evil).
>>> -  const char * _name;                    // Name of mutex
>>>        int NotifyCount ;                      // diagnostic assist
>>> -  double pad [8] ;                       // avoid false sharing
>>> +  char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>
>>>        // Debugging fields for naming, deadlock detection, etc. (some only
>>> used in debug mode)
>>>      #ifndef PRODUCT
>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>         int  ILocked () ;
>>>
>>>       protected:
>>> -   static void ClearMonitor (Monitor * m) ;
>>> +   static void ClearMonitor (Monitor * m, const char* name = NULL) ;
>>>         Monitor() ;
>>>
>>> So the original code had an 8-double pad for avoiding false sharing.
>>> Sounds very much like the old ObjectMonitor padding. I'm sure at the
>>> time that Dice determined that 8-double value, the result was to pad
>>> the size of Monitor to an even multiple of a particular cache line
>>> size.
>>>
>>> Xiobin changed the 'name' field to be an array so that the name
>>> chars could serve double duty as the cache line pad... pun intended.
>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>> size is a multiple of the cache line size.
>>>
>>> Dan
>>>
>>>
>>>> Please review. If will also need a sponsor.
>>>>
>>>> Thanks and best regards,
>>>> Martin
>>>>



More information about the hotspot-runtime-dev mailing list