RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Claes Redestad claes.redestad at oracle.com
Tue Oct 11 17:44:05 UTC 2016


I am also happy with this, thanks!

/Claes

On 2016-10-11 19:35, Coleen Phillimore wrote:
>
> I am fine with this change.  Maybe a one line comment here, something like:
>
> // Using Padded subclasses to prevent false sharing of these global monitors and mutexes.
> 172 void mutex_init() {
>   173   def(tty_lock                     , PaddedMutex  , event,       true,  Monitor::_safepoint_check_never);      // allow to lock in VM
>
>
>
> On 10/11/16 12:26 PM, Doerr, Martin wrote:
>> Hi all,
>>
>> I came to the same conclusion regarding inheritance from PaddingEnd.
>> Unfortunately, you're also right, Claes, that we should better not use 0 as minimal padding length because some compilers may have trouble with 0 length arrays. I hope 1 is ok as minimal padding length because the new operator does not allocate cache line aligned at the moment. So I don't see any benefit in more padding. (Padding length of 1 byte has the advantage that it may not enlarge the object size if the previous field leaves some space due to its type.)
>>
>> I believe 2 _LockWord fields on one cache line was basically the problem we wanted to avoid.
>>
>> Here's a new webrev:
>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.03/
>>
>> It also enables changing the _name[] field to a pointer or a smaller array. I guess this should better be done in a separate change (jdk10?).
>>
>> Please take a look.
>>
>> Thanks and best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>> Sent: Dienstag, 11. Oktober 2016 12:05
>> To: Coleen Phillimore<coleen.phillimore at oracle.com>; Doerr, Martin<martin.doerr at sap.com>;daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net; David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>
>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>>
>> Hi,
>>
>> On 2016-10-11 02:03, Coleen Phillimore wrote:
>>> Hi,
>>>
>>> Was the linear allocation in mutex.cpp the cause of the false sharing
>>> that you observed?  I think I like this change better than the
>>> original, because I've wondered myself why the name string was so
>>> long.  So with this, we could make Monitor's smaller if they're
>>> embedded in metadata or other structures.
>> Music to my ears!
>>
>> I even think most embedded uses would see improvements if _name was
>> removed entirely (or "simply" turned into a const char * so that it's
>> not copied and embedded into the Monitor/Mutex)
>>
>>> Thanks,
>>> Coleen
>>>
>>> On 10/10/16 2:00 PM, Doerr, Martin wrote:
>>>> Hi Claes,
>>>>
>>>> thank you very much for your explanations.
>>>>
>>>> I agree with you that it would be better to pad where the Monitors
>>>> are used. It would still fulfill the purpose of this RFE without
>>>> disturbing other usages.
>>>>
>>>> So I could introduce:
>>>> class PaddedMonitor : public Monitor {
>>>>     enum {
>>>>       CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE -
>>>> (int)sizeof(Monitor),
>>>>       PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>>>>     };
>>>>     char _padding[PADDING_LEN];
>>>> };
>>>> and similarly PaddedMutex and replace all of the ones which get
>>>> allocated in a linear fashion (mutexLocker.cpp mutex_init()).
>> Sure!
>>
>> Some compilers may take issue with cases where PADDING_LEN == 0 (since
>> char _padding[0] is technically illegal C++, but works on gcc etc) so
>> maybe that special case will have to be (somewhat excessively):
>>
>> PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING :
>> DEFAULT_CACHE_LINE_SIZE
>>
>> We took a look at if it'd be feasible to express class PaddedMonitor :
>> public PaddedEnd<Monitor>, but it appears that'd require variadic
>> template arguments (C++11) to get right (since we'd need PaddedEnd to
>> transitively publish constructors of Monitor).
>>
>> Thanks!
>>
>> /Claes
>>
>>>> Would you agree with this change?
>>>>
>>>> Thanks and best regards,
>>>> Martin
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>> Sent: Freitag, 7. Oktober 2016 12:35
>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>> daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net;
>>>> David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>;
>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>> <coleen.phillimore at oracle.com>
>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>> DEFAULT_CACHE_LINE_SIZE
>>>>
>>>> Hi,
>>>>
>>>> I'm concerned that this might be an easy-but-wrong fix to a complex
>>>> problem, and acknowledging that there are already use cases where the
>>>> _name field is contra-productive. This change adds complexity that
>>>> makes it even less likely such uses will be optimized for in the
>>>> future.
>>>>
>>>> There are Padded* types put in place to deal with these concerns
>>>> explicitly rather than implicitly *where it matters*, which allows us
>>>> the choice of applying padding or not on a per use-case basis (which
>>>> means we can also remove the _name field for those use cases that don't
>>>> care about either, which might be most outside of the global lists).
>>>>
>>>> I am very concerned about false sharing, but I have no data to support
>>>> that this change has any measurable benefit in practice: I even did an
>>>> experiment years ago now where I turned _name into a pointer to not pad
>>>> at all and saw nothing exceeding noise levels on any benchmark.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>>
>>>> On 2016-10-07 12:18, Doerr, Martin wrote:
>>>>> Hi Claes,
>>>>>
>>>>> what the change basically does is that the _name[] field gets
>>>>> enlarged by 8 bytes on platforms with 128 byte
>>>>> DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed
>>>>> by the C++ compiler.
>>>>> What exactly is your concern about the footprint overhead?
>>>>> Are you not concerned about the risk of false sharing?
>>>>>
>>>>> Best regards,
>>>>> Martin
>>>>>
>>>>> -----Original Message-----
>>>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>>>> Sent: Freitag, 7. Oktober 2016 12:00
>>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>>> daniel.daugherty at oracle.com;hotspot-runtime-dev at openjdk.java.net;
>>>>> David Holmes (david.holmes at oracle.com)<david.holmes at oracle.com>;
>>>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>>>> <coleen.phillimore at oracle.com>
>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>
>>>>> Hi,
>>>>>
>>>>> after due consideration I strongly consider this change unacceptable
>>>>> since it adds footprint overhead to performance critcial compiler and
>>>>> GC code with little to no data to support this won't cause regressions.
>>>>>
>>>>> Changes to Monitor/Mutex needs to be done with more surgical precision
>>>>> than this.
>>>>>
>>>>> If I do have a veto on the matter, here it is.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> /Claes
>>>>>
>>>>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> thank you very much for reviewing and for investigating the history.
>>>>>>
>>>>>> It was not intended to make the functions you mentioned public.
>>>>>> I've fixed that.
>>>>>> I also updated the copyright information.
>>>>>>
>>>>>> New webrev is here:
>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>>>>
>>>>>> @Coleen: Please use this one. I have also added reviewer attribution.
>>>>>>
>>>>>> Thanks and best regards,
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>>>>> To: Doerr, Martin<martin.doerr at sap.com>;
>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>>>> DEFAULT_CACHE_LINE_SIZE
>>>>>>
>>>>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> the current implementation of Monitor padding (mutex.cpp) assumes
>>>>>>> that cache lines are 64 Bytes. There's a platform dependent define
>>>>>>> "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of
>>>>>>> padding is to avoid false sharing.
>>>>>>>
>>>>>>> My proposed change is here:
>>>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>>>>> src/share/vm/runtime/mutex.hpp
>>>>>>          Please update the copyright year before pushing.
>>>>>>
>>>>>>          L172:   // The default length of monitor name is chosen to
>>>>>> avoid
>>>>>> false sharing.
>>>>>>          L173:   enum {
>>>>>>          L174:     CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>>>>> sizeof(MonitorBase),
>>>>>>          L175:     MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>>>>> CACHE_LINE_PADDING : 64
>>>>>>          L176:   };
>>>>>>          L177:   char _name[MONITOR_NAME_LEN];          // Name of
>>>>>> mutex
>>>>>>
>>>>>>              I have to say that I'm not fond of the fact that
>>>>>> MONITOR_NAME_LEN
>>>>>>              can vary between platforms; I like that it is a minimum
>>>>>> of 64 bytes
>>>>>>              and is still a constant.
>>>>>>
>>>>>>              I'm also not happy that the resulting sizeof(Monitor)
>>>>>> may not
>>>>>> be a multiple
>>>>>>              of the DEFAULT_CACHE_LINE_SIZE. However, I have to
>>>>>> mitigate
>>>>>> that unhappiness
>>>>>>              with the fact that sizeof(Monitor) hasn't been a
>>>>>> multiple of
>>>>>> the cache line
>>>>>>              size since at least 2008 and no one complained (that I
>>>>>> know of).
>>>>>>
>>>>>>              So if I was making this change, I would make
>>>>>> MONITOR_NAME_LEN
>>>>>> 64 bytes
>>>>>>              (like it was) and add a pad field that would bring up
>>>>>> sizeof(Monitor)
>>>>>>              to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of course,
>>>>>> Claes
>>>>>> would be
>>>>>>              unhappy with me and anyone embedding a Monitor into
>>>>>> another data
>>>>>>              structure would be unhappy with me, but I'm used to
>>>>>> that :-)
>>>>>>
>>>>>>              So what you have is fine, especially for JDK9.
>>>>>>
>>>>>>          L180:  public:
>>>>>>          L181: #ifndef PRODUCT
>>>>>>          L182:   debug_only(static bool contains(Monitor * locks,
>>>>>> Monitor *
>>>>>> lock);)
>>>>>>          L183:   debug_only(static Monitor *
>>>>>> get_least_ranked_lock(Monitor *
>>>>>> locks);)
>>>>>>          L184:   debug_only(Monitor *
>>>>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>>>>          L185: #endif
>>>>>>          L186:
>>>>>>          L187:   void set_owner_implementation(Thread*
>>>>>> owner)                        PRODUCT_RETURN;
>>>>>>          L188:   void check_prelock_state     (Thread*
>>>>>> thread)                       PRODUCT_RETURN;
>>>>>>          L189:   void check_block_state       (Thread* thread)
>>>>>>
>>>>>>              These were all "protected" before. Now they are "public".
>>>>>>              Any particular reason?
>>>>>>
>>>>>> Thumbs up on the mechanics of this change. I'm interested in the
>>>>>> answer to the "protected" versus "public" question, but don't
>>>>>> considered that query to be a blocker.
>>>>>>
>>>>>>
>>>>>> The rest of this isn't code review, but some of this caught
>>>>>> my attention.
>>>>>>
>>>>>> src/share/vm/runtime/mutex.hpp
>>>>>>
>>>>>>          old L84: // The default length of monitor name is chosen to
>>>>>> be 64
>>>>>> to avoid false sharing.
>>>>>>          old L85: static const int MONITOR_NAME_LEN = 64;
>>>>>>
>>>>>> I had to look up the history of this comment:
>>>>>>
>>>>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>>>>> changeset:   55:2a8eb116ebbe
>>>>>> user:        xlu
>>>>>> date:        Tue Feb 05 23:21:57 2008 -0800
>>>>>> summary:     6610420: Debug VM crashes during monitor lock rank
>>>>>> checking
>>>>>>
>>>>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>>>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>>>>> --- a/src/share/vm/runtime/mutex.hpp    Thu Jan 31 14:56:50 2008 -0500
>>>>>> +++ b/src/share/vm/runtime/mutex.hpp    Tue Feb 05 23:21:57 2008 -0800
>>>>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>>>>       // *in that order*.  If their implementations change such that
>>>>>> these
>>>>>>       // assumptions are violated, a whole lot of code will break.
>>>>>>
>>>>>> +// The default length of monitor name is choosen to be 64 to avoid
>>>>>> false sharing.
>>>>>> +static const int MONITOR_NAME_LEN = 64;
>>>>>> +
>>>>>>       class Monitor : public CHeapObj {
>>>>>>
>>>>>>        public:
>>>>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>>>>         volatile intptr_t _WaitLock [1] ;      // Protects _WaitSet
>>>>>>         ParkEvent * volatile  _WaitSet ;       // LL of ParkEvents
>>>>>>         volatile bool     _snuck;              // Used for sneaky
>>>>>> locking
>>>>>> (evil).
>>>>>> -  const char * _name;                    // Name of mutex
>>>>>>         int NotifyCount ;                      // diagnostic assist
>>>>>> -  double pad [8] ;                       // avoid false sharing
>>>>>> +  char _name[MONITOR_NAME_LEN];          // Name of mutex
>>>>>>
>>>>>>         // Debugging fields for naming, deadlock detection, etc.
>>>>>> (some only
>>>>>> used in debug mode)
>>>>>>       #ifndef PRODUCT
>>>>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>>>>          int  ILocked () ;
>>>>>>
>>>>>>        protected:
>>>>>> -   static void ClearMonitor (Monitor * m) ;
>>>>>> +   static void ClearMonitor (Monitor * m, const char* name = NULL) ;
>>>>>>          Monitor() ;
>>>>>>
>>>>>> So the original code had an 8-double pad for avoiding false sharing.
>>>>>> Sounds very much like the old ObjectMonitor padding. I'm sure at the
>>>>>> time that Dice determined that 8-double value, the result was to pad
>>>>>> the size of Monitor to an even multiple of a particular cache line
>>>>>> size.
>>>>>>
>>>>>> Xiobin changed the 'name' field to be an array so that the name
>>>>>> chars could serve double duty as the cache line pad... pun intended.
>>>>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>>>>> size is a multiple of the cache line size.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>> Please review. If will also need a sponsor.
>>>>>>>
>>>>>>> Thanks and best regards,
>>>>>>> Martin
>>>>>>>
>


More information about the hotspot-runtime-dev mailing list