RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
Claes Redestad
claes.redestad at oracle.com
Tue Oct 11 10:05:15 UTC 2016
Hi,
On 2016-10-11 02:03, Coleen Phillimore wrote:
>
> Hi,
>
> Was the linear allocation in mutex.cpp the cause of the false sharing
> that you observed? I think I like this change better than the
> original, because I've wondered myself why the name string was so
> long. So with this, we could make Monitor's smaller if they're
> embedded in metadata or other structures.
Music to my ears!
I even think most embedded uses would see improvements if _name was
removed entirely (or "simply" turned into a const char * so that it's
not copied and embedded into the Monitor/Mutex)
>
> Thanks,
> Coleen
>
> On 10/10/16 2:00 PM, Doerr, Martin wrote:
>> Hi Claes,
>>
>> thank you very much for your explanations.
>>
>> I agree with you that it would be better to pad where the Monitors
>> are used. It would still fulfill the purpose of this RFE without
>> disturbing other usages.
>>
>> So I could introduce:
>> class PaddedMonitor : public Monitor {
>> enum {
>> CACHE_LINE_PADDING = (int)DEFAULT_CACHE_LINE_SIZE -
>> (int)sizeof(Monitor),
>> PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING : 0
>> };
>> char _padding[PADDING_LEN];
>> };
>> and similarly PaddedMutex and replace all of the ones which get
>> allocated in a linear fashion (mutexLocker.cpp mutex_init()).
Sure!
Some compilers may take issue with cases where PADDING_LEN == 0 (since
char _padding[0] is technically illegal C++, but works on gcc etc) so
maybe that special case will have to be (somewhat excessively):
PADDING_LEN = CACHE_LINE_PADDING > 0 ? CACHE_LINE_PADDING :
DEFAULT_CACHE_LINE_SIZE
We took a look at if it'd be feasible to express class PaddedMonitor :
public PaddedEnd<Monitor>, but it appears that'd require variadic
template arguments (C++11) to get right (since we'd need PaddedEnd to
transitively publish constructors of Monitor).
Thanks!
/Claes
>>
>> Would you agree with this change?
>>
>> Thanks and best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>> Sent: Freitag, 7. Oktober 2016 12:35
>> To: Doerr, Martin <martin.doerr at sap.com>;
>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>> Coleen Phillimore (coleen.phillimore at oracle.com)
>> <coleen.phillimore at oracle.com>
>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>> DEFAULT_CACHE_LINE_SIZE
>>
>> Hi,
>>
>> I'm concerned that this might be an easy-but-wrong fix to a complex
>> problem, and acknowledging that there are already use cases where the
>> _name field is contra-productive. This change adds complexity that
>> makes it even less likely such uses will be optimized for in the
>> future.
>>
>> There are Padded* types put in place to deal with these concerns
>> explicitly rather than implicitly *where it matters*, which allows us
>> the choice of applying padding or not on a per use-case basis (which
>> means we can also remove the _name field for those use cases that don't
>> care about either, which might be most outside of the global lists).
>>
>> I am very concerned about false sharing, but I have no data to support
>> that this change has any measurable benefit in practice: I even did an
>> experiment years ago now where I turned _name into a pointer to not pad
>> at all and saw nothing exceeding noise levels on any benchmark.
>>
>> Thanks!
>>
>> /Claes
>>
>> On 2016-10-07 12:18, Doerr, Martin wrote:
>>> Hi Claes,
>>>
>>> what the change basically does is that the _name[] field gets
>>> enlarged by 8 bytes on platforms with 128 byte
>>> DEFAULT_CACHE_LINE_SIZE. The logic behind it is completely computed
>>> by the C++ compiler.
>>> What exactly is your concern about the footprint overhead?
>>> Are you not concerned about the risk of false sharing?
>>>
>>> Best regards,
>>> Martin
>>>
>>> -----Original Message-----
>>> From: Claes Redestad [mailto:claes.redestad at oracle.com]
>>> Sent: Freitag, 7. Oktober 2016 12:00
>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net;
>>> David Holmes (david.holmes at oracle.com) <david.holmes at oracle.com>;
>>> Coleen Phillimore (coleen.phillimore at oracle.com)
>>> <coleen.phillimore at oracle.com>
>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>> DEFAULT_CACHE_LINE_SIZE
>>>
>>> Hi,
>>>
>>> after due consideration I strongly consider this change unacceptable
>>> since it adds footprint overhead to performance critcial compiler and
>>> GC code with little to no data to support this won't cause regressions.
>>>
>>> Changes to Monitor/Mutex needs to be done with more surgical precision
>>> than this.
>>>
>>> If I do have a veto on the matter, here it is.
>>>
>>> Thanks!
>>>
>>> /Claes
>>>
>>> On 2016-10-07 11:34, Doerr, Martin wrote:
>>>> Hi Dan,
>>>>
>>>> thank you very much for reviewing and for investigating the history.
>>>>
>>>> It was not intended to make the functions you mentioned public.
>>>> I've fixed that.
>>>> I also updated the copyright information.
>>>>
>>>> New webrev is here:
>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.02/
>>>>
>>>> @Coleen: Please use this one. I have also added reviewer attribution.
>>>>
>>>> Thanks and best regards,
>>>> Martin
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>> Sent: Donnerstag, 6. Oktober 2016 23:13
>>>> To: Doerr, Martin <martin.doerr at sap.com>;
>>>> hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to
>>>> DEFAULT_CACHE_LINE_SIZE
>>>>
>>>> On 9/30/16 9:48 AM, Doerr, Martin wrote:
>>>>> Hi,
>>>>>
>>>>> the current implementation of Monitor padding (mutex.cpp) assumes
>>>>> that cache lines are 64 Bytes. There's a platform dependent define
>>>>> "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of
>>>>> padding is to avoid false sharing.
>>>>>
>>>>> My proposed change is here:
>>>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>>> src/share/vm/runtime/mutex.hpp
>>>> Please update the copyright year before pushing.
>>>>
>>>> L172: // The default length of monitor name is chosen to
>>>> avoid
>>>> false sharing.
>>>> L173: enum {
>>>> L174: CACHE_LINE_PADDING = DEFAULT_CACHE_LINE_SIZE -
>>>> sizeof(MonitorBase),
>>>> L175: MONITOR_NAME_LEN = CACHE_LINE_PADDING > 64 ?
>>>> CACHE_LINE_PADDING : 64
>>>> L176: };
>>>> L177: char _name[MONITOR_NAME_LEN]; // Name of
>>>> mutex
>>>>
>>>> I have to say that I'm not fond of the fact that
>>>> MONITOR_NAME_LEN
>>>> can vary between platforms; I like that it is a minimum
>>>> of 64 bytes
>>>> and is still a constant.
>>>>
>>>> I'm also not happy that the resulting sizeof(Monitor)
>>>> may not
>>>> be a multiple
>>>> of the DEFAULT_CACHE_LINE_SIZE. However, I have to
>>>> mitigate
>>>> that unhappiness
>>>> with the fact that sizeof(Monitor) hasn't been a
>>>> multiple of
>>>> the cache line
>>>> size since at least 2008 and no one complained (that I
>>>> know of).
>>>>
>>>> So if I was making this change, I would make
>>>> MONITOR_NAME_LEN
>>>> 64 bytes
>>>> (like it was) and add a pad field that would bring up
>>>> sizeof(Monitor)
>>>> to be a multiple of DEFAULT_CACHE_LINE_SIZE. Of course,
>>>> Claes
>>>> would be
>>>> unhappy with me and anyone embedding a Monitor into
>>>> another data
>>>> structure would be unhappy with me, but I'm used to
>>>> that :-)
>>>>
>>>> So what you have is fine, especially for JDK9.
>>>>
>>>> L180: public:
>>>> L181: #ifndef PRODUCT
>>>> L182: debug_only(static bool contains(Monitor * locks,
>>>> Monitor *
>>>> lock);)
>>>> L183: debug_only(static Monitor *
>>>> get_least_ranked_lock(Monitor *
>>>> locks);)
>>>> L184: debug_only(Monitor *
>>>> get_least_ranked_lock_besides_this(Monitor * locks);)
>>>> L185: #endif
>>>> L186:
>>>> L187: void set_owner_implementation(Thread*
>>>> owner) PRODUCT_RETURN;
>>>> L188: void check_prelock_state (Thread*
>>>> thread) PRODUCT_RETURN;
>>>> L189: void check_block_state (Thread* thread)
>>>>
>>>> These were all "protected" before. Now they are "public".
>>>> Any particular reason?
>>>>
>>>> Thumbs up on the mechanics of this change. I'm interested in the
>>>> answer to the "protected" versus "public" question, but don't
>>>> considered that query to be a blocker.
>>>>
>>>>
>>>> The rest of this isn't code review, but some of this caught
>>>> my attention.
>>>>
>>>> src/share/vm/runtime/mutex.hpp
>>>>
>>>> old L84: // The default length of monitor name is chosen to
>>>> be 64
>>>> to avoid false sharing.
>>>> old L85: static const int MONITOR_NAME_LEN = 64;
>>>>
>>>> I had to look up the history of this comment:
>>>>
>>>> $ hg log -r 55 src/share/vm/runtime/mutex.hpp
>>>> changeset: 55:2a8eb116ebbe
>>>> user: xlu
>>>> date: Tue Feb 05 23:21:57 2008 -0800
>>>> summary: 6610420: Debug VM crashes during monitor lock rank
>>>> checking
>>>>
>>>> $ hg diff -r5{4,5} src/share/vm/runtime/mutex.hpp
>>>> diff -r d4a0f561287a -r 2a8eb116ebbe src/share/vm/runtime/mutex.hpp
>>>> --- a/src/share/vm/runtime/mutex.hpp Thu Jan 31 14:56:50 2008 -0500
>>>> +++ b/src/share/vm/runtime/mutex.hpp Tue Feb 05 23:21:57 2008 -0800
>>>> @@ -82,6 +82,9 @@ class ParkEvent ;
>>>> // *in that order*. If their implementations change such that
>>>> these
>>>> // assumptions are violated, a whole lot of code will break.
>>>>
>>>> +// The default length of monitor name is choosen to be 64 to avoid
>>>> false sharing.
>>>> +static const int MONITOR_NAME_LEN = 64;
>>>> +
>>>> class Monitor : public CHeapObj {
>>>>
>>>> public:
>>>> @@ -126,9 +129,8 @@ class Monitor : public CHeapObj {
>>>> volatile intptr_t _WaitLock [1] ; // Protects _WaitSet
>>>> ParkEvent * volatile _WaitSet ; // LL of ParkEvents
>>>> volatile bool _snuck; // Used for sneaky
>>>> locking
>>>> (evil).
>>>> - const char * _name; // Name of mutex
>>>> int NotifyCount ; // diagnostic assist
>>>> - double pad [8] ; // avoid false sharing
>>>> + char _name[MONITOR_NAME_LEN]; // Name of mutex
>>>>
>>>> // Debugging fields for naming, deadlock detection, etc.
>>>> (some only
>>>> used in debug mode)
>>>> #ifndef PRODUCT
>>>> @@ -170,7 +172,7 @@ class Monitor : public CHeapObj {
>>>> int ILocked () ;
>>>>
>>>> protected:
>>>> - static void ClearMonitor (Monitor * m) ;
>>>> + static void ClearMonitor (Monitor * m, const char* name = NULL) ;
>>>> Monitor() ;
>>>>
>>>> So the original code had an 8-double pad for avoiding false sharing.
>>>> Sounds very much like the old ObjectMonitor padding. I'm sure at the
>>>> time that Dice determined that 8-double value, the result was to pad
>>>> the size of Monitor to an even multiple of a particular cache line
>>>> size.
>>>>
>>>> Xiobin changed the 'name' field to be an array so that the name
>>>> chars could serve double duty as the cache line pad... pun intended.
>>>> Unfortunately that pad doesn't make sure that the resulting Monitor
>>>> size is a multiple of the cache line size.
>>>>
>>>> Dan
>>>>
>>>>
>>>>> Please review. If will also need a sponsor.
>>>>>
>>>>> Thanks and best regards,
>>>>> Martin
>>>>>
>
More information about the hotspot-runtime-dev
mailing list