RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE

Thu Oct 6 13:33:47 UTC 2016

Hi Martin,

Thanks for bearing with me here, these optimizations are not really my 
forte.

On 6/10/2016 9:05 PM, Doerr, Martin wrote:
> Hi David,
>
> there are many Monitor instances behind each other so I think the idea of padding (which was not mine) was not bad in general.
> The ideal situation would be to have them cache line aligned and sizeof(Monitor) equals the cache line size (or a multiple). This would completely prevent cache line sharing.

So this is for all the mutexes/monitors created in mutex_init() which 
are assumed to be laid out in a linear fashion. Ok.

Has anyone actually done any metrics on this or is it all theoretical? 
ie are any adjacent, or otherwise cache-line-aligned, monitors actually 
contended at the same time? Padding to avoid false-sharing always seems 
a very local optimization to me - more obvious with hot fields in the 
same object than with distinct fields in distinct objects.

> Even without having the cache line alignment, the padding does help:
> Please note that the padding is inserted at the end. The critical fields are at the beginning.
> Especially _LockWord of 2 Monitors will never be on the same cache line when sizeof(Monitor) equals the cache line size (or a multiple).

Seems to me the existing code, as it doesn't take into account the size 
of the rest of the Monitor, isn't really addressing this correctly at 
all - even on platforms with a 64-byte cache line.

> Padding = DEFAULT_CACHE_LINE_SIZE could prevent more sharing in case of bad alignment, but I didn't want to waste more space. I'd rather prefer the alignment solution.

The other option is an operator new that only allocates on the desired 
alignment - as we do in some other places. That also avoids wasted space 
with Monitors embedded in other objects - not that I think we have that 
many of them.

Thanks,
David

> Best regards,
> Martin
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Donnerstag, 6. Oktober 2016 12:21
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>
> On 6/10/2016 7:09 PM, Doerr, Martin wrote:
>> Hi David,
>>
>> thanks for taking a look at my proposal.
>>
>> Maybe "unnecessary cache line sharing of contended memory" would be more comprehensive.
>>
>> The purpose of the padding is to avoid the following situation:
>> 2 Monitor instances are located behind each other and some fields end up on the same cache line.
>> Some threads running on some processors compete for the first Monitor while some other threads running on some processors compete for the second one.
>> The cache line needs to get transferred between all involved processors.
>>
>> If we add enough padding, the fields which are accessed by many processors end up on different cache lines. This splits the problem into 2 independent problems. The threads competing for the first Monitor don't interfere with those ones competing for the second one any more.
>
> But that only helps for the case where the two monitors are exactly the
> wrong distance apart. Two other monitors that previously did not share
> cache lines may now do so if you make the monitors bigger.
>
> This seems completely ad-hoc. ??
>
> David
> -----
>
>
>
>> The existing padding implementation is not optimal. It's a little too small on some platforms. On other platforms, it is not wrong to pad more than necessary, but ideally, one would pad to make the Monitor size equal to the cache line size. I have kept the minimum of 64 because _name is not only used for padding and I guess people don't want it too short.
>>
>> x86_64 and SPARC have DEFAULT_CACHE_LINE_SIZE=128 in typical configurations. That's like PPC64 so the change also improves the padding on these platforms as well.
>> (On x86_64 we get the same result as on PPC64: The length of _name gets extended from 64 to 72 in product build). The padding increase only gets huge on S390.
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Donnerstag, 6. Oktober 2016 04:31
>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(M): 8166970: Adapt mutex padding according to DEFAULT_CACHE_LINE_SIZE
>>
>> On 1/10/2016 1:48 AM, Doerr, Martin wrote:
>>> Hi,
>>>
>>> the current implementation of Monitor padding (mutex.cpp) assumes that cache lines are 64 Bytes. There's a platform dependent define "DEFAULT_CACHE_LINE_SIZE" available which can be used. Purpose of padding is to avoid false sharing.
>>>
>>> My proposed change is here:
>>> http://cr.openjdk.java.net/~mdoerr/8166970_mutex_padding/webrev.00/
>>
>> Not sure I understand the existing padding code. What false sharing are
>> we trying to avoid?
>>
>> And if the existing code assumes a cache line size of 64 and declares
>> _name to be 64 chars, then why can't the new code declare name to be
>> DEFAULT_CACHE_LINE_SIZE chars? This suggests the existing padding code
>> is wrong (not just hard-wired).
>>
>> Which platforms will this cause an actual change in Monitor size other
>> than S390?
>>
>> Thanks,
>> David
>>
>>> Please review. If will also need a sponsor.
>>>
>>> Thanks and best regards,
>>> Martin
>>>