Review request: 8022880: False sharing between PSPromotionManager instances

Stefan Karlsson stefan.karlsson at oracle.com
Tue Aug 13 20:21:05 UTC 2013


On 8/13/13 10:13 PM, Jon Masamitsu wrote:
> Looks good.
>
> Would you consider
>
> PaddedArray::create_unfreeable()
>
> in place of
>
> PaddedArray::create_immortal()
>
> I think "unfreeable" communicates the comment in the code
> that " The memory can't be deleted ..."  When I see "unfreeable"
> it makes me a little uneasy (as it should).
Yes, that name is much better. I'll change it.

Thanks for reviewing this.

StefanK
>
> Jon
>
>
> On 8/13/13 4:38 AM, Stefan Karlsson wrote:
>> http://cr.openjdk.java.net/~stefank/8022880/webrev.00/
>>
>> We've seen a couple of instances of false sharing when accessing 
>> fields from the beginning and the end of the PSPromotionManager 
>> instances. This both decreases the performance of the Parallel 
>> Scavenge young GC and makes it hard to do reliable GC benchmarks on 
>> bigger machines.
>>
>> This was first seen in:
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7196911 : command 
>> line length affects performance
>>
>> The patch makes sure that each PSPromotionManager starts at a 
>> cache-line-aligned address and is padded to have a cache-line-aligned 
>> size.
>>
>> It doesn't use the existing Padded<T> class, since it (unnecessarily) 
>> wastes too much memory, but instead introduces a PaddedEnd<T> class. 
>> This class only pads enough to get the cache-line-aligned size and 
>> it's up to the user to align the start of the instance. This works 
>> well in this specific case, where all the PSPromotionManagers are 
>> together in an Array. A PaddedArray<T> class was added to hide the 
>> memory layout code.
>>
>> Testing:
>>
>> 1) JPRT
>>
>> 2) SPECjbb2005 - 2 socket, 8 core, HT machine on JDK8-b57 + recent 
>> HotSpot + the patch
>>
>> Flags:
>> -showversion -Xmx29g -Xms29g -Xmn27g -XX:SurvivorRatio=60 
>> -XX:TargetSurvivorRatio=90 -XX:ParallelGCThreads=16 
>> -XX:AllocatePrefetchDistance=256 -XX:AllocatePrefetchLines=4 
>> -XX:LoopUnrollLimit=45 -XX:InitialTenuringThreshold=12 
>> -XX:MaxTenuringThreshold=15 -XX:InlineSmallCode=4300 
>> -XX:MaxInlineSize=270 -XX:FreqInlineSize=2700 -XX:+AggressiveOpts 
>> -XX:+UseParallelOldGC -XX:-UseAdaptiveSizePolicy -XX:+PrintGC
>>
>> Young GC times without cache aligned PSPromotionManager (ms):
>> 36.1608
>> 36.0164
>> 36.3001
>> 36.0763
>> 36.2086
>> 35.8151
>>
>> with cache aligned PSPromotionManager:
>> 26.2168
>> 26.9931
>> 27.3672
>> 26.5155
>> 26.0182
>> 26.8202
>>
>> Extra thanks go to Claes Redestad for helping out with performance 
>> analysis and implementation-detail discussions.
>>
>> thanks,
>> StefanK
>
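
For readers without access to the webrev, the layout idea in the quoted message (pad only the end of each element, and align the start of the array once) can be sketched roughly as below. This is not the code from the patch. The names PaddedEndSketch, create_unfreeable_sketch and ManagerSketch are hypothetical, the 64-byte cache line is an assumption, and the sketch assumes T is default-constructible with an alignment no larger than a cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Assumed cache-line size for this sketch; the real value is platform-specific.
constexpr std::size_t kCacheLine = 64;

// Bytes needed to round sizeof(T) up to the next multiple of kCacheLine.
template <typename T>
constexpr std::size_t tail_pad() {
  return (kCacheLine - sizeof(T) % kCacheLine) % kCacheLine;
}

// Hypothetical stand-in for PaddedEnd<T>: pads only the end of T.
// Aligning the start of the object is left to whoever allocates it.
template <typename T, std::size_t Pad = tail_pad<T>()>
struct PaddedEndSketch : T {
  char pad_[Pad];
};

// No padding member is needed when sizeof(T) is already a multiple.
template <typename T>
struct PaddedEndSketch<T, 0> : T {};

// Hypothetical stand-in for PaddedArray<T>::create_unfreeable(): returns n
// elements, the first of which starts on a cache-line boundary. The pointer
// returned by malloc is dropped, so the memory can never be freed.
template <typename T>
PaddedEndSketch<T>* create_unfreeable_sketch(std::size_t n) {
  using Elem = PaddedEndSketch<T>;
  static_assert(sizeof(Elem) % kCacheLine == 0,
                "element size must be a multiple of the cache line");
  // Over-allocate by one cache line so the start address can be rounded up.
  void* raw = std::malloc(n * sizeof(Elem) + kCacheLine);
  if (raw == nullptr) throw std::bad_alloc();
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
  std::uintptr_t mask = static_cast<std::uintptr_t>(kCacheLine) - 1;
  std::uintptr_t aligned = (addr + mask) & ~mask;
  assert(aligned % kCacheLine == 0);
  Elem* elems = reinterpret_cast<Elem*>(aligned);
  for (std::size_t i = 0; i < n; ++i) {
    new (&elems[i]) Elem();  // construct each element in place
  }
  return elems;
}

// Toy type with hot fields at both ends, loosely mirroring the description
// of PSPromotionManager in the quoted message.
struct ManagerSketch {
  long first_field;
  char body[80];
  long last_field;
};
```

Because every element has a size that is a multiple of the cache line and the first element starts on a boundary, no two elements of the array share a line, so last_field of one element and first_field of the next cannot falsely share. Calling create_unfreeable_sketch<ManagerSketch>(16) would give one such slot per GC thread in the benchmark configuration above.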

More information about the hotspot-gc-dev mailing list