Request for review (S): 7068625 Testing 8 bytes of card table entries at a time speeds up card-scanning

成滔 chengtao.fh at taobao.com
Thu Mar 1 03:50:28 UTC 2012


I tested the patch for 7068625, comparing Sun6u25 with hotspot20 against Sun6u25 with hotspot20 plus the patch. The results are as follows.

1.       Special case

I designed a special example that creates 30 large arrays of 500m each. Because the young gen is small (384m), these arrays are allocated in the old gen. I then create many small objects, which leads to frequent young GCs. With the patch, young GC is about 7 times faster (19.59 s versus 2.78 s on average). The results are as follows.
Configuration                                                 1st YGC (s)  2nd YGC (s)  3rd YGC (s)  Average YGC (s)
Sun6u25, Hotspot20                                            19.5393      19.6194      19.6167      19.5918
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=4096  2.89477      2.85132      2.60825      2.78478
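As a quick check of the averages and the resulting speedup for the special case, the arithmetic can be sketched as below. The class and method names are hypothetical; only the six measured values come from the results above.

```java
class YgcSpeedup {
    // Arithmetic mean of the measured young GC times, in seconds.
    static double mean(double... times) {
        double sum = 0.0;
        for (double t : times) {
            sum += t;
        }
        return sum / times.length;
    }

    public static void main(String[] args) {
        // Measurements copied from the special-case results.
        double baseline = mean(19.5393, 19.6194, 19.6167);
        double patched = mean(2.89477, 2.85132, 2.60825);
        System.out.println("baseline average (s): " + baseline);
        System.out.println("patched average (s):  " + patched);
        System.out.println("speedup factor:       " + (baseline / patched));
    }
}
```

The ratio of the two averages comes out at roughly 7.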


JVM options: -Xmn384m -Xss20m -Xms16g -Xmx16g -XX:PermSize=96m -XX:MaxPermSize=256m -XX:SurvivorRatio=10 -XX:VMThreadStackSize=30720 -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=5000 -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -verbose:gc -Xloggc:log/test.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseCompressedOops -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096
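The test program itself is not included in this mail. A minimal sketch of a program of the kind described in the special case might look like the following. The class name, the use of byte arrays, the 500 MB reading of "500m", the 64-byte garbage objects and the iteration count are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OldGenPressure {
    // Static sink so the short-lived allocations cannot be optimized away.
    static byte[] sink;

    // Keeps `count` large arrays alive (too big for the young gen, so they
    // are expected to land in the old gen), then allocates `iterations`
    // small short-lived objects to trigger frequent young GCs.
    // Returns the number of long-lived arrays still referenced.
    static int run(int count, int bytesEach, long iterations) {
        List<byte[]> longLived = new ArrayList<byte[]>();
        for (int i = 0; i < count; i++) {
            longLived.add(new byte[bytesEach]);
        }
        for (long i = 0; i < iterations; i++) {
            sink = new byte[64]; // assumed size of a "small object"
        }
        return longLived.size();
    }

    public static void main(String[] args) {
        // Assumed defaults: 30 arrays of 500 MB each; needs a heap of about 16g.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 30;
        int bytesEach = args.length > 1 ? Integer.parseInt(args[1]) : 500 * 1024 * 1024;
        long iterations = args.length > 2 ? Long.parseLong(args[2]) : 2000000000L;
        System.out.println("long-lived arrays retained: " + run(count, bytesEach, iterations));
    }
}
```

Because the large arrays hold no references into the young gen, almost every card covering them stays clean, which is the situation the patch targets.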

2.       GCBench

I downloaded GCBench.java from http://code.google.com/p/hotpy/source/browse/trunk/benchmarks/java/GCBench.java and changed kStretchTreeDepth = 25, kLongLivedTreeDepth = 22, kMaxTreeDepth = 25. The results are as follows.
Configuration                                                 1st YGC (s)  2nd YGC (s)  3rd YGC (s)  Average YGC (s)
Sun6u25, Hotspot20                                            1.15687      1.15133      1.14496      1.1510533
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=64    1.16177      1.16077      1.17036      1.1643
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=128   1.15831      1.17001      1.16075      1.1630233
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=256   1.14909      1.15332      1.15261      1.1516733
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=512   1.13964      1.14849      1.15045      1.1461933
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=1024  1.14355      1.15015      1.13443      1.14271
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=2048  1.13742      1.14496      1.14833      1.14357
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=4096  1.14581      1.14538      1.13679      1.14266
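For orientation, the ParGCCardsPerStrideChunk values above can be related to heap sizes with a little arithmetic. The sketch below assumes HotSpot's 512-byte card size and approximates the old gen as -Xmx minus -Xmn (4g minus 2560m for this run). The class and method names are hypothetical.

```java
class StrideChunkMath {
    // Assumption: one card-table entry covers 512 bytes of heap.
    static final long CARD_BYTES = 512;

    // Number of card-table entries covering an old gen of the given size.
    static long cards(long oldGenBytes) {
        return oldGenBytes / CARD_BYTES;
    }

    // Number of chunks the old gen's cards are split into (rounded up).
    static long chunks(long oldGenBytes, long cardsPerChunk) {
        long c = cards(oldGenBytes);
        return (c + cardsPerChunk - 1) / cardsPerChunk;
    }

    public static void main(String[] args) {
        long oldGenBytes = (4096L - 2560L) << 20; // approximate old gen: 1536 MiB
        long[] settings = {64, 128, 256, 512, 1024, 2048, 4096};
        for (long cardsPerChunk : settings) {
            long kibPerChunk = cardsPerChunk * CARD_BYTES / 1024;
            System.out.println("ParGCCardsPerStrideChunk=" + cardsPerChunk
                    + " covers " + kibPerChunk + " KiB per chunk, "
                    + chunks(oldGenBytes, cardsPerChunk) + " chunks");
        }
    }
}
```

Under these assumptions a chunk of 64 cards covers 32 KiB of heap and a chunk of 4096 cards covers 2 MiB, so larger settings mean far fewer, larger units of scanning work.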


JVM options: -Xmn2560m -Xss20m -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=256m -XX:SurvivorRatio=10 -XX:VMThreadStackSize=30720 -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=5000 -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -verbose:gc -Xloggc:log/test.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseCompressedOops

3.       SPECjvm2008

The SPECjvm2008 results are as follows. The unpatched baseline was run only once.
Configuration                                                 1st YGC (s)  2nd YGC (s)  3rd YGC (s)  Average YGC (s)
Sun6u25, Hotspot20                                            9.76089      not run      not run      9.76089
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=256   9.98591      9.48948      10.0271      9.8341633
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=4096  9.11581      9.18337      9.30351      9.2008967


JVM options: -Xmn2560m -Xss20m -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=256m -XX:SurvivorRatio=10 -XX:VMThreadStackSize=30720 -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=5000 -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -verbose:gc -Xloggc:log/test.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseCompressedOops -jar SPECjvm2008.jar -ikv --lagom -bt 8 -ops 15 -i 5 -pf props/specjvm.properties

4.       SPECjbb2005

The SPECjbb2005 results are as follows. The unpatched baseline was run only once.
Configuration                                                 1st YGC (s)  2nd YGC (s)  3rd YGC (s)  Average YGC (s)
Sun6u25, Hotspot20                                            97.5287      not run      not run      97.5287
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=256   96.8332      96.6885      98.348       97.2899
Sun6u25, Hotspot20 with patch, ParGCCardsPerStrideChunk=4096  96.1914      96.3589      97.6005      96.716933


JVM options:

-Xmn2560m -Xss20m -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=256m -XX:SurvivorRatio=10 -XX:VMThreadStackSize=30720 -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=5000 -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -verbose:gc -Xloggc:log/test.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseCompressedOops pec.jbb.JBBmain -propfile SPECjbb.props
Summary

1.       If a large part of the heap is allocated in Old Gen and there are no or very few references from Old Gen to Young Gen, the patch improves young GC performance substantially.

2.       For general workloads such as GCBench, SPECjvm2008 and SPECjbb2005, the patch helps slightly with a suitable ParGCCardsPerStrideChunk value, but can be slightly worse with an unsuitable one.

3.       The patch is likely to be effective for applications that run with a large heap and a large Old Gen.
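To illustrate why the first case benefits so much, here is a minimal sketch of testing 8 card-table entries at a time. It is not the patch itself, which lives in HotSpot's C++ code. The clean-card value of 0xFF and all names are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class CardScanSketch {
    static final byte CLEAN = (byte) 0xFF; // assumed clean-card value
    static final long CLEAN_WORD = -1L;    // eight clean cards read as one 64-bit word

    // Returns the indices of all non-clean cards, skipping 8 clean cards per test.
    static List<Integer> dirtyCards(byte[] cards) {
        List<Integer> dirty = new ArrayList<Integer>();
        ByteBuffer buf = ByteBuffer.wrap(cards);
        int wordEnd = cards.length - (cards.length % 8);
        int i = 0;
        for (; i < wordEnd; i += 8) {
            if (buf.getLong(i) == CLEAN_WORD) {
                continue; // all 8 cards clean: one comparison instead of eight
            }
            for (int j = i; j < i + 8; j++) {
                if (cards[j] != CLEAN) {
                    dirty.add(j);
                }
            }
        }
        // Tail that does not fill a whole 8-byte word is checked card by card.
        for (; i < cards.length; i++) {
            if (cards[i] != CLEAN) {
                dirty.add(i);
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        byte[] cards = new byte[20];
        java.util.Arrays.fill(cards, CLEAN);
        cards[3] = 0;
        cards[17] = 0;
        System.out.println(dirtyCards(cards));
    }
}
```

When the old gen holds few references into the young gen, nearly all cards are clean, so most of the table is skipped with one comparison per 8 cards.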

Regards,
Chengtao

More information about the hotspot-gc-dev mailing list