RFR: JDK-8086056: ParNew: auto-tune ParGCCardsPerStrideChunk

Tony Printezis tprintezis at twitter.com
Tue Jun 23 16:19:11 UTC 2015


Hi Bengt,

I’ll try factoring out the interpolation and tell me what you think. Sorry, I’m a bit behind on this as I’ve been busy with a couple of more urgent tasks hee. I’ll get this done shortly.

Tony

On June 22, 2015 at 3:36:10 AM, Bengt Rutisson (bengt.rutisson at oracle.com) wrote:


Hi Tony,

I agree with your comments. Thanks for fixing this!

One answer at the end of this message...

On 2015-06-18 15:50, Tony Printezis wrote:
Hi Bengt,

Thanks for looking at it. Inline.

On June 18, 2015 at 3:02:45 AM, Bengt Rutisson (bengt.rutisson at oracle.com) wrote:


Hi Tony,

NIce to hear from you!


On 18/06/15 00:30, Tony Printezis wrote:
Hi all,

A small patch for your consideration:

http://cr.openjdk.java.net/~tonyp/8086056/webrev.0/

A couple of style comments:

 891   const size_t MinOldGenCapacity =      G;
 892   const size_t MaxOldGenCapacity = 16 * G;
 893   assert(MinOldGenCapacity <= MaxOldGenCapacity, "sanity");

Local variables are normmaly named using lower case and underscore. 


Yeah, I of course usually do that. But as they are constants I thought I’d make them look like constants. :-)



So, min_old_gen_capacity and max_old_gen_capacity. Same for the rest of the local variables in ParNewGeneration::adjust_cards_per_stride().

Having said that, I think it would be good to turn some of these constants into flags (which means that the naming should be as it is now ;) ). 


Hang on! Are you and Masa encourating me to add 5 (!!!) more cmd line args to HotSpot? ;-)



Especially since you say that you haven't done much performance testing of these values.


To be clear: I did performance testing with chunks sizes in the 256-8k range and with three old gen sizes (2g, 10g, and 20g) and auto-tuning comes pretty close to optimal; I have numbers on the JIRA. But, yes, that was only with a couple of (synthetic) tests and only on one architecture (Linux/x64). If someone there could try this on a couple more apps and maybe on sparc, as we don’t happen to have any sparc boxes lying around :-), it’d be helpful.

But, yes, your suggestion of making the chunk size / old gen capacity limits into cmd line args definitely makes sense and I had also considered that myself. But I had decided against it as I thought you wouldn’t want more cmd line args. :-) How about:

+UseDynamicParGCStrides

DynamicParGCStrides{Min,Max}OldGenCapacity

DynamicParGCStrides{Min,Max}Size

I should also make them all manageable.



 I would prefer that we could adjust them at runtime. Best would of course be if you could take the time to do the performance testing.


So, as I said earlier, I did. The numbers are on the JIRA. Is there anything else I could maybe run? Maybe larger old gen than 20g? (but that’s reaching the memory limits of my workstation, FWIW)




I realize that the assert on line 893 might have been useful if you were playing around with the numbers when you were experimenting with the patch. But now it looks mostly like line noise to me. Same with the assert on line 898.


Actually, it as basically stating the invariant for those parameters (in case someone wants to play around with them). But if we make the constants into cmd line args this will go away...




I would also consider factoring the actual computation out into a separate helper method.


You mean the interpolation? Like:

(base_min, base_max, value_min, value_max, value) -> result in range [base_min, base_max]?


Yes, that's what I meant. But if you, as suggested above, get rid of most (all?) of the constant definitions in adjust_cards_per_stride() then I guess it is mostly only the interpolation left. So, in that case it may not be necessary to factor it out. I would have to see a new webrev to have more comments.

Thanks,
Bengt

Tony




I also second Jon's comment on that it would be good with a more explicit way of turning this feature off.

Thanks,
Bengt


(BTW, for some reason some of the webrev output is a bit messed up. Not sure why, maybe some hg incompatibility I guess. The diffs look OK though. I also attached the patch to this e-mail.)

There’s a bit of info on the JIRA on the rationale for the patch:

https://bugs.openjdk.java.net/browse/JDK-8086056

The min / max values for the old gen capacity and ParGCCardsPerStrideChunk were chosen empircally after running a few (mostly synthetic tests) on Linux x64. If someone has the cycles to do a more extensive performance study, I’d be happy to revise them accordingly.

Regards,

Tony

-----

Tony Printezis | JVM/GC Engineer / VM Team | Twitter

@TonyPrintezis
tprintezis at twitter.com







-----

Tony Printezis | JVM/GC Engineer / VM Team | Twitter

@TonyPrintezis
tprintezis at twitter.com


-----

Tony Printezis | JVM/GC Engineer / VM Team | Twitter

@TonyPrintezis
tprintezis at twitter.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20150623/8c631dfb/attachment.htm>


More information about the hotspot-gc-dev mailing list