OpenJDK G1 Patch

Tue May 22 17:26:14 UTC 2018

> On May 22, 2018, at 11:20 AM, Ruslan Synytsky <synytskyy at jelastic.com> wrote:
> 
> 
> 
> On 22 May 2018 at 03:50, Kirk Pepperdine <kirk at kodewerk.com> wrote:
> 
>> On May 21, 2018, at 6:15 PM, Ruslan Synytsky <synytskyy at jelastic.com> wrote:
>> 
>> Dear Kirk, thank you for the feedback. Please see inline.
>> 
>> On Mon, May 21, 2018 at 07:45 Kirk Pepperdine <kirk at kodewerk.com> wrote:
>> Hi Rodrigo,
>> 
>> Interesting idea. IMHO, this solution is too simplistic as it focuses on the needs of providers to the detriment of the goals of the consumers.
>> I’m very concerned about this statement. Which part of the patch description gives you the feeling that this work has been done in favor of providers? Believe me, based on my experience of working with hosting providers worldwide, cloud vendors are the least interested in resource usage optimization, because the more resources customers use, the more money they pay. So, please give us a hint as to how the description can be improved.
> 
> Well, I do have to apologize as my comment seems a bit harsh. 
>  
> Of course reducing resources is a good thing 
> Kirk, no problem. Good that we both agree on that. The main goal of this work is to make Java less greedy in its memory usage and more cost-effective for applications without a very intensive load.

Great. There are three aspects to G1 that you need to be concerned about: allocation rates, mutation rates and live data set size (LDSS). Pause time is a function of LDSS, GC frequency is a function of allocation rate, and runtime overhead is a combination of allocation and mutation.
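
To make the relationships above concrete, here is a minimal sketch, not taken from the patch or from G1 itself. It assumes the simplest possible model: eden fills at the allocation rate, so the interval between young collections is roughly eden size divided by allocation rate, and overhead is accumulated GC time divided by uptime. The class and method names are hypothetical; only the `java.lang.management` calls are standard JDK API. Mutation rate is not modeled here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcPressureSketch {

    // Assumed model: eden fills at the allocation rate, so a young
    // collection happens roughly every edenBytes / allocBytesPerSecond seconds.
    static double secondsBetweenYoungGcs(long edenBytes, double allocBytesPerSecond) {
        if (allocBytesPerSecond <= 0) {
            return Double.POSITIVE_INFINITY; // nothing allocates, eden never fills
        }
        return edenBytes / allocBytesPerSecond;
    }

    // Fraction of elapsed time attributed to GC.
    static double gcOverhead(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long gcCount = 0;
        long gcMillis = 0;
        // Both getters may return -1 when the value is undefined for a collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            gcCount += Math.max(0, gc.getCollectionCount());
            gcMillis += Math.max(0, gc.getCollectionTime());
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();

        System.out.printf("collections=%d, approx. GC time=%d ms, overhead=%.2f%%%n",
                gcCount, gcMillis, 100.0 * gcOverhead(gcMillis, uptimeMillis));

        // Illustrative numbers only: 512 MB of eden at 128 MB/s of allocation.
        long eden = 512L * 1024 * 1024;
        double rate = 128.0 * 1024 * 1024;
        System.out.printf("estimated young GC interval: %.1f s%n",
                secondsBetweenYoungGcs(eden, rate));
    }
}
```

The collection time reported by the MXBeans is approximate, so the overhead figure is only a rough indicator; GC logs remain the more reliable source.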
>  
> however I have this aversion to full stop-the-world collections when using a concurrent collector. It feels like there should be a solution that doesn’t require calling for a full GC. I’m sure it won’t be as easy as calling for a full GC.
> I also agree that it would be great to have a solution without calling a Full GC. This is why I personally like Shenandoah.

I think your problems with G1 might be this “bug” that I’ve encountered that leaves regions that should be collected uncollected. The Full GC gives you the illusion that you’ve cleaned things up. There are other cases, which I don’t fully understand, that result in all regions being cleaned up without a full collection. One thing I do know is that if you starve G1 of memory, you’re asking for trouble. I’ve seen GC pause time overheads of 40–50%.
> 
> On the other hand, the current patch solves the problem well enough, because customers do not really care how it works inside. Mass-market users calculate how much money they pay for cloud hosting and how the resource usage compares with other languages. If anyone hits performance issues, a deep dive into the various options and fine tuning of the JVM is needed anyway. The desired behavior is to trigger RAM compaction and release only at an idle stage and only when it's enabled. There are additional options that should prevent executing a Full GC at an active stage.

Ok, but I don’t believe you need a full collection to release memory back to the OS. Released memory should come from high memory regions, and those regions will be empty under most circumstances after a young gen collection. A reduction of heap size should put the live data set size over the IHOP (initiating heap occupancy percent) threshold, meaning you should almost immediately trigger a concurrent mark cycle.
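
The IHOP argument can be checked with simple arithmetic. The sketch below is a simplification under stated assumptions, not G1's actual logic: it treats the threshold as a fixed percentage of the current heap size and uses 45, the documented default of `-XX:InitiatingHeapOccupancyPercent`. Real G1 compares old generation occupancy and, from JDK 9 on, adapts the threshold at run time (`-XX:+G1UseAdaptiveIHOP`). The class name, method names and the sizes used are hypothetical.

```java
public class IhopSketch {

    // Default of -XX:InitiatingHeapOccupancyPercent; treated here as a fixed threshold.
    static final int DEFAULT_IHOP_PERCENT = 45;

    // Occupancy of the live data set as a percentage of the current heap size.
    static double occupancyPercent(long liveBytes, long heapBytes) {
        return 100.0 * liveBytes / heapBytes;
    }

    // Simplified trigger: a concurrent mark cycle starts once occupancy reaches the threshold.
    static boolean wouldStartConcurrentMark(long liveBytes, long heapBytes, int ihopPercent) {
        return occupancyPercent(liveBytes, heapBytes) >= ihopPercent;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024;
        long live = 400 * mb; // illustrative live data set size

        // Same live data, two heap sizes: shrinking the heap raises occupancy.
        for (long heap : new long[] {1024 * mb, 800 * mb}) {
            System.out.printf("heap=%d MB occupancy=%.1f%% concurrentMark=%b%n",
                    heap / mb,
                    occupancyPercent(live, heap),
                    wouldStartConcurrentMark(live, heap, DEFAULT_IHOP_PERCENT));
        }
    }
}
```

With these illustrative numbers, 400 MB of live data sits at about 39% of a 1024 MB heap, below the threshold, but at 50% of an 800 MB heap, above it.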

As for the idle comment, if you’re idle why not simply shut down JVMs in your cluster?

>  
> MaxLoadGC - Max CPU usage that should still trigger periodic GCs. Above this value, no periodic GC will be triggered.
> MaxOverCommitted - guarantees that a Full GC is not triggered if memory is not overcommitted and there is nothing to release back to the OS.
> Even PHP-FPM implemented options that allow users to choose the most convenient configuration: static - for high performance, dynamic - for load with predictable spikes, and ondemand - for cost efficiency (related article <https://community.webcore.cloud/tutorials/php_fpm_ondemand_process_manager_vs_dynamic/> and official documentation <http://php.net/manual/en/install.fpm.configuration.php#pm>). Why should we limit Java users and force everyone to think in terms of high performance only and always? Java users have been struggling without a solution for cost efficiency for years.
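
For readers following the two options quoted above, here is one way the gating could be sketched. This is an illustration inferred from the two-line description only, not the patch's code: the units (CPU load as a fraction, memory in bytes), the exact comparisons, and every name in the class are assumptions.

```java
public class PeriodicGcGateSketch {

    // Hypothetical predicate: a periodic GC is allowed only when the machine is
    // idle enough (MaxLoadGC) and there is committed-but-unused memory worth
    // releasing (MaxOverCommitted). Both comparisons are assumptions.
    static boolean shouldTriggerPeriodicGc(double cpuLoad,
                                           double maxLoadGc,
                                           long committedBytes,
                                           long usedBytes,
                                           long maxOverCommittedBytes) {
        boolean idleEnough = cpuLoad <= maxLoadGc;
        long overCommitted = committedBytes - usedBytes;
        boolean somethingToRelease = overCommitted > maxOverCommittedBytes;
        return idleEnough && somethingToRelease;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024;

        // Idle JVM holding 600 MB more than it uses: GC may run.
        System.out.println(shouldTriggerPeriodicGc(0.05, 0.30, 1024 * mb, 424 * mb, 128 * mb));

        // Busy JVM: load is above the limit, so no periodic GC.
        System.out.println(shouldTriggerPeriodicGc(0.80, 0.30, 1024 * mb, 424 * mb, 128 * mb));

        // Idle JVM with almost nothing to release: no periodic GC.
        System.out.println(shouldTriggerPeriodicGc(0.05, 0.30, 512 * mb, 500 * mb, 128 * mb));
    }
}
```

Requiring both conditions reflects the stated intent that compaction and release happen only at an idle stage and only when there is memory to give back.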

This isn’t about highest performance, it’s about minimizing its effect on tail latencies.

Kind regards,
Kirk
