RFR(M) 8186834:Expanding old area without full GC in parallel GC

Thomas Schatzl thomas.schatzl at oracle.com
Wed Oct 18 08:18:37 UTC 2017


On Tue, 2017-10-17 at 21:09 +0900, Michihiro Horie wrote:
> Hi Thomas,
> 
> Thanks a lot for your response!
> 
> >what is the difference (in performance) to simply set -Xms==-Xmx
> here?
> This change assumes -Xms==-Xmx is not set. 
> 
> Please let me explain our situation. We have a real project where we
> need to run multiple Java processes per node with limited memory
> resource for job schedulers of parallel distributed computing
> framework such as Spark. Arbitrary Java processes actually need the
> Xmx heap, although the same JVM arguments are uniformly set for these
> job schedulers.

I am still trying to understand why in this situation the new
(additional) flag would be preferable to the mentioned alternative.

Maybe there is something about argument passing, but the description
seems to be a bit unclear.

Let me recap if I understood the problem and the need for this
solution correctly:

- there are at least two different kinds of VMs, job schedulers and the
big data processing worker VMs
- (assumption) the job schedulers and the worker VMs have different
memory requirements
- to ease VM management (assumption), both job schedulers and the
worker VMs need to be passed the same VM arguments?

So in your case you would add the new
-XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection to both, and
the worker VM would benefit from it, while the job scheduler would
never ever expand the heap anyway?

Otherwise, if you were able to pass different VM arguments to the
different VMs, the use of -Xms (instead of that new flag) would seem
straightforward to me (Only specifying -Xms will not actually commit
the memory, so there is no difference in actual memory use).
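
To check which heap sizes a given set of arguments actually results in, something like the following minimal sketch could be run with each variant (e.g. once with -Xms64m -Xmx3g and once with -Xms3g -Xmx3g). The class name HeapSizes and the helper describe() are made up for illustration. Note that "committed" here is the JVM's own accounting; it does not say how much physical memory the OS has already backed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the proposed change.
public class HeapSizes {
    // Formats the four values of a MemoryUsage in whole megabytes.
    static String describe(MemoryUsage u) {
        long mb = 1024L * 1024L;
        return "init=" + (u.getInit() / mb) + "M"
             + " used=" + (u.getUsed() / mb) + "M"
             + " committed=" + (u.getCommitted() / mb) + "M"
             + " max=" + (u.getMax() / mb) + "M";
    }

    public static void main(String[] args) {
        // Heap usage as seen by the running VM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe(heap));
    }
}
```

Comparing the "committed" value at startup and again after the workload has run should show whether the heap had to grow at all under the chosen settings.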

Particularly if, as you mention, full gc will not yield a significant
amount of freed memory, both methods seem to achieve the exact same
effect.

Or is there another difference between passing -Xms and passing
-XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection?

> Besides, only a limited number of objects are
> collected in the full GCs that occur during the heap expansion. So,
> full GC here is especially expensive.

Did you ever try G1 for these workloads? There are some (old) reports
[0] where G1 outperforms Parallel GC with some tuning.

It generally does not use full gcs to expand the heap.

With recent improvements in JDK9, it should perform even slightly
better, but I am not sure if Spark already works with JDK9.
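
For comparing collectors or flag combinations on such a workload, the collection counts can be read from the standard GarbageCollectorMXBean interface at the end of the run. The following is only a sketch; the class name GcCounts and the helper format() are made up for illustration, and the collector names that get printed depend on the GC in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the proposed change.
public class GcCounts {
    // Formats one collector's totals as a single line.
    static String format(String name, long count, long millis) {
        return name + ": count=" + count + " time=" + millis + "ms";
    }

    public static void main(String[] args) {
        // In a real measurement, call this loop after the workload has finished.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(format(gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
    }
}
```

Running the same workload under each configuration and comparing the count reported for the old-generation collector gives a rough idea of how many full collections each setting causes.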

> >And why not make the (first) full gc expand the heap more
> > aggressively?
> >(I think there is at least one way to do that, something like
> >Min/MaxFreeHeapRatio or so, I can look it up if needed).
> Thank you for telling the Min/MaxHeapFreeRatio. I think they surely
> help for our purpose, but I think this change would be still
> effective with them.
> 
> Best regards,

Thanks,
  Thomas

[0] https://databricks.com/blog/2015/05/28/tuning-java-garbage-collection-for-spark-applications.html

> --
> Michihiro,
> IBM Research - Tokyo
> 
> From: Thomas Schatzl <thomas.schatzl at oracle.com>
> To: Michihiro Horie <HORIE at jp.ibm.com>, hotspot-dev at openjdk.java.net
> Cc: Hiroshi H Horii <HORII at jp.ibm.com>
> Date: 2017/10/13 22:04
> Subject: Re: RFR(M) 8186834:Expanding old area without full GC in
> parallel GC
> 
> 
> 
> Hi,
> 
> On Tue, 2017-08-29 at 00:20 +0900, Michihiro Horie wrote:
> > Dear all,
> > 
> > Would you please review the following change?
> > bug: https://bugs.openjdk.java.net/browse/JDK-8186834
> > webrev: http://cr.openjdk.java.net/~mhorie/8186834/webrev.00/
> > 
> > In parallel GC, old area is expanded only after a full GC occurs.
> > I am wondering if we could give an option to expand old area without
> > full GC. So, I added an option
> > UseAdaptiveGenerationSizePolicyBeforeMajorCollection
> 
> Sorry for the late (and probably stupid) question, but what is the
> difference (in performance) to simply set -Xms==-Xmx here?
> 
> And why not make the (first) full gc expand the heap more
> aggressively?
> (I think there is at least one way to do that, something like
> Min/MaxFreeHeapRatio or so, I can look it up if needed).
> 
> Thanks,
>  Thomas
> 
> > Following is a simple micro benchmark I used to see the benefit of
> > this change.
> > As a result, pause time of full GC reduced by 30%. Full GC count
> > reduced by 54%.
> > Elapsed time reduced by 7%.
> > 
> > import java.util.HashMap;
> > import java.util.Map;
> > public class HeapExpandTest {
> >   static Map<Integer, byte[]> map = new HashMap<>();
> >   public static void main(String[] args) throws Exception {
> >     long start = System.currentTimeMillis();
> >     for (int i = 0; i < 2200; ++i) {
> >       map.put(i, new byte[1024*1024]); // 1MB
> >     }
> >     System.out.println("elapsed= " + (System.currentTimeMillis() -
> > start));
> >   }
> > }
> > 
> > JVM options: -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy
> > -XX:ParallelGCThreads=8 -Xms64m -Xmx3g
> > -XX:+UseAdaptiveGenerationSizePolicyBeforeMajorCollection
> 