RFC: Epsilon GC JEP

Tue Jul 18 12:45:25 UTC 2017

Hi Aleksey,

is there a reason not to do full GCs when memory runs out?

I can imagine scenarios where it could be useful to allow full-GCs:

1. Allow full-GCs only on System.gc()... for testing? Or for control
fanatics?
2. Allow full-GCs only on OOM... for containerized apps or as a replacement
for letting the process die and respawn (i.e. don't care at all about
pauses, but care about throughput and absolutely-no-barriers)
3. Allow full-GCs in both cases

I can see this enabled/disabled selectively by flags.

Yes, I know, complexity, maintenance, etc blah blah ;-) But it should be
very simple to do. Reusing markSweep.cpp should do it.

Basically serial GC without the generational barriers.
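
To make the three modes concrete, here is a toy model in Java. It is only a
sketch of the policy, not HotSpot code: the class names, the two boolean
switches and the byte-counting "heap" are all hypothetical, and the "full GC"
is reduced to dropping everything that was not marked as live at allocation.

```java
/** Toy model of the proposed policy. Not HotSpot code; all names are hypothetical. */
public class EpsilonPolicySketch {

    /** Hypothetical switches standing in for the suggested flags (not real HotSpot flags). */
    static final class Flags {
        final boolean fullGcOnSystemGc;
        final boolean fullGcOnOom;

        Flags(boolean fullGcOnSystemGc, boolean fullGcOnOom) {
            this.fullGcOnSystemGc = fullGcOnSystemGc;
            this.fullGcOnOom = fullGcOnOom;
        }
    }

    /** A heap that only counts bytes; there are no barriers on the allocation path. */
    static final class ToyHeap {
        private final long capacity;
        private final Flags flags;
        private long used;        // bytes handed out since the last full GC, plus live bytes
        private long live;        // bytes a full GC cannot reclaim
        private int fullGcCount;

        ToyHeap(long capacity, Flags flags) {
            this.capacity = capacity;
            this.flags = flags;
        }

        /** Returns false where a real VM would report OutOfMemoryError. */
        boolean allocate(long bytes, boolean staysLive) {
            if (used + bytes > capacity && flags.fullGcOnOom) {
                fullGc();         // mode 2: last-ditch collection instead of dying
            }
            if (used + bytes > capacity) {
                return false;
            }
            used += bytes;
            if (staysLive) {
                live += bytes;
            }
            return true;
        }

        /** Mode 1: an explicit request collects only if the switch is on. */
        void systemGc() {
            if (flags.fullGcOnSystemGc) {
                fullGc();
            }
        }

        private void fullGc() {
            used = live;          // stand-in for a mark-compact pass
            fullGcCount++;
        }

        long used() { return used; }
        int fullGcCount() { return fullGcCount; }
    }

    public static void main(String[] args) {
        ToyHeap pureEpsilon = new ToyHeap(100, new Flags(false, false));
        pureEpsilon.allocate(60, false);
        System.out.println(pureEpsilon.allocate(60, false)); // heap exhausted, no GC

        ToyHeap gcOnOom = new ToyHeap(100, new Flags(false, true));
        gcOnOom.allocate(60, false);
        System.out.println(gcOnOom.allocate(60, false));     // full GC frees the garbage first
    }
}
```

With both switches off this behaves like plain Epsilon; turning on both gives
mode 3 from the list above.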

What do you think?

Roman

On 18.07.2017 at 13:23, Aleksey Shipilev wrote:
> Hi Erik,
>
> Thanks for looking into this!
>
> On 07/18/2017 12:09 PM, Erik Helin wrote:
>> first of all, thanks for trying this out and starting a discussion. Regarding
>> the JEP, I have a few questions/comments:
>> - the JEP specifies "last-drop performance improvements" as a
>>   motivation. However, I think you also know that taking a pause and
>>   compacting a heap that is mostly filled with garbage most likely
>>   results in higher throughput*. So are you thinking in terms of pauses
>>   here when you say performance?
> This cuts both ways: while it is true that moving GC improves locality [1], it
> is also true that the runtime overhead from barriers can be quite high [2, 3,
> 4]. So, "performance" in that section is tied to both throughput (no barriers)
> and pauses (no pauses).
>
> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality
> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers
> [3] Also, remember the reason for UseCondCardMark
> [4] Also, remember the whole thing about G1 barriers
>
>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is
>>   not the perfect baseline, since it is just a theoretical exercise.
>>   Just cranking up the heap and using Serial is more realistic
>>   baseline, but even using that as a baseline is questionable.
> It sometimes is. Non-generational GC is a good baseline for some workloads. Even
> Serial does not cut it, because even if you crank up old and trim down young,
> there is no way to disable the reference-store write barrier that maintains card tables.
>
>> - the JEP specifies this as an experimental feature, meaning that you
>>   intend non-JVM developers to be able to run this. Have you considered
>>   the cost of supporting this option? You say "New jtreg tests under
>>   hotspot/gc/epsilon would be enough to assert correctness". For which
>>   platforms? How often should these tests be run, every night? 
> I think for all platforms, somewhere in hs-tier3? IMO, current test set in
> hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my
> 4-core i7.
>
>> Whenever we want to do large changes, like updating logging, tracing, etc, 
>> will we have to take Epsilon GC into account? Will there be serviceability
>> support for Epsilon GC, like jstat, MXBeans, perf counters etc?
> I tried to address the maintenance costs in the JEP. It is unlikely to cause
> trouble, since it mostly calls into the shared code. And GC interface work would
> hopefully make BarrierSet into a more shareable chunk of interface, which makes
> the whole thing even more self-contained. There is some new code in MemoryPools
> that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean
> that reports allocation pressure, although I'd need to add a test to assert that.
>
> To me, if the no-op GC requires much maintenance whenever something in the JVM is
> changing, that points to the insanity of the GC interface. No-op GC is a good canary
> in the coalmine for this. This is why one of the motivations is seeing what
> exactly a minimal GC should support to be functional.
>
>
>> - You quote "The experience, however, tells that many players in the
>>   Java ecosystem already did this exercise with expunging GC from their
>>   custom-built JVMs". So it seems that those users that want something
>>   like Epsilon GC are fine with building OpenJDK themselves? Having
>>   -XX:+UseEpsilonGC as a developer flag is much different compared to
>>   exposing it (and supporting, even if in experimental mode) to users.
> There is a fair share of survivorship bias: we know about people who succeeded,
> do we know how many failed or gave up? I think developers who do day-to-day
> Hotspot development grossly underestimate the effort required to even build a
> custom JVM. Most power users I know have done this exercise with great pains. I
> used to sing the same song to them: just build OpenJDK yourself, but then pesky
> details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType,
> oh new compilers that build OpenJDK with warnings while the build treats warnings
> as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc.
> etc. As much as the OpenJDK build has improved over the years, I am not audacious enough
> to claim it would ever be a completely smooth experience :) Now I just
> willingly hand them binary builds.
>
> So I think having the experimental feature available in the actual product build
> extends the feature exposure. For example, suppose you are an academic writing
> a paper on GC, would you accept a custom-built JVM into your results, or would you
> rather pick up the "gold" binary build from a standard distribution and run with it?
>
>
>> I guess most of my questions can be summarized as: this seems like it could
>> perhaps be a useful tool for JVM GC developers, why do you want to expose the flag
>> to non-JVM developers (given all the work/support/maintenance that comes with
>> that)?
> My initial thought was that the discussion about the costs should involve
> discussing the actual code. This is why there is a complete implementation in
> the Sandbox, and also the webrev posted.
>
> In the months following my initial (crazy) experiments, I had multiple people
> coming to me and asking when Epsilon is going to be in JDK, because they want to
> use it. And those were the ultra-power-users who actually know what they are
> doing with their garbage-free applications.
>
> So the short answer about why Epsilon is good to have in product is that the
> cost seems low, the benefits are present, and so the cost/benefit ratio is low.
>
>
>> It is _great_ that you are experimenting and trying out new ideas in the VM,
>> please continue doing that! Please don't interpret my questions/comments as
>> too grumpy, this is just my experience from maintaining 5-6 different GC
>> algorithms for more than five years speaking. There is _always_ a
>> maintenance cost :)
> Yeah, I know how that feels. Look at the actual Epsilon changes, do they look
> scary to you, given your experience maintaining the related code?
>
> Thanks,
> -Aleksey
>