RFC: Epsilon GC JEP

Tue Jul 18 12:37:19 UTC 2017

On 07/18/2017 01:23 PM, Aleksey Shipilev wrote:
> Hi Erik,
>
> Thanks for looking into this!
>
> On 07/18/2017 12:09 PM, Erik Helin wrote:
>> first of all, thanks for trying this out and starting a discussion. Regarding
>> the JEP, I have a few questions/comments:
>> - the JEP specifies "last-drop performance improvements" as a
>>   motivation. However, I think you also know that taking a pause and
>>   compacting a heap that is mostly filled with garbage most likely
>>   results in higher throughput*. So are you thinking in terms of pauses
>>   here when you say performance?
>
> This cuts both ways: while it is true that moving GC improves locality [1], it
> is also true that the runtime overhead from barriers can be quite high [2, 3,
> 4]. So, "performance" in that section is tied to both throughput (no barriers)
> and pauses (no pauses).
>
> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality
> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers
> [3] Also, remember the reason for UseCondCardMark
> [4] Also, remember the whole thing about G1 barriers

Absolutely, barriers can come with an overhead. But a barrier that 
consists of dirtying a card does not come with a quite high overhead. In 
fact, it comes with a very low overhead :)
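
To make the barrier under discussion concrete, here is a toy model of a
card-marking post-write barrier. It is only a sketch, not HotSpot's actual
code: the class name, the CLEAN/DIRTY byte values, and the use of plain heap
offsets are assumptions made for illustration, and the 512-byte card size is
assumed to match the usual default. The conditional variant is in the spirit
of what UseCondCardMark (mentioned in [3]) enables.

```java
public class CardTableSketch {
    // Assumption: 512-byte cards, i.e. a shift of 9.
    static final int CARD_SHIFT = 9;
    // Hypothetical encodings, chosen for readability only.
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    final byte[] cards;

    CardTableSketch(long heapBytes) {
        // One byte of card table per 512 bytes of heap, rounded up.
        long cardBytes = 1L << CARD_SHIFT;
        cards = new byte[(int) ((heapBytes + cardBytes - 1) >>> CARD_SHIFT)];
    }

    static int cardIndex(long heapOffset) {
        return (int) (heapOffset >>> CARD_SHIFT);
    }

    // Unconditional barrier: one extra byte store per reference write.
    void postWrite(long fieldOffset) {
        cards[cardIndex(fieldOffset)] = DIRTY;
    }

    // Conditional barrier: load and test first, store only if not yet dirty.
    // This trades an extra load and branch for fewer writes to shared lines.
    void postWriteConditional(long fieldOffset) {
        int i = cardIndex(fieldOffset);
        if (cards[i] != DIRTY) {
            cards[i] = DIRTY;
        }
    }

    int dirtyCount() {
        int n = 0;
        for (byte c : cards) {
            if (c == DIRTY) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096); // 8 cards
        table.postWrite(0);              // card 0
        table.postWrite(511);            // still card 0
        table.postWriteConditional(512); // card 1
        System.out.println("dirty cards: " + table.dirtyCount());
    }
}
```

A collector without generations (or without any collection at all, like
Epsilon) never needs to scan such a table, which is why it can drop this
store from every reference write.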

>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is
>>   not the perfect baseline, since it is just a theoretical exercise.
>>   Just cranking up the heap and using Serial is a more realistic
>>   baseline, but even using that as a baseline is questionable.
>
> It sometimes is. Non-generational GC is a good baseline for some workloads. Even
> Serial does not cut it, because even if you crank up old and trim down young,
> there is no way to disable the reference write barrier store that maintains card tables.

I will still point out though that a GC without a barrier is still just 
a theoretical baseline. One could imagine a single-gen mark-compact GC 
for OpenJDK (that would require no barriers), but AFAIK almost all users 
prefer the slight overhead of dirtying a card (and in return get a 
generational GC) for the use cases where a single-gen mark-compact 
algorithm would be applicable.

>> - the JEP specifies this as an experimental feature, meaning that you
>>   intend non-JVM developers to be able to run this. Have you considered
>>   the cost of supporting this option? You say "New jtreg tests under
>>   hotspot/gc/epsilon would be enough to assert correctness". For which
>>   platforms? How often should these tests be run, every night?
>
> I think for all platforms, somewhere in hs-tier3? IMO, the current test set in
> hotspot/gc/epsilon is fairly complete, and it takes less than a minute on my
> 4-core i7.
>
>> Whenever we want to do large changes, like updating logging, tracing, etc,
>> will we have to take Epsilon GC into account? Will there be serviceability
>> support for Epsilon GC, like jstat, MXBeans, perf counters etc?
>
> I tried to address the maintenance costs in the JEP. It is unlikely to cause
> trouble, since it mostly calls into the shared code. And GC interface work would
> hopefully make BarrierSet into a more shareable chunk of interface, which makes
> the whole thing even more self-contained. There is some new code in MemoryPools
> that handles the minimal diagnostics. MXBeans still work, at least ThreadMXBean
> that reports allocation pressure, although I'd need to add a test to assert that.
>
> To me, if the no-op GC requires much maintenance whenever something in the JVM is
> changing, that points to the insanity of GC interface. No-op GC is a good canary
> in the coalmine for this. This is why one of the motivations is seeing what
> exactly a minimal GC should support to be functional.

Again, our opinions differ on this. Am I all for changing the GC 
interface? Yes, I have expressed nothing but full support of the great 
work that Roman is doing. Do I think we need something like a canary in 
the coalmine for JVM-internal, GC-internal code? No. If you want 
anything resembling a canary, write a unit test using googletest that 
exercises the interface.

However, again, this might be useful for someone who wants to try making 
some changes to the JVM GC code. But that, to me, is not enough to 
expose it to non-JVM developers. It could be useful to have in the 
source code though, maybe like a --with-jvm-feature kind of thing?

>> - You quote "The experience, however, tells that many players in the
>>   Java ecosystem already did this exercise with expunging GC from their
>>   custom-built JVMs". So it seems that those users that want something
>>   like Epsilon GC are fine with building OpenJDK themselves? Having
>>   -XX:+UseEpsilonGC as a developer flag is much different compared to
>>   exposing it (and supporting, even if in experimental mode) to users.
>
> There is a fair share of survivorship bias: we know about people who succeeded,
> do we know how many failed or gave up? I think developers who do day-to-day
> Hotspot development grossly underestimate the effort required to even build a
> custom JVM. Most power users I know did this exercise with great pain. I
> used to sing the same song to them: just build OpenJDK yourself, but then pesky
> details pour in. Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType,
> oh new compilers that build OpenJDK with warnings and the build does treat warnings
> as errors, oh actual API mismatches against msvcrt, glibc, whatever, etc. etc.
> etc. As much as the OpenJDK build improved over the years, I am not audacious enough
> to claim it would ever be a completely smooth experience :) Now I just
> willingly hand them binary builds.

Such users will still be able to get binary builds if someone is willing 
to produce them with Epsilon GC. There are plenty of OpenJDK binary 
builds available from various organizations/companies.

> So I think having the experimental feature available in the actual product build
> extends the feature exposure. For example, suppose you are an academic writing
> a paper on GC, would you accept a custom-built JVM into your results, or would you
> rather pick up the "gold" binary build from a standard distribution and run with it?

I guess such a researcher would be producing a build from the same source 
as the one they made changes to? How could they otherwise do any kind of 
reasonable comparison?

>> I guess most of my question can be summarized as: this seems like it perhaps
>> could be useful tool for JVM GC developers, why do you want to expose the flag
>> to non-JVM developers (given all the work/support/maintenance that comes with
>> that)?
>
> My initial thought was that the discussion about the costs should involve
> discussing the actual code. This is why there is a complete implementation in
> the Sandbox, and also the webrev posted.
>
> In the months following my initial (crazy) experiments, I had multiple people
> coming to me and asking when Epsilon is going to be in JDK, because they want to
> use it. And those were the ultra-power-users who actually know what they are
> doing with their garbage-free applications.
>
> So the short answer about why Epsilon is good to have in product is because the
> cost seems low, the benefits are present, and so cost/benefit is still low.

And it is here that our opinions differ :) For you the maintenance cost 
is low, whereas for me, having yet another command-line flag, yet 
another code path, gets in the way. You have to respect that we have 
different backgrounds and experiences here.

>> It is _great_ that you are experimenting and trying out new ideas in the VM,
>> please continue doing that! Please don't interpret my questions/comments as
>> too grumpy, this is just my experience from maintaining 5-6 different GC
>> algorithms for more than five years that is speaking. There is _always_ a
>> maintenance cost :)
>
> Yeah, I know how that feels. Look at the actual Epsilon changes, do they look
> scary to you, given your experience maintaining the related code?

I don't like taking the role of the grumpy open source maintainer :) No, 
the code is not scary, code is rarely scary IMO, it is just code. 
Running tests, fixing things so that a -Xmx1g test isn't run on an RPi, 
having additional code paths, more cases to take into consideration when 
refactoring, is burdensome. And to me, the benefit of benchmarking 
against Epsilon vs benchmarking against Serial/Parallel isn't that high.

But, I can understand that it is useful when trying to evaluate for 
example the cost of stores into a HashMap. Which is why I'm not against 
the code, but I'm not keen on exposing this to non-JVM developers.

Thanks,
Erik

> Thanks,
> -Aleksey
>