RFC: Epsilon GC JEP

Tue Jul 18 13:34:41 UTC 2017

Hi Aleksey,

  I would like to expand this cost/benefit analysis a bit; I think the
most contentious point brought up by Erik has been the develop vs.
experimental flag issue.

For that, let me present my understanding of the size of the intended
target group as presented here, and of the costs of making this an
experimental (actually product) vs. develop flag.

On Tue, 2017-07-18 at 13:23 +0200, Aleksey Shipilev wrote:
> Hi Erik,
> 
> Thanks for looking into this!
> 
> On 07/18/2017 12:09 PM, Erik Helin wrote:
> > 
> > first of all, thanks for trying this out and starting a discussion.
> > Regarding the JEP, I have a few questions/comments:
[...]
> 
> > - why do you think Epsilon GC is a good baseline? IMHO, no barriers
> > is not the perfect baseline, since it is just a theoretical
> > exercise. Just cranking up the heap and using Serial is more
> > realistic   baseline, but even using that as a baseline is
> > questionable.
> It sometimes is. Non-generational GC is a good baseline for some
> workloads. Even Serial does not cut it, because even if you crank up
> old and trim down young, there is no way to disable reference write
> barrier store that maintains card tables.

Not prevented by making it a develop option.

> > - the JEP specifies this as an experimental feature, meaning that
> > you intend non-JVM developers to be able to run this. Have you
> > considered the cost of supporting this option? You say "New jtreg
> > tests under hotspot/gc/epsilon would be enough to assert
> > correctness". For which platforms? How often should these tests be
> > run, every night? 
> I think for all platforms, somewhere in hs-tier3? IMO, current test
> set in hotspot/gc/epsilon is fairly complete, and it takes less than
> a minute on my 4-core i7.

Running it daily on X platforms, on Y OSes, for Z releases adds up
quickly. We could run something else instead. And there is always
something else to run on these machines, trust me. :)

> >
> > Whenever we want to do large changes, like updating logging,
> > tracing, etc, will we have to take Epsilon GC into account? Will
> > there be serviceability support for Epsilon GC, like jstat,
> > MXBeans, perf counters etc?
> I tried to address the maintenance costs in the JEP? It is unlikely
> to cause trouble, since it mostly calls into the shared code. And GC
> interface work would hopefully make BarrierSet into more shareable
> chunk of interface, which makes the whole thing even more self-
> contained. There is some new code in MemoryPools that handles the
> minimal diagnostics. MXBeans still work, at least ThreadMXBean
> that reports allocation pressure, although I'd need to add a test to
> assert that.
> 
> To me, if the no-op GC requires much maintenance whenever something
> in JVM is changing, that points to the insanity of GC interface. No-
> op GC is a good canary in the coalmine for this. This is why one of
> the motivations is seeing what exactly a minimal GC should support to
> be functional.

Sanity checking of the interfaces is not prevented by a develop option.

> > 
> > - You quote "The experience, however, tells that many players in
> > the Java ecosystem already did this exercise with expunging GC from
> > their custom-built JVMs". So it seems that those users that want
> > something like Epsilon GC are fine with building OpenJDK
> > themselves? Having -XX:+UseEpsilonGC as a developer flag is much
> > different compared to exposing it (and supporting, even if in
> > experimental mode) to users.
>
> There is a fair share of survivorship bias: we know about people who
> succeeded, do we know how many failed or given up? I think developers
> who do day-to-day Hotspot development grossly underestimate the
> effort required to even build a custom JVM. Most power users I know
> have did this exercise with great pains. I used to sing the same song
> to them: just build OpenJDK yourself, but then pesky details pour in.
> Like: oh, Windows, oh, Cygwin, oh MacOS, oh XCode, oh FreeType, oh
> new compilers that build OpenJDK with warnings and build does treat
> warnings as errors, oh actual API mismatches against msvcrt, glibc,
> whatever, etc. etc. etc. As much as OpenJDK build improved over the
> years, I am not audacious enough to claim it would ever be a
> completely smooth experience :) Now I am just willingly hand them
> binary builds.
> 
> So I think having the experimental feature available in the actual
> product build extends the feature exposure.

I agree here.

The question is, by how much. So academics (and I am not trying to pick
on academics here, you brought them up ;)) who write a paper on GC but
never need to rebuild the VM (including the JDK here) because they
don't make any changes would be inconvenienced.

Let me ask, how many do you expect these to be? From my understanding
there seems to be a very manageable yearly total GC paper output at the
usual conferences. Not sure how putting Epsilon GC in product would
improve that.

So, even after all these target group concerns, how much time do you
think the people writing such a paper (who do not need to recompile the
VM and need to show their numbers with Epsilon GC) are going to spend
on getting numbers, compared to the hypothetical time for compiling the
VM?

[My personal experience is that when developing any changes, by far
most of the time is spent waiting for the machine(s) to complete
testing, not writing the actual changes or building. When writing a
paper, my experience is that a very large part of the time is spent
running and re-running tests over and over again to be able to
understand and explain results, or tweaking changes, or simply fixing
bugs for some results.]

> For example, suppose you are the academic writing a paper on GC,
> would you accept custom-build JVM into your results, or would you
> rather pick up the "gold" binary build from a standard distribution
> and run with it?

Not sure what you meant with this latter argument, if it is actually an
argument. If I wanted to effect a change in the VM and measure it, I
would already need to change and recompile the VM. So it is not a big
stretch to imagine that baselines could come from something recompiled.
I have seen quite a few papers using modified baselines for one reason
or another (like adding necessary instrumentation, maybe fixing
obvious bugs).

From experience I know that, for many reasons, it is already often
basically impossible for somebody else to reproduce particular results
without extreme effort. Even understanding some baseline results may
require some imagination about how they were obtained, not to mention
reproducing them. So there seems to be a very small step from trusting
results from a "gold" official binary to trusting a slightly modified
one.

As for the amount of inconvenience, I think the users that already need
to recompile for their changes are not very much inconvenienced. I.e.
changing a single "develop" to "product" seems to be a very small
effort.

> > I guess most of my question can be summarized as: this seems like
> > it perhaps could be useful tool for JVM GC developers, why do you
> > want to expose the flag to non-JVM developers (given all the
> > work/support/maintenance that comes with that)?
> My initial thought was that the discussion about the costs should
> involve discussing the actual code. This is why there is a complete
> implementation in the Sandbox, and also the webrev posted.
> 
> In the months following my initial (crazy) experiments, I had
> multiple people coming to me and asking when Epsilon is going to be
> in JDK, because they want to use it. And those were the ultra-power-
> users who actually know what they are doing with their garbage-free
> applications.

Aren't ultra-power-users able to rebuild the VM? What is their cost vs.
the effort spent on making their applications garbage-free or
implementing the necessary workarounds to be able to use that GC
(the mentioned load-balancer trickery, etc.)?

> So the short answer about why Epsilon is good to have in product is
> because the cost seems low, the benefits are present, and so
> cost/benefit is still low.

The number of people benefitting from having this available in a
product build seems to be extremely small. And so do their relative
costs of working around its absence.

Increased exposure seems to be a real recurring cost for maintenance in
the product, although it seems relatively small compared to other
features. Still somebody has to do it.

> > It is _great_ that you are experimenting and trying out new ideas
> > in the VM, please continue doing that! Please don't interpret my
> > questions/comments as to grumpy, this is just my experience from
> > maintaining 5-6 different GC algorithms for more than five years
> > that is speaking. There is _always_ a maintenance cost :)
> Yeah, I know how that feels. Look at the actual Epsilon changes, do
> they look scary to you, given your experience maintaining the related
> code?

Well, 1500 LOC (well, ~800 without the tests) of changes do look scary
to me, whatever they do :)

Overall, on the question of develop vs. experimental option, I would
tend to prefer a develop option.
In this area there simply seem to be too many downsides compared to the
upsides for an extremely limited user group.

Thanks,
  Thomas