EpsilonGC and throughput.

Thomas Schatzl thomas.schatzl at oracle.com
Mon Jan 8 16:45:04 UTC 2018


Hi Aleksey,

  I apologize for my somewhat inappropriate words, which were due to
some frustration, and for the long delay, which was due to the winter
holidays.

Let's try to start all over with this... I will try to be constructive
this time. Feel free to remind me if needed.

One purpose of the JEP is to share a problem and propose an idea (often
already accompanied by a solution) to solve it. The problem and the
idea are then discussed by the community, which eventually refines them
along the way.

The community then evaluates that idea based on its contents, of course
starting with the people trying to determine whether there is a
problem, what the problem is, and whether the proposed idea will fix
the problem.

For this evaluation to happen, the JEP needs to clearly state the
problem, its seriousness, and the proposed idea.

It also helps if the JEP is written in a way that makes it interesting
for the community to read and respond to. The less thinking readers
have to do to decide whether they are impacted, and whether and by how
much the change would simplify their own lives or those of Java users
in general, the more people will feel urged to get this in (or at
least will not be deterred).

Finally, I assume you understand that, although there is always a
certain level of duplication in the VM, a change is a hard(er) sell if
it only solves problems that existing code already solves, solves
problems almost nobody has, or does not give enough benefit (which
also depends on the complexity of the change)?

So the JEP template (http://openjdk.java.net/jeps/2) provides some
questions on how to structure this idea proposal and what to put into
the various sections.

In general this is to help you provide the relevant information to the
community. While this might seem onerous for a writer at first glance,
it saves everyone else lots of time trying to find out what you want
to solve and how.


I am going over the Motivation section in detail in the remainder of
this email, with some comments at the end about the Alternatives
section; these two sections seem to be the most important here.

The JEP template states under the Motivation section:

"Motivation
----------

// Why should this work be done?  What are its benefits?  Who's asking
// for it?  How does it compare to the competition, if any?"


Now let me try to associate these questions with the relevant parts of
the existing JEP 318 (http://openjdk.java.net/jeps/318) text.

And please, before reading below: I really do not want to shoot down
the proposal when you see a question mark. It just indicates a
question to which I honestly do not know the answer, but which I hope
you do. Similarly, if I raise concerns about some statements, I expect
you to take that as a sign that something may be missing here, nothing
else; i.e. not necessarily that I am "right" about something. You said
you have already talked about it many times with other people in the
field and thought it over for a long time, so hopefully these questions
can be answered quickly, and in the future the JEP will also contain
this information for other people.

Some may not need an answer as they only try to make you think about
the seriousness of a stated problem.

JEP text: "Java implementations are well known for a broad choice of
highly configurable GC implementations."

Potential answer to "Why should this work be done?". Or does the
sentence indicate we need another GC because we already have so many,
and another does not hurt? I am asking this in full seriousness, I
really do not know. Or is this only an introductory sentence without
meaning?

JEP text: "There are four use cases where a trivial no-op GC proves
useful."

This seems to be a transition sentence, which is fine, as it makes the
text flow better.

Reading this, and given that only a list of benefits follows, I assume
that these two sentences were supposed to answer the "Why should this
work be done? Who's asking for it?" questions from the JEP.

In the earlier email you mentioned these power users that want full
control. Mention them here. Define them. Also mention other user groups
that might be interested. Particularly groups the benefits list could
refer to.

Let's go into these benefits in more detail:

JEP text: "Performance testing. Having a GC that does almost nothing is
a useful tool to do differential performance analysis for other, real
GCs. Having a no-op GC can help to filter out GC-induced performance
artifacts."

Benefit. Maybe it would be useful to list a few of these performance
artifacts here ("... , e.g. barrier code, concurrent threads").

Who are the beneficiaries of this? I am not sure about these "power
users" (see M. Berger's response in this exact thread). Probably
developers of new GC algorithms?

An alternative could be a developer just nop'ing out the relevant GC
interface section. That is somewhat cumbersome, but for how many users
is this a problem? Spell that out in the appropriate Alternatives
section.

Also mention that Epsilon GC may not be an ideal tool for barrier
testing: all other existing collectors are generational (in the future
it might be applicable to Shenandoah, unless that goes generational
too, idk), and testing generational barriers on a non-generational heap
may not give a complete picture of barrier overhead.

JEP text: "Functional testing. For Java code testing, a way to
establish a threshold for allocated memory is useful to assert memory
pressure invariants. Today, we have to pick up the allocation data from
MXBeans, or even resort to parsing GC logs. Having a GC that accepts
only the bounded number of allocations, and fails on heap exhaustion,
simplifies testing."

Benefit. For regression testing, in how many cases (or in what
circumstances) do you think it is sufficient to get only a fail/no-fail
answer?
On a failure, this seems to pass the work on to the developer, who then
needs to write another test that prints and monitors memory usage over
time anyway.
Given that you already need to monitor memory usage, how much
additional work is it to make the test fail when heap usage goes above
a threshold?
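
To make that question concrete, here is a minimal sketch of the
MXBean-based approach the JEP text refers to. It assumes a HotSpot-style
JVM that exposes com.sun.management.ThreadMXBean; the names
AllocationBudget, allocatedBytes and assertWithinBudget are hypothetical
and exist only for this illustration.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: fails a test when a task allocates more than a budget.
final class AllocationBudget {

    // Returns the bytes allocated by the current thread while running the
    // task, or -1 if this JVM cannot report per-thread allocation.
    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1; // assumption: only HotSpot-style JVMs provide this bean
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported()) {
            return -1;
        }
        if (!bean.isThreadAllocatedMemoryEnabled()) {
            bean.setThreadAllocatedMemoryEnabled(true);
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        task.run();
        return bean.getThreadAllocatedBytes(id) - before;
    }

    // Throws AssertionError if the task allocates more than budgetBytes.
    // Silently passes when the measurement is unsupported.
    static void assertWithinBudget(Runnable task, long budgetBytes) {
        long used = allocatedBytes(task);
        if (used >= 0 && used > budgetBytes) {
            throw new AssertionError(
                "allocated " + used + " bytes, budget was " + budgetBytes);
        }
    }
}
```

A test would call something like
AllocationBudget.assertWithinBudget(() -> codeUnderTest(), 64 * 1024),
where codeUnderTest() stands for whatever is being checked. This only
counts allocation in the calling thread, so it is not a full substitute
for a bounded heap.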

JEP text: "VM interface testing. For VM development purposes, having a
simple GC helps to understand the absolute minimum required from the
VM-GC interface to have a functional allocator. This serves as proof
that the VM-GC interface is sane, which is important in lieu of JEP 304
("Garbage Collector Interface")."

Benefit. Who are the (main) beneficiaries of that - probably
developers? For a developer, how large is that benefit if there are
already 5 or 6 implementations of that interface?

JEP text: "Last-drop performance improvements. For
ultra-latency-sensitive applications, where developers are conscious
about memory allocations and know the application memory footprint
exactly, or even have (almost) completely garbage-free applications. In
those applications, GC cycles may be considered an implementation bug
that wastes CPU cycles for no good reason."

This is the only benefit in this list that actually mentions its target
group. I assume it is those power users (not necessarily developers
only?) who are ultra-latency aware. This paragraph further
characterizes them as throughput conscious.
The earlier discussion also characterized them as very conscious about
memory layout etc.; they do not want object reordering because it is
inconsistent between GCs (which is a different issue, and I do not want
to discuss it here).

From what I gathered so far, they want absolute control over memory
management - but the real question is whether this is their real or
only problem with the Java VM in achieving consistent VM behavior.
There are certainly more components in the VM that introduce
potentially more significant jitter (assuming that such a power user
can set heap sizes appropriately to use e.g. Serial GC).

This execution consistency is maybe another goal that is even more
important than last-drop performance.

It may be useful to investigate the problem of these power users in
more detail, and see if we could provide a (more?) complete solution
for them.


JEP text: "Extremely short lived jobs are one example of this."

I do not understand the use of Epsilon in such a use case. The
alternative I can see would be to restart the VM after every short
lived job (something for the Alternatives section). That seems strange
to me: depending on the definition of a "short lived job", and
particularly if nothing survives the execution of that job, a GC will
be extremely fast.

Further, I assume this example is about FaaS (Function-as-a-Service)
and its users. While there may be an overlap with those "power users",
I would expect "regular Java users" to be a far larger group here, and
power users probably would not want to incur the associated loss of
control.

JEP text: "There are also cases when restarting the JVM -- letting load
balancers figure out failover -- is sometimes a better recovery
strategy than accepting a GC cycle."

I really can't find a good example, particularly in the situation
described so far and also for these short-lived jobs, where a GC (on an
almost empty heap) is not at least as fast as a restart.

Explaining this use case would make for a very good paragraph in the
Alternatives section.

Another problem with these two sentences to me is (and I am by no means
a "FaaS power user") that I believe that waiting for the VM to
crash/shut down to steer the load balancers is not a good strategy.
Maybe you can give some more information about this use case?

JEP text: "Even for non-allocating workloads, the choice of GC means
choosing the set of GC barriers that the workload has to use, even if
no GC cycle actually happens. Most JDK GCs are generational, and they
emit at least one reference write barrier. Avoiding this barrier brings
the last bit of performance improvement."

(_All_ JDK GCs are currently generational)
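
As a rough illustration of what such a reference write barrier does,
here is a toy model of a card-marking post-write barrier, one common
barrier style for generational collectors. This is not HotSpot code:
the class CardMarkingSketch, the 512-byte card size and the 8-byte slot
size are all assumptions made for the sketch.

```java
// Toy model: a "heap" of reference slots plus a card table that records
// which regions of the heap have had a reference stored into them.
final class CardMarkingSketch {
    static final int CARD_SHIFT = 9;  // assumed card size: 2^9 = 512 bytes
    static final int SLOT_BYTES = 8;  // assumed size of one reference slot

    private final Object[] slots;
    private final byte[] cards;

    CardMarkingSketch(int slotCount) {
        slots = new Object[slotCount];
        // One card per 512 bytes of simulated heap, rounded up.
        cards = new byte[(slotCount * SLOT_BYTES + (1 << CARD_SHIFT) - 1) >> CARD_SHIFT];
    }

    // What a generational collector asks of every reference store:
    // the store itself plus one extra write that dirties the card.
    void storeWithBarrier(int slot, Object ref) {
        slots[slot] = ref;
        cards[(slot * SLOT_BYTES) >> CARD_SHIFT] = 1;
    }

    // What a no-op collector could get away with: just the store.
    void storeWithoutBarrier(int slot, Object ref) {
        slots[slot] = ref;
    }

    boolean isDirty(int card) {
        return cards[card] != 0;
    }
}
```

The extra card write in storeWithBarrier is the kind of per-store cost
the JEP text is talking about; how much it matters in practice is
exactly what the numbers asked for below should show.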

Now, as mentioned earlier in the thread, when talking about performance
improvements, it would be nice to mention the potential gains that can
be made (or elsewhere, like in the alternatives section). There is
already an implementation, and so you can measure this too.

Please make your comparison in context: since this whole paragraph is
about last-drop performance improvements for power users, a balanced
comparison would probably only be one that such a power user would
make - i.e. not running the VM with randomly selected default options
that arbitrarily penalize your competition.

In the earlier email I directly asked for performance numbers only in
order to streamline this discussion; given that you are a well-known
performance and benchmark guru (afaik you were a "R"eviewer long before
I joined), it seemed a logical request. If you can't find numbers,
there is also the reference I got ("Barriers, Friendlier Still" or so,
from Blackburn et al. I think), which iirc is also mentioned in the
very good Jones GC book.
"Real" newbies I would just ask to perform this test.


In our discussion we found at least one more, actually unique benefit
(the one about getting correct heap dumps on failure).


Of course there is a limit on the length of that section and others
(i.e. considering the attention span of your readership), but all
questions asked by the JEP template should be answered in the
corresponding section. There is some intentional overlap in the JEP,
particularly in the first three sections, similar to a scientific paper
so that different groups of readers need only read the sections they
are interested in to see whether this change is actually affecting them
(and interesting to follow).

It shouldn't be as long as a scientific paper though, so if you think a
section is too long, drop the less impactful benefits, and other parts
of the JEP will automatically follow.

Again, given your experience with the VM I assume you know the
alternatives as well as or even better than I do, so you can make a
balanced assessment here.

Otherwise, keep them and please raise specific questions.

As for the Alternatives section, the procedure is the same: start by
answering the questions raised in the template:

"Alternatives
------------

// Did you consider any alternative approaches or technologies?  If so
// then please describe them here and explain why they were not 
// chosen."

I would assume that for all of these benefits we can easily come up
with alternative ways of doing the same or a similar thing (I already
stated a few alternatives that I think are very valid in this or
previous emails; some valid ones are already in the JEP). For each,
explain why we would particularly want to do it this way given the
context of that benefit (e.g. the user group). If there is no
alternative, add a sentence that says so in that section.

Again, try to make this review of alternatives balanced, and in the
context of the users the benefit is for.

This section should imho also include a discussion of "mostly complete
alternatives", as suggested in this email thread already, e.g. adding a
-XX:+DieOnFirstGC switch, and reasons for and against it.

Please understand that the JEP will be the reference to talk about, not
some email or private offline discussions. Keeping that in mind I think
discussions will go much smoother.

I hope I have now made clear why I suggested, unfortunately not in a
very friendly way (apologies again), that the current JEP text lacks
the required answers to the questions stated in the JEP template, in
order to (re-)start a hopefully more focused discussion.

Thanks,
  Thomas



