RFC: Epsilon GC JEP

Tue Jul 18 13:26:03 UTC 2017

On 07/18/2017 02:37 PM, Erik Helin wrote:
>> [1] https://shipilev.net/jvm-anatomy-park/11-moving-gc-locality
>> [2] https://shipilev.net/jvm-anatomy-park/13-intergenerational-barriers
>> [3] Also, remember the reason for UseCondCardMark
>> [4] Also, remember the whole thing about G1 barriers
> 
> Absolutely, barriers can come with an overhead. But a barrier that consists of
> dirtying a card does not come with a particularly high overhead. In fact, it
> comes with a very low overhead :)

Mhm! "Low" is in the eye of the beholder. You can't beat zero overhead. And there
are people who literally count instructions on their hot paths, while still
developing in Java.

Let me ask you a trick question: how do you *know* the card mark overhead is
small, if you don't have a no-barrier GC to compare against?
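
To make concrete what "dirtying a card" means on every reference store, here is
a toy Java model of a card-marking barrier. It is a sketch, not HotSpot code:
the class and method names are made up for illustration, and the 512-byte card
size and the 0 (dirty) / -1 (clean) encoding are assumed to match HotSpot's
defaults. It also contrasts the unconditional store with the test-then-store
variant that is the idea behind UseCondCardMark.

```java
import java.util.Arrays;

// Toy model of a card-marking write barrier (illustrative only, not HotSpot code).
final class CardTableModel {
    static final int CARD_SHIFT = 9;   // assumed 512-byte cards
    static final byte DIRTY = 0;       // assumed dirty encoding
    static final byte CLEAN = -1;      // assumed clean encoding

    private final byte[] cards;
    long cardWrites;                   // number of actual writes to the card table

    CardTableModel(long heapBytes) {
        int cardCount = (int) ((heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT);
        cards = new byte[cardCount];
        Arrays.fill(cards, CLEAN);
    }

    // Unconditional barrier: every reference store also writes the card.
    void storeUnconditional(long fieldAddress) {
        cards[(int) (fieldAddress >>> CARD_SHIFT)] = DIRTY;
        cardWrites++;
    }

    // Conditional barrier: read the card first, write only if it is not dirty yet.
    void storeConditional(long fieldAddress) {
        int idx = (int) (fieldAddress >>> CARD_SHIFT);
        if (cards[idx] != DIRTY) {
            cards[idx] = DIRTY;
            cardWrites++;
        }
    }

    boolean isDirty(long fieldAddress) {
        return cards[(int) (fieldAddress >>> CARD_SHIFT)] == DIRTY;
    }

    public static void main(String[] args) {
        CardTableModel uncond = new CardTableModel(1 << 20);
        CardTableModel cond = new CardTableModel(1 << 20);
        for (int i = 0; i < 1000; i++) {
            long addr = 4096 + (i % 64) * 8;   // 64 fields that share one 512-byte card
            uncond.storeUnconditional(addr);
            cond.storeConditional(addr);
        }
        System.out.println("unconditional card writes: " + uncond.cardWrites);
        System.out.println("conditional card writes:   " + cond.cardWrites);
    }
}
```

Either variant adds work to every reference store: the unconditional one adds a
memory write, the conditional one a load and a branch. The model only counts
card writes; it says nothing about real cost, which is exactly what a
no-barrier baseline would let you measure.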

>>> - why do you think Epsilon GC is a good baseline? IMHO, no barriers is
>>>   not the perfect baseline, since it is just a theoretical exercise.
>>>   Just cranking up the heap and using Serial is more realistic
>>>   baseline, but even using that as a baseline is questionable.
>>
>> It sometimes is. Non-generational GC is a good baseline for some workloads. Even
>> Serial does not cut it, because even if you crank up old and trim down young,
>> there is no way to disable the reference store write barrier that maintains
>> the card table.
> 
> I will still point out though that a GC without a barrier is still just a
> theoretical baseline. One could imagine a single-gen mark-compact GC for OpenJDK
> (that would require no barriers), but AFAIK almost all users prefer the slight
> overhead of dirtying a card (and in return get a generational GC) for the use
> cases where a single-gen mark-compact algorithm would be applicable.

Mark-compact, maybe. But single-gen mark-sweep algorithms are plentiful, see e.g.
the Go runtime. I have a hard time seeing how that is theoretical.

> However, again, this might be useful for someone who wants to try to do some
> changes to the JVM GC code. But that, to me, is not enough to expose it to
> non-JVM developers. It could be useful to have in the source code though, maybe
> like a --with-jvm-feature kind of thing?

That would go against the maintainability argument, no? Because you will still
have to maintain the code, *and* it will require building a special JVM flavor.
So it is a lose-lose: users do not get it, and maintainers do not get simpler
lives either.

> [snip] Such users will still be able to get binary builds if someone is willing to
> produce them with Epsilon GC. There are plenty of OpenJDK binary builds
> available from various organizations/companies.

Well, yes. I actually happen to know the company which can distribute this in
the downstream OpenJDK builds, and reap the ultra-power-users' loyalty. But, I am
maintaining that having the code upstream is beneficial, even if that company is
going to do maintenance work either way.

>> So the short answer about why Epsilon is good to have in product is because the
>> cost seems low, the benefits are present, and so cost/benefit is still low.
> 
> And it is here that our opinions differ :) For you the maintenance cost is low,
> whereas for me, having yet another command-line flag, yet another code path,
> gets in the way. You have to respect that we have different backgrounds and
> experiences here.

I am not trying to challenge your background or experience here; I am
challenging the cost estimates. Taken ad absurdum, that argument would shoot
down any feature change coming into the JVM, just because it introduces yet
another flag, yet another code path, etc.

I cannot see where the Epsilon maintenance would be a burden: it comes with
automated tests that run fast, its implementation seems trivial, its exposure
to VM code seems trivial too (apart from the BarrierSet thing that would be
trimmed down with GC interface work).

>> Yeah, I know how that feels. Look at the actual Epsilon changes, do they look
>> scary to you, given your experience maintaining the related code?
> 
> I don't like taking the role of the grumpy open source maintainer :) No, the
> code is not scary, code is rarely scary IMO, it is just code. Running tests,
> making sure a test with -Xmx1g isn't run on an RPi, having additional code
> paths, more cases to take into consideration when refactoring, is burdensome.
> And to me, the benefit of benchmarking against Epsilon vs benchmarking against
> Serial/Parallel isn't that high.
> 
> But, I can understand that it is useful when trying to evaluate for example the
> cost of stores into a HashMap. Which is why I'm not against the code, but I'm
> not keen on exposing this to non-JVM developers.

I hear you, but the thing is, Epsilon no longer seems like a coding exercise.
Epsilon is useful for GC performance work, especially when readily available,
and there are users willing to adopt it. Just as we respect the maintainers'
burden in the product, we also have to see what benefits users, especially the
ones who champion our project's performance, even by cutting corners with
e.g. no-op GCs.

Thanks,
-Aleksey
