RFR (11u, XXL): Upstream/backport Shenandoah to JDK11u

Roman Kennke rkennke at redhat.com
Thu Jul 9 05:57:27 UTC 2020


On Thu, 2020-07-09 at 04:53 +0000, Gil Tene wrote:
> > On Jul 8, 2020, at 4:07 PM, Roman Kennke <rkennke at redhat.com> wrote:
> > 
> > On Wed, 2020-07-08 at 22:35 +0000, Gil Tene wrote:
> > > 
> > > > On Jul 8, 2020, at 2:58 PM, Roman Kennke <rkennke at redhat.com> wrote:
> > > > 
> > > > ...
> > > > In other words: what exactly is it that you want to prove then?
> > > > I don't understand it.
> > > 
> > > I'm not sure I understand it either. ;-)
> > 
> > Haha, ok right :-)
> > 
> > > I'm basically asking the "if we prove it now, how do we know that
> > > the same holds later?" question.
> > > 
> > > Delaying some big impact by one update is clearly not the goal you
> > > are proposing.
> > 
> > I don't understand this sentence (but it's quite late here...)
> > 
> > > I believe we agree that the quality you suggest would need to be
> > > retained into future builds. Having nothing to prevent it from
> > > breaking (e.g. by accident) is probably dangerous. So the question
> > > becomes "how do we ask whether we still have not introduced any
> > > differences?" in future updates.
> > 
> > Ok.
> > 
> > > Let's start from the assumption that we can prove something now
> > > (and I do think we should be able to find a mechanical way to do
> > > that). That will show that the quality we seek to keep has been
> > > kept at a point in time...
> > 
> > Yes. Let's assume that. (As an aside, if you'd actually look at the
> > patch, I hope you'd find it quite obvious that it does the right
> > thing. And you might be surprised how relatively few such changes we
> > actually have.)
> > 
> > > For as long as the statement "without Shenandoah enabled, the
> > > build does not have any actual code bits related to the Shenandoah
> > > backport" needs to hold, we need a way to show that the same still
> > > holds, either by proving that the original proof still holds in the
> > > presence of changes, or by re-proving it somehow.
> > > 
> > 
> > Let me elaborate a little bit on my experience with backports, and
> > with mixing Shenandoah and non-Shenandoah backports.
> > 
> > In the many months that we have maintained Shenandoah in our 11u, the
> > overwhelming majority of Shenandoah-related backports did not touch
> > any shared code at all. That's because the existing GC interfaces
> > isolate GCs really well in 11u. The few occasions where we did change
> > shared code were: 1. when we switched to LRB - we would not usually
> > backport such drastic changes, but it seemed prudent here, because it
> > actually decreased the shared-code exposure drastically; and 2. in
> > preparation for this upstreaming patch - more pruning and isolation
> > of shared-code changes.
> > 
> > In the very rare situations where a backport would require
> > shared-code changes (I can't actually remember any, apart from the
> > two that I just mentioned), we carefully consider whether it's
> > actually necessary to backport it. For example, we have not
> > backported concurrent class-unloading support to 11u precisely
> > because it would require (significant) changes outside of Shenandoah.
> > *If* a critical backport (say, a bugfix) requires changes outside of
> > Shenandoah, it would have to be properly guarded by the same means as
> > in the proposed upstreaming patch. We - the Shenandoah team - would
> > be aware of that and mention it in relevant reviews. It would also
> > show up prominently in a patch because it touches files without
> > 'shenandoah' in their path names. And from there, it's a matter of
> > carefully considering, reviewing and testing it. I don't think this
> > would silently sneak in somehow.
> > 
> > I can't think of a situation where we ever had the reverse problem:
> > a shared-code change touching on something Shenandoah-related.
> > 
> > Also, while we did have a flurry of Shenandoah-related backports in
> > the past, because of stabilization and new features (e.g. LRB and
> > friends), this is most likely not going to continue into the future.
> > I expect less Shenandoah-backporting traffic, basically limited to
> > bugfixes and improvements that don't touch shared code. We have a
> > couple of features on our todo list for later JDKs, but none of them
> > sound like candidates for backporting.
> > 
> > > The notion that our normal review processes will catch everything
> > > that can break that initial proven state seems a bit optimistic to
> > > me. The review process will be well-intentioned, and we'll try to
> > > tag things right, but one mistaken application or move of code
> > > across ifdef lines, or integration of some mis-tagged or unnoticed
> > > thing into shared code, will break the statement…
> > > 
> > 
> > How is that any different from the situation that we already have?
> > HotSpot code already has similar #ifdefs sprinkled all around, e.g.
> > for other GCs, for JFR, platform #ifdefs, and so on. How is the
> > situation different for Shenandoah, and why do we need special rules,
> > processes or even proofs for it? As far as I can see, our current
> > high-standard development and review practices already cover it very
> > well.
> > 
> > > I believe that we can avoid this once, at the start, with a
> > > mechanical proof. Can we keep it up? What sort of rules or
> > > invariants can we come up with that we can use to verify and show
> > > that the quality we seek to keep has been kept through later
> > > updates?
> > 
> > Well yes, the usual proper care that everybody takes in making the
> > backports, reviewing the backports, testing them, etc.
> > 
> > > Let's try to think of some…
> > 
> > As long as we don't know what the exact question/problem even might
> > be, we can hardly come up with an answer/solution.
> 
> I'm not trying to be difficult here. I'm just going with the basic
> line of logic, and looking for a way to implement it.
> 

Yeah. Let me just state for the record that I find this is getting
slightly ridiculous ;-)

Let me try anyway.

It looks to me like this is a prime example of proof by induction.
Let's try to formulate it.

State 0 is our current jdk11u state. Shenandoah obviously does not leak
into the build, because it's not included yet.

State 1 is the state after the initial inclusion of Shenandoah. Let's
assume we can prove that between 0 and 1, nothing leaks into the build
when building with --with-jvm-features=-shenandoahgc. How exactly to do
that proof still has to be determined; maybe comparing object files
would work.

Then, for any further changeset to be backported, this would be the
N->N+1 case:

- if the changeset is not Shenandoah-related, it'll obviously change
the outcome of the build, but it also obviously doesn't leak any
Shenandoah-related changes.
- otherwise, if the changeset is Shenandoah-related, we can run the
same proof that we did for 0->1 again for N->N+1, and prove that no
additional Shenandoah-related changes leak into the non-Shenandoah
build.

Right?
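
To make that N->N+1 step a bit more concrete, here is a rough sketch of
how it could be mechanized (Python, purely illustrative; the Changeset
shape and the builds_identical hook are made up for this sketch - the
only real input is the gc-shenandoah bug label we already use):

from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass
class Changeset:
    rev: str                # revision being backported (state N+1)
    parent: str             # its parent revision (state N)
    labels: FrozenSet[str]  # bug labels, e.g. frozenset({"gc-shenandoah"})

def invariant_holds(cs: Changeset,
                    builds_identical: Callable[[str, str], bool]) -> bool:
    # N->N+1 step: does "no Shenandoah bits leak into a -shenandoahgc
    # build" still hold after applying this changeset?
    if "gc-shenandoah" not in cs.labels:
        # Not Shenandoah-related: it changes the build output, but by
        # definition it adds no Shenandoah code, so the invariant is kept.
        return True
    # Shenandoah-related: re-run the same mechanical proof as for 0->1,
    # i.e. build both revisions with --with-jvm-features=-shenandoahgc
    # and check that the results are identical.
    return builds_identical(cs.parent, cs.rev)

The point is simply that the expensive check only ever needs to run for
changesets classified as Shenandoah-related; everything else is covered
by the first branch.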

All of this depends on correctly classifying what constitutes a
Shenandoah-related change and what doesn't. But it must; I see no way
around that. From my perspective and in my experience, this is really
easy though, and can be achieved by applying some common sense. (Hey,
when reviewing, you really ought to look at the bug - and spot the
gc-shenandoah label - and also look at the patch, understand what it
all does, and come to some conclusion.)


Now, about that proof: I will spend the day looking into whether we can
do it by comparing object files of builds. E.g. do builds
(-shenandoahgc) before and after a change is applied, run checksums over
each object file, and compare those checksums. Let's see if there are
any pitfalls there.
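
As a strawman, the comparison itself could be as simple as something
like this (again just a sketch; the two directory arguments stand for
the output trees of two -shenandoahgc builds, and whether plain SHA-256
over the .o files is good enough is exactly what needs checking):

import hashlib
from pathlib import Path

def checksum_objects(build_dir: str) -> dict:
    # Map each object file (path relative to build_dir) to a SHA-256
    # checksum of its contents.
    root = Path(build_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.o"))
    }

def object_files_identical(before_dir: str, after_dir: str) -> bool:
    # Compare two -shenandoahgc build trees, one built before and one
    # built after the change under test.
    before = checksum_objects(before_dir)
    after = checksum_objects(after_dir)
    differing = sorted(f for f in before.keys() | after.keys()
                       if before.get(f) != after.get(f))
    for f in differing:
        print("object file differs or is missing:", f)
    return not differing

The obvious pitfall would be anything that makes object files differ
spuriously between two otherwise identical builds (embedded timestamps,
build ids and the like); that's what I want to find out.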

What do you think?

Roman



