RFR (11u, XXL): Upstream/backport Shenandoah to JDK11u

Roman Kennke rkennke at redhat.com
Wed Jul 8 23:07:15 UTC 2020


On Wed, 2020-07-08 at 22:35 +0000, Gil Tene wrote:
> 
> 
> > On Jul 8, 2020, at 2:58 PM, Roman Kennke <rkennke at redhat.com>
> > wrote:
> > 
> > On Wed, 2020-07-08 at 21:39 +0000, Gil Tene wrote:
> > > > On Jul 8, 2020, at 11:31 AM, Roman Kennke <rkennke at redhat.com>
> > > > wrote:
> > > > 
> > > > Hi Gil & all,
> > > > 
> > > > I put in some extra work and had a deep look at the remaining
> > > > unguarded changes in shared code, with an eye to either
> > > > eliminating them altogether or - where this is not possible -
> > > > guarding them with build-time #if INCLUDE_SHENANDOAHGC or
> > > > similar. The guiding principle in this effort has been that
> > > > building without Shenandoah (--with-jvm-features=-shenandoahgc)
> > > > would compile *the exact same* code that it does now. I believe
> > > > I achieved this with the following proposed changes:
> > > > 
> > > > Full webrev (incl. Shenandoah-only files):
> > > > https://cr.openjdk.java.net/~rkennke/shenandoah-jdk11u-upstream/webrev.06-all/
> > > > 
> > > > Shared-code-only changes:
> > > > https://cr.openjdk.java.net/~rkennke/shenandoah-jdk11u-upstream/webrev.06-shared/
> > > > 
> > > > The latter webrev excludes all files that are only compiled
> > > > with Shenandoah enabled.
> > > > 
> > > > Also, compared to previous webrevs, this excludes Shenandoah by
> > > > default.
> > > > 
> > > > As far as I can tell, this literally compiles the same code
> > > > into libjvm.so as it does now, when building without
> > > > Shenandoah. The only exception is the inclusion of the global
> > > > flag UseShenandoahGC, which is by design. When this flag is
> > > > enabled in a build without Shenandoah, it prints the
> > > > appropriate error message and exits (exactly like with any
> > > > other GC that is not built in).
> > > 
> > > For the purpose of this thought exercise, I think we could
> > > consider that flag addition as a separate change (which would
> > > happen before all the other stuff), so that we can consider the
> > > "would compile *the exact same* code" statements in the context
> > > of builds that already include the flag.
> > > 
> > 
> > Ok.
> > 
> > > > I've been thinking about whether we can come up with a
> > > > mechanical proof of correctness. The ideal approach would have
> > > > been reproducible builds, but we don't have those (and the
> > > > UseShenandoahGC flag would mess them up too). The other option
> > > > would be to dump the preprocessed sources and diff them, but
> > > > that is disturbed by the necessary inclusion of
> > > > "utilities/macros.hpp", which gives us the INCLUDE_SHENANDOAHGC
> > > > macro.
> > > 
> > > I think that a mechanized way to verify the assumption will be
> > > key, as you will want it verified not just initially, but with
> > > every future change. Figuring out how to do that is a key
> > > question...
> > > 
> > > Yes, reproducible builds would be great, but we probably don't
> > > want to hold our breath for that. Maybe something around
> > > comparing the individual object file contents that go into the
> > > final link could work?
> > > 
> > 
> > Maybe. I'm not familiar enough with object file formats to see
> > quickly how to do that, or if it's even feasible.
> > 
> > > > Which leaves us with carefully inspecting the webrev by
> > > > multiple sets of eyes and brains.
> > > 
> > > Eye and brain inspection is unlikely to catch innocent-looking
> > > "leaks" in this model six months after the initial merge effort
> > > is completed, when someone touches a line on the wrong side of
> > > an ifdef, or changes the scope of an ifdef or what turns it on.
> > > 
> > > I think that an ongoing means of verification is needed as part
> > > of a solution like this for it to be viable and maintainable.
> > > I'm not sure what that would look like, but it may be worth
> > > thinking up some techniques.
> > 
> > Sorry, I don't understand. Why are the usual practices of
> > backporting, reviewing, testing etc. for jdk11u not good enough to
> > police ongoing work?
> > 
> > From our long experience with backporting both non-Shenandoah and
> > Shenandoah stuff to 11u, we know that the two rarely, if ever,
> > come into conflict with each other, and it's usually quite clear
> > which is which.
> > 
> > > I'm also not sure how one would police future code changes and
> > > backports of fixes from upstream versions, unless each and every
> > > code review carefully takes into consideration the "is this
> > > Shenandoah-related or not? Should it go in ifdefs or not?"
> > > questions.
> > 
> > Easy: if the change has the 'gc-shenandoah' label in the bug
> > report, it's Shenandoah; otherwise it's not. :-) We are very
> > careful to get that part right for our own tracking. Shared code
> > is very rarely touched by Shenandoah changes anyway.
> > 
> > > Can we come up with systemic ways to know that a code change
> > > being backported is Shenandoah-related? E.g. if the upstream
> > > code shape and ifdef'ing rules remained similar to the
> > > downstream,
> > 
> > The ifdef'ing rules are already different: there are (almost) no
> > such #ifdefs in later JDKs, because it's all isolated by the
> > proper GC interfaces. Most of those interfaces exist in 11u, but
> > not all; that's why we had to put those #ifs in the patch. Some
> > stuff has been accepted as a general change in later JDKs (e.g.
> > the stuff in ciInstanceKlass.cpp) but appears as a Shenandoah
> > change here, because we are the only users of it.
> > 
> > > we may be able to keep things workable, but I suspect that
> > > upstream (especially after 15u) will be tempted to remove the
> > > strict ifdef'ing pretty quickly and start placing code changes
> > > in common code. As those changes will likely be needed when
> > > backporting fixes to the "Shenandoah-enabled 11u" mode, careful
> > > examination on a case-by-case basis will be needed, at least as
> > > long as "no Shenandoah" downstreams of 11u still depend on it
> > > for updates.
> > 
> > Sorry, I still don't see this as a problem. Careful review of
> > backports, looking at the gc-shenandoah label in bug reports, and
> > testing builds both with and without Shenandoah: shouldn't that
> > cover it?
> > 
> > > > Gil, would a change like this ease your concerns?
> > > 
> > > If we could provably show that no code differences exist in
> > > builds that do not choose to include the change, it would be
> > > hard for me to argue against upstream inclusion based on its
> > > potential for breaking a downstream distro's stability, no
> > > matter how necessary or not it is…
> > > 
> > > So yes, the proposed approach could work if we can figure out
> > > the technical ways to achieve it and "keep it achieved" for some
> > > time into the future.
> > > 
> > > We will likely need to be able to mechanically verify that
> > > things remain like this even as code is introduced in the
> > > future. Manual review probably cannot achieve that alone, so we
> > > will likely need some build tooling to verify this every time.
> > 
> > Another problem with that is this: we can probably mechanically
> > prove it now, because we have the vanilla 11u build to compare
> > against. Once Shenandoah is integrated, what would you compare
> > the -shenandoahgc build against?
> > 
> > In other words: what exactly is it that you want to prove then? I
> > don't understand it.
> 
> I'm not sure I understand it either. ;-)

Haha, ok right :-)

> I'm basically asking the "if we prove it now, how do we know that
> the same holds later?" question.
> 
> Delaying some big impact by one update is clearly not the goal you
> are proposing.

I don't understand this sentence (but it's quite late here...)

> I believe we agree that the quality you suggest would need to be
> retained in future builds. Having nothing to prevent it from
> breaking (e.g. by accident) is probably dangerous. So the question
> becomes: how do we verify, in future updates, that we still have
> not introduced any differences?

Ok.

> Let's start from the assumption that we can prove something now
> (and I do think we should be able to find a mechanical way to do
> that). That will show that the quality we seek to keep has been
> kept at a point in time...

Yes. Let's assume that. (As an aside, if you'd actually look at the
patch, I hope you'd find it quite obvious that it does the right thing.
And you might be surprised how relatively few such changes we actually
have.)
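
For illustration, such a mechanical check might look like the sketch
below. This is purely hypothetical tooling, not part of the patch: it
assumes a GNU toolchain with objdump on the PATH, and two build trees
that should contain identical code, e.g. a vanilla 11u tree and a
patched tree built with --with-jvm-features=-shenandoahgc. Debug info
embeds build paths and timestamps, so it compares disassembly rather
than raw bytes:

  #!/usr/bin/env python3
  # Hypothetical sketch: compare the disassembly of matching object
  # files from two hotspot build trees.
  import subprocess, sys
  from pathlib import Path

  def disasm(obj):
      # objdump's header line repeats the file name; drop it so that
      # only the actual code is compared.
      out = subprocess.run(["objdump", "-d", str(obj)],
                           capture_output=True, text=True, check=True)
      return [l for l in out.stdout.splitlines()
              if "file format" not in l]

  def compare_trees(vanilla, patched):
      mismatches = []
      for obj in sorted(Path(vanilla).rglob("*.o")):
          other = Path(patched) / obj.relative_to(vanilla)
          if not other.exists():
              mismatches.append("missing: " + str(other))
          elif disasm(obj) != disasm(other):
              mismatches.append("differs: " + str(obj))
      return mismatches

  if __name__ == "__main__":
      diffs = compare_trees(sys.argv[1], sys.argv[2])
      print("\n".join(diffs) if diffs else "no differences found")
      sys.exit(1 if diffs else 0)

The obvious limitation, as said above, is that this only works while
there is still a vanilla tree to compare against.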

> For as long as the statement "without Shenandoah enabled, the build
> does not have any actual code bits related to the Shenandoah
> backport" needs to hold, we need a way to show that it still holds:
> either by proving that the original proof still holds in the
> presence of changes, or by re-proving it somehow.
> 

Let me elaborate a little bit on my experience with backports, and
with mixing Shenandoah and non-Shenandoah backports.

In the many months that we have maintained Shenandoah in our 11u, the
overwhelming majority of Shenandoah-related backports did not touch
any shared code at all. That's because the existing GC interfaces
isolate GCs really well in 11u. The few occasions where we did change
shared code were: 1. when we switched to LRB - we would not usually
backport such a drastic change, but it seemed prudent here, because
it actually decreased the shared-code exposure drastically; and 2. in
preparation for this upstreaming patch - more pruning and isolation
of shared-code changes.

In the very rare situations where a backport would require
shared-code changes (I can't actually remember any, apart from the
two that I just mentioned), we carefully consider whether it's
actually necessary to backport at all. For example, we have not
backported concurrent class-unloading support to 11u, precisely
because it would require (significant) changes outside of Shenandoah.
*If* a critical backport (say, a bugfix) requires changes outside of
Shenandoah, it would have to be properly guarded by the same means as
in the proposed upstreaming patch. We - the Shenandoah team - would
be aware of that and mention it in relevant reviews. It would also
show up prominently in a patch, because the patch would contain files
without 'shenandoah' in their path names. And from there, it's a
matter of carefully considering, reviewing and testing it. I don't
think this could silently sneak in somehow.
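
As an aside, that "shows up prominently" property could itself be
checked mechanically. A minimal sketch (hypothetical, not existing
tooling) that flags every file a backport patch touches outside
Shenandoah-specific paths:

  #!/usr/bin/env python3
  # Hypothetical sketch: read a unified diff on stdin and flag
  # touched files that do not have 'shenandoah' in their path, i.e.
  # potential shared-code changes that need extra review scrutiny.
  import re, sys

  def shared_files(patch_text):
      # git-style unified diff headers look like '+++ b/src/...';
      # the pattern would need adjusting for other patch formats.
      touched = re.findall(r"^\+\+\+ b/(\S+)", patch_text,
                           re.MULTILINE)
      return [f for f in touched if "shenandoah" not in f.lower()]

  if __name__ == "__main__":
      flagged = shared_files(sys.stdin.read())
      for f in flagged:
          print("shared-code change:", f)
      sys.exit(1 if flagged else 0)

Hooked into the usual backport testing, something like this would
turn the path-name observation into an automatic nudge for reviewers
rather than something to remember.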

I can't think of a situation where we ever had the reverse problem: a
shared-code change touching on something Shenandoah-related.

Also, while we did have a flurry of Shenandoah-related backports in
the past, because of stabilization and new features (e.g. LRB and
friends), this is most likely not going to continue into the future.
I expect less Shenandoah backporting traffic, basically limited to
bugfixes and improvements that don't touch shared code. We have a
couple of features on our to-do list for later JDKs, but none of them
sound like candidates for backporting.

> The notion that our normal review processes will catch everything
> that can break that initially proven state seems a bit optimistic
> to me. The review process will be well-intentioned, and we'll try
> to label things right, but one mistaken application or move of code
> across ifdef lines, or integration of some mis-labeled or unnoticed
> change into shared code, will break the statement…
> 

How is that any different from the situation that we already have?
Hotspot code already has similar #ifdefs sprinkled all around, e.g.
for other GCs, for JFR, for platforms, and so on. How is the
situation different for Shenandoah? Why do we need special rules,
processes or even proofs for that? As far as I can see, our current
high-standard development and review practices already cover it very
well.

> I believe that we can avoid this once, at the start, with a
> mechanical proof. Can we keep it up? What sort of rules or
> invariants can we come up with that we can use to verify and show
> that the quality we seek to keep has been kept through later
> updates?

Well, yes: the usual proper care that everybody takes in making the
backports, reviewing the backports, testing them, etc.

> Let's try to think of some…

As long as we don't know what the exact question/problem even might be,
we can hardly come up with an answer/solution.

Roman
