
Mathiske, Bernd mathiske at amazon.com
Mon Jun 29 21:56:19 UTC 2020


From what I read, there seems to be consensus that the risk of the proposed RFR is in fact minimal, assuming deep review and other careful measures, as with the JFR 8u backport. So we are focusing on the reward side.

>> Necessity is a binary bar.
Don’t we all always ask “how necessary” or “necessary for what”? Isn’t necessity simply a strong reading on the reward meter? And, indeed, we ask “necessary for whom”.

>> This list tends to be dominated by developers of OpenJDK.
>> I claim that our community (for the updates projects) is the *users* of OpenJDK.

The developer minority that is writing to each other here has significant exposure to user opinions and issues.

Aren’t average and tail latencies widespread user pain points? Shouldn’t we proliferate concurrent GC to address that? And isn’t upgrading to another JDK release also a huge user pain point, at least for some users? See Goetz’s email for an existence proof, but there are many other cases in other companies. Isn’t sticking to LTS versions what users normally do? My experience is that, whether justified or not, most do just that. If you put these aspects together, the necessity to do something about Shenandoah sooner rather than later reveals itself. Individual examples of users to whom this does not apply do not change this. In fact, there are often enough good reasons not to use concurrent GC, depending on the use case. But without concurrent GC we have a serious gap in our ability to run latency-sensitive workloads right now, and we need to fix that.

BTW, we were under the impression that Azul fully supports the idea of backporting Shenandoah.
https://mail.openjdk.java.net/pipermail/jdk8u-dev/2019-July/009808.html

Looking for prior guidance on the matter, there is this email thread (spread over two months):
https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2019-October/002053.html
https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2019-November/002075.html

I heard that at the Committers’ Workshop session on update releases in February, there was consensus that “features” should only be backported to an update release if they are already in use and maintained by at least two other major contributors to the updates project. Furthermore, at that point, it becomes a big waste of resources to stay in sync, and by definition there already is a support commitment. Regarding the RFR at hand, there is more than the necessary commitment in this very email thread.

Bernd

On 6/29/20, 12:05 PM, "Gil Tene" <gil at azul.com> wrote:



    > On Jun 29, 2020, at 10:10 AM, Andrew Haley <aph at redhat.com> wrote:
    > 
    > On 29/06/2020 17:38, Gil Tene wrote:
    >>> Or is it that all changes to JDK11u are in-principle bad, regardless
    >>> of their actual risk, which we should not consider?
    >> 
    >> All feature additions in updates for an already released version (as
    >> opposed to bug fixes and security updates, which those updates
    >> are the vehicle for) are in-principle bad IMO. Sometimes we have to
    >> do the bad thing because it is less bad than the alternative (the LTS
    >> will break for many people if we don't do it). But it is still bad.
    > 
    > OK, so that is your position, and the in-principle argument is AFAICS
    > independent of the practical risk.

    Right. For non-necessary *feature additions* to an existing LTS.

    > 
    >>> That possibility seems a little odd. Given that most of your post
    >>> is couched in terms of risk, to say that we shouldn't strive to
    >>> evaluate that risk would be downright irrational.
    >> 
    >> The weighing of risk for adding a feature in a new OpenJDK version
    >> should be very different from the weighing of the same in an update
    >> to an existing version.
    > 
    > No question.
    > 
    >> When we add features to new versions (which we now get to do every
    >> six months), we don't risk regressing production behavior of people
    >> that depend on security updates to keep operating. We then consider
    >> the risk in terms of "will this impact adoption rate or make us look
    >> bad", and the benefits can be weighed against that risk. That's
    >> where the benefits of new features often get to win, and we can
    >> choose ways to minimize risk to help them win.
    > 
    > I agree. The benefits can be weighed against that risk.
    > 
    >> In contrast, when we consider adding features in an updates for an
    >> existing version (which will be forced on all production users that
    >> need to keep up with security, for example), the benefit of a new
    >> (did not previously exist in that Java version) feature should
    >> simply not be part of the equation unless it meets a base
    >> "necessity" requirement.  Without meeting that requirement, it
    >> should score a 0 on weight IMO.
    > 
    > That fits with my understanding of your position: no matter how
    > helpful a feature is, or how low the risk of adding it is, it should
    > not happen. So, your conclusion is *entirely independent* of the
    > actual ratio of risk to reward. No evaluation of either is necessary
    > because the conclusion will always be the same: no.

    Exactly. For non-necessary *feature additions* to an existing LTS.

    Necessity is a binary bar. One can argue that a new feature is
    necessary or not. But how risky adding that feature is should
    not play a part in that argument.

    Some features that are necessary may be so risky that we dare
    not do them. But the opposite should not be happening in stable
    release updates that people are forced to consume for security
    reasons. Not even if we are VERY tempted and see all sorts of
    cool benefits in a feature.

    > 
    > I hope that you will agree that your position on this matter is an
    > extreme one; in fact is the most extreme position it is possible for
    > anyone to take.

    I disagree. I don't think this is an extreme position at all. In fact, I
    think it is quite mainstream, and represents the most common
    understanding of what stable releases are, and what most consumers
    of updates to such releases think is going on.

    When those consumers find that's not quite what's going on, we
    (all) get a lot of flak for doing active feature development on their
    precious parts of critical infrastructure that depend on stability
    and security above all else. Attila's sentiments are an example of that.

    > You also know that this is not the position taken by
    > the whole community.

    Obviously ;-)

    But I think that whether the vast majority of the community sees
    this position as "extreme", or sees the opposite as the extreme
    one, depends on who you think the community is.

    This list tends to be dominated by developers of OpenJDK.

    I claim that our community (for the updates projects) is the
     *users* of OpenJDK.

    There is a 10,000:1 or higher ratio between the number of
    people in those groups. And the larger one tends to be
    [naturally] underrepresented here, and much much less active.
    But their interests and their dependence on what we do
    here are pretty critical. And our dependence on them for
    having a reason for having these updates projects is critical
    as well.

    > 
    >> Necessity has to do with fixing broken things, or preventing things
    >> from breaking. Not working on some new piece of hardware or a new
    >> OS version counts as "breaking" IMO, and security, correctness,
    >> and stability bugs do too. Even some seriously degenerate performance
    >> issues may be (although performance 2+ years into a release should
    >> not be something we should be working on improving within that
    >> release).
    > 
    > Please allow me to suggest a little thought experiment.
    > 
    > Let's say that we could import Feature X (which is Shenandoah, but I'd
    > like to make this discussion more general) without any risk. Sure,
    > risk is never zero, but let's say:
    > 
    > 1. All changes to the source code are surrounded by compile-time
    >   #if USE_SHENANDOAH_GC.
    > 
    > 2. USE_SHENANDOAH_GC is never set unless it is explicitly requested by
    >   a configure argument.
    > 
    > 3. All changes to the source code within the USE_SHENANDOAH_GC markers
    >   are also guarded by runtime tests of the form
    >   if (UseShenandoahGC) { ...
    > 
    > I maintain that no-one will ever be forced to use X. Firstly, no
    > downstream builder of our release JDK 11u will even build X unless
    > they explicitly choose to. Also, unless UseShenandoahGC is enabled on
    > the Java command line, no downstream user will be forced to use it
    > either.
    > 
    > What, in your opinion, are the practical risks to production users of
    > the above? I'm assuming that we can confirm that the above plan works
    > and does not touch anything it shouldn't by a combination of tooling
    > and several sets of eyeballs. That's not perfect, but nothing is.

    Let me answer both in general, and then in the specific:

    General:

    First, my point is that the practical risk does not matter, as no necessity
    has been demonstrated or even argued for. And I claim that this is not
    an extreme position (and that if we want to label things as extreme,
    the opposite would be the extreme).

    Next, I'll certainly accept that with careful review of some things it is
    quite possible to keep risks very low. E.g. in cases where ALL
    code is protected by #ifdef, the risk is somewhat mitigated.
    And in cases where ALL non-ifdef-covered code paths are
    protected by runtime flags, the risk mitigation is not quite as good,
    but still could be good.

    I'll even go further and say that risk can (and should) be seriously
    reduced even when the above stated protections are not practical.
    Deep review (with an eye to minimizing risk around change, as
    opposed to the typical review that focuses on correctness, code
    structure, cleanliness, elegance, reuse, maintainability, etc.) is key to
    minimizing risk when risk HAS to be taken. And as you well know, we
    are going through that exercise right now with TLS 1.3 in 8u.

    None of these are good enough to overcome the "no necessity" bar
    IMO, because how good and how low-risk a change is becomes
    irrelevant (in an LTS update) when it is unnecessary.

    Specific:
    The above two are generic policy approaches. When it comes to the
    specific case here (the addition of a new collector to 11u), which touches
    a huge number of source files, some of the above protections only apply
    "wherever reasonably possible". Actual changes do exist in common code
    paths that are neither protected by #ifdefs nor avoided statically via
    launch-time configuration flags (i.e. every dynamic execution of some
    common code now goes through different logic that consults the
    configuration flags, or worse, common-code changes exist where no
    conditionals are applied at all).
    That is not a subtle difference.

    With that said, this last argument is not a logic path I'm looking to go
    down, because I don't think the additional risk involved in something as
    big and as intrusive as adding Shenandoah actually matters here. I'd
    be making the same necessity-based arguments for other features that
    don't share that level of change and risk, so I don't want to rathole
    into discussing the specific code risks in this case.

    > 
    > --
    > Andrew Haley  (he/him)
    > Java Platform Lead Engineer
    > Red Hat UK Ltd. <https://www.redhat.com>
    > https://keybase.io/andrewhaley
    > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



