...

Mon Jun 29 19:04:54 UTC 2020

> On Jun 29, 2020, at 10:10 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> On 29/06/2020 17:38, Gil Tene wrote:
>>> Or is it that all changes to JDK11u are in-principle bad, regardless
>>> of their actual risk, which we should not consider?
>> 
>> All feature additions in updates for an already released version (as
>> opposed to bug fixes and security updates, which those updates
>> are the vehicle for) are in-principle bad IMO. Sometimes we have to
>> do the bad thing because it is less bad than the alternative (the LTS
>> will break for many people if we don't do it). But it is still bad.
> 
> OK, so that is your position, and the in-principle argument is AFAICS
> independent of the practical risk.

Right. For non-necessary *feature additions* to an existing LTS.

> 
>>> That possibility seems a little odd. Given that most of your post
>>> is couched in terms of risk, to say that we shouldn't strive to
>>> evaluate that risk would be downright irrational.
>> 
>> The weighing of risk for adding a feature in a new OpenJDK version
>> should be very different from the weighing of the same in an update
>> to an existing version.
> 
> No question.
> 
>> When we add features to new versions (which we now get to do every
>> six months), we don't risk regressing production behavior of people
>> that depend on security updates to keep operating. We then consider
>> the risk in terms of "will this impact adoption rate or make us look
>> bad", and the benefits can be weighed against that risk. That's
>> where the benefits of new features often get to win, and we can
>> choose ways to minimize risk to help them win.
> 
> I agree. The benefits can be weighed against that risk.
> 
>> In contrast, when we consider adding features in an updates for an
>> existing version (which will be forced on all production users that
>> need to keep up with security, for example), the benefit of a new
>> (did not previously exist in that Java version) feature should
>> simply not be part of the equation unless it meets a base
>> "necessity" requirement.  Without meeting that requirement, it
>> should score a 0 on weight IMO.
> 
> That fits with my understanding of your position: no matter how
> helpful a feature is, or how low the risk of adding it is, it should
> not happen. So, your conclusion is *entirely independent* of the
> actual ratio of risk to reward. No evaluation of either is necessary
> because the conclusion will always be the same: no.

Exactly. For non-necessary *feature additions* to an existing LTS.

Necessity is a binary bar. One can argue that a new feature is
necessary or not. But how risky adding that feature is should
not play a part in that argument.

Some features that are necessary may be so risky that we dare
not do them. But the opposite should not be happening in stable
release updates that people are forced to consume for security
reasons. Not even if we are VERY tempted and see all sorts of
cool benefits in a feature.

> 
> I hope that you will agree that your position on this matter is an
> extreme one; in fact is the most extreme position it is possible for
> anyone to take.

I disagree. I don't think this is an extreme position at all. In fact, I
think it is quite mainstream, and represents the most common
understanding of what stable releases are, and what most consumers
of updates to such releases think is going on.

When those consumers find that's not quite what's going on, we
(all) get a lot of flak for doing active feature development on their
precious parts of critical infrastructure that depend on stability and
security above all else. Attila's sentiments are an example of that.

> You also know that this is not the position taken by
> the whole community.

Obviously ;-)

But where you think the vast majority of the community sits, on
whether this position or its opposite is the "extreme" one,
depends on who you think the community is.

This list tends to be dominated by developers of OpenJDK.

I claim that our community (for the updates projects) is the
*users* of OpenJDK.

There is a 10,000:1 or higher ratio between the number of
people in those groups. And the larger one tends to be
[naturally] underrepresented here, and much much less active.
But their interests and their dependence on what we do
here are pretty critical. And our dependence on them for
having a reason for having these updates projects is critical
as well.

> 
>> Necessity has to do with fixing broken things, or preventing things
>> from breaking. Not working on some new piece of hardware or a new
>> version of OSs counts as "breaking" IMO, and security, correctness,
>> and stability bugs do too. Even some seriously degenerate performance
>> issues may be (although performance 2+ years into a release should
>> not be something we should be working on improving within that
>> release).
> 
> Please allow me to suggest a little thought experiment.
> 
> Let's say that we could import Feature X (which is Shenandoah, but I'd
> like to make this discussion more general) without any risk. Sure,
> risk is never zero, but let's say:
> 
> 1. All changes to the source code are surrounded by compile-time
>   #if USE_SHENANDOAH_GC.
> 
> 2. USE_SHENANDOAH_GC is never set unless it is explicitly requested by
>   a configure argument.
> 
> 3. All changes to the source code within the USE_SHENANDOAH_GC markers
>   are also guarded by runtime tests of the form
>   if (UseShenandoahGC) { ...
> 
> I maintain that no-one will ever be forced to use X. Firstly, no
> downstream builder of our release JDK 11u will even build X unless
> they explicitly choose to. Also, unless UseShenandoahGC is enabled on
> the Java command line, no downstream user will be forced to use it
> either.
> 
> What, in your opinion, are the practical risks to production users of
> the above? I'm assuming that we can confirm that the above plan works
> and does not touch anything it shouldn't by a combination of tooling
> and several sets of eyeballs. That's not perfect, but nothing is.

Let me answer both in general, and then in the specific:

General:

First, my point is that the practical risk does not matter, as no necessity
has been demonstrated or even argued for. And I claim that this is not
an extreme position (and that if we want to label things as extreme,
the opposite would be the extreme).

Next, I'll certainly accept that with careful review of some things it is
quite possible to keep risks very low. E.g. in cases where ALL
code is protected by #ifdef the risk is somewhat mitigated.
And in cases where ALL non-ifdef-covered code paths are
protected by runtime flags the risk mitigation is not quite as good,
but still could be good.
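
A minimal sketch of the two levels of protection described above. The
macro USE_SHENANDOAH_GC and the flag UseShenandoahGC are the names used
in this thread; the function names and the returned strings are
hypothetical and only illustrate the shape of the guards, not actual
HotSpot code.

```cpp
#include <string>

// Emulates the configure default: the feature macro is 0 unless the build
// explicitly asks for it (e.g. -DUSE_SHENANDOAH_GC=1).
#ifndef USE_SHENANDOAH_GC
#define USE_SHENANDOAH_GC 0
#endif

// Stand-in for a launch-time flag; off unless the user turns it on.
static bool UseShenandoahGC = false;

// Level 1: the new code sits inside #if, so a default build does not even
// contain it. When it is compiled in, the runtime flag still gates it.
std::string pick_collector_ifdef_guarded() {
#if USE_SHENANDOAH_GC
    if (UseShenandoahGC) {
        return "shenandoah";
    }
#endif
    return "default";
}

// Level 2: no #if, so every build contains the new branch and every call
// evaluates the flag; only the flag keeps the new path from running.
std::string pick_collector_runtime_guarded() {
    if (UseShenandoahGC) {
        return "shenandoah";
    }
    return "default";
}
```

In a default build with the flag off, both functions return "default".
The difference is in what ships: the first adds nothing to the binary,
while the second adds a branch that every user executes.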

I'll even go farther and say that risk can (and should) be seriously
reduced even when the above stated protections are not practical.
Deep review (with an eye to minimizing risk around change, as
opposed to the typical review that focuses on correctness, code
structure, cleanliness, elegance, reuse, maintainability, etc.) is key to
minimizing risk when risk HAS to be taken. And as you well know, we
are going through that exercise right now with TLS1.3 in 8u.

None of these are good enough to overcome the "no necessity" bar
IMO, because how good and low risk a change is is irrelevant
(in an LTS update) when it is unnecessary.

Specific:
The above two are generic policy approaches. When it comes to the
specific case here (the addition of a new collector to 11u), which touches
a huge number of source files, some of the above descriptions only apply
"wherever reasonably possible", and there are actual code changes to
common paths that are neither protected by #ifdefs nor avoided statically
via launch-time configuration flags (i.e. every dynamic execution of some
common code is now going through different logic that is considering the
configuration flags, or worse, common code changes exist where no
conditionals are applied at all).
That is not a subtle difference.
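
To make that difference concrete, here is a small sketch of a shared
routine whose new branch is not behind any #if. The routine load_value
and the counter flag_checks are hypothetical and exist only for this
illustration; they are not taken from the actual Shenandoah changes.

```cpp
// Stand-in for a launch-time flag; off by default.
static bool UseShenandoahGC = false;

// Instrumentation for this sketch only: counts executions of the new check.
static long flag_checks = 0;

// Hypothetical shared routine after the change. There is no #if around the
// new branch, so every caller now runs the flag test, even with the
// feature off.
int load_value(const int* slot) {
    ++flag_checks;
    if (UseShenandoahGC) {
        // Feature-specific handling would go here (placeholder).
        return *slot;
    }
    return *slot;
}
```

Even with the flag off, the result is unchanged but flag_checks grows
with every call: users who never asked for the feature are still
executing code that did not exist before the update.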

With that said, this last argument is not a logic path I'm looking to go
down, because I don't think the additional risk involved in something as
big and as intrusive as adding Shenandoah actually matters here. I'd
be making the same necessity-based arguments for other features that
don't share that level of change and risk, so I don't want to rathole in
discussing the specific code risks in this case.

> 
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671