<div dir="ltr"><div dir="ltr">On Thu, Nov 3, 2022 at 8:07 AM Brian Goetz <<a href="mailto:brian.goetz@oracle.com" target="_blank">brian.goetz@oracle.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
While we could do more warnings as ordinary RFEs, there's
potentially still value in grouping these all under a JEP, for the
reasons you hint at and more. Let's work through the design, figure
out if we can actually provide any new safety guarantees (as opposed
to just more warnings), and then figure out the best vehicle. <br></div></blockquote><div><br></div><div>Sounds good.</div><div><br></div><div>I'll try to throw out a starting point...<br></div><div><br></div><div>Meta-comment: it's probably prudent to start with something relatively narrow and well-defined. Once that idea is fully understood we can expand from there.<br></div><div><br></div><div>OK let's define the problem. IMHO the most basic problem relates to final fields. Obviously, Java is advertising them as special by guaranteeing immutability and safe publication after construction. Yet it's still easy for a final field to be read incorrectly before it's been properly initialized. The <span style="font-family:monospace">FilteredSet</span> example demonstrates this, and also shows how non-obvious it can be. So having compiler automation that would help you avoid this problem makes sense.<br></div></div><div><br></div><div>Ideally we would like to have some warning X for which we can say "If a class A has a constructor that (a) compiles without generating any X warnings, and (b) has no <a class="gmail_plusreply" id="m_1261716256961106157m_3479417024175891924m_3336334522996783239m_-1588422738598198265m_-7051417076792024992m_2086982426684953752m_-2133772670264269446m_-3465632717001003687m_-1639841853547388853m_-1570129750560505429m_-5745044272604272823m_-3866796466459884399m_-8117221402870163111m_8058360780698678965m_-4660520915480240539gmail-plusReplyChip-0"><span style="font-family:monospace">@SuppressWarnings("X")</span> annotation</a><a class="gmail_plusreply" id="m_1261716256961106157m_3479417024175891924m_3336334522996783239m_-1588422738598198265m_-7051417076792024992m_2086982426684953752m_-2133772670264269446m_-3465632717001003687m_-1639841853547388853m_-1570129750560505429m_-5745044272604272823m_-3866796466459884399m_-8117221402870163111m_8058360780698678965m_-4660520915480240539plusReplyChip-0">,</a> then the compiler guarantees that no final field in class A can ever be read before it's initialized".<br></div><div><br></div><div>Note the narrowness of this: we are only talking about final fields in class A. Not fields in A's superclass or subclasses. We're not talking about "object initialization", we're just talking about individual fields. From a developer point of view, this is a feature: you can assess (and address) the situation one class and one field at a time.<br></div><div><br></div><div>Even so, from a practical point of view, solving this problem is likely to cover the vast majority of the real-world cases where "this escape" causes trouble, at least in my experience. The "this escape" problems I've seen invariably can be solved by moving some final field initialization to before the <span style="font-family:monospace">super()</span> call (which will now be possible), as in the <span style="font-family:monospace">FilteredSet</span> example.</div><div><br></div><div>There is another subtlety behind this idea: the warning needs to be emitted at the point of offense, so we need to define where that is.<br></div><div><br></div><div>Consider the <span style="font-family:monospace">FilteredSet</span> example: which of <span style="font-family:monospace">FilteredSet</span>, <span style="font-family:monospace">HashSet</span>, or <span style="font-family:monospace">AbstractCollection</span> is "at fault"??</div><br><div>Is the problem the fault of the <span style="font-family:monospace">HashSet()</span> constructor, for invoking <span style="font-family:monospace">addAll()</span>? That's a superclass method, not a subclass method, so that seems reasonable. Is it the fault of <span style="font-family:monospace">AbstractCollection.addAll()</span>, which calls <span style="font-family:monospace">add()</span>, which is very likely to be overridden in a subclass? That seems perfectly normal - after all <span style="font-family:monospace">addAll()</span> is a regular method, not a constructor. Or is it the fault of <span style="font-family:monospace">FilteredSet</span> for not initializing the <span style="font-family:monospace">filter</span> field prior to invoking <span style="font-family:monospace">super()</span>?</div><div><br></div><div>My thought on this is sort of akin to the old saying that "good fences make for good neighbors". In other words, don't rely on others to not overstep when you have a simple way to prevent that yourself. Now that we're giving classes the ability to protect themselves by doing final field initializations prior to <span style="font-family:monospace">super()</span> calls, let's put the burden of protection on them, not on some unrelated superclass. In other words, <span style="font-family:monospace">FilteredSet</span> is "to blame" here.<br></div><div><br></div><div>OK so now let's analyze when a final field <span style="font-family:monospace">myFinalField</span> in class <span style="font-family:monospace">MyClass</span> could possibly be read prior to initialization...</div><div><br></div><div>Consider any path of execution through a constructor. It will invoke either <span style="font-family:monospace">this()</span> or <span style="font-family:monospace">super()</span> exactly once. If it invokes <span style="font-family:monospace">super()</span>, it will also initialize <span style="font-family:monospace">myFinalField</span> at some point, either explicitly, or implicitly - immediately after the <span style="font-family:monospace">super()</span> call if the field has an initializer or is assigned in an initialization block.<br></div><div><br></div><div>Observation #1: Prior to invoking <span style="font-family:monospace">this()</span> or <span style="font-family:monospace">super()</span>, it's impossible for any field to be read (JVM guarantee), so clearly it's impossible for <span style="font-family:monospace">myFinalField</span> to be read prior to initialization. No need to check here.<br></div><div><br></div><div>Observation #2: If the constructor invokes <span style="font-family:monospace">this()</span>, we assume by induction that the other constructor has already been checked and any warnings generated. Upon its return, <span style="font-family:monospace">myFinalField</span> must be initialized already so we're done here too.<br></div><div><br></div><div>So we only need to be concerned with the time during and after a <span style="font-family:monospace">super()</span> call.<br></div><div><br></div><div>If <span style="font-family:monospace">myFinalField</span> is explicitly assigned prior to invoking <span style="font-family:monospace">super()</span>, then we're done - it's impossible for there to be any reference prior to initialization by Observation #1.<br></div><div><br></div><div>Therefore, we only need to be concerned with the case where <span style="font-family:monospace">myFinalField</span> is initialized after the <span style="font-family:monospace">super()</span> call.<br></div><div><div><br></div><div>Let's call the time from the <span style="font-family:monospace">super()</span> invocation to the point where <span style="font-family:monospace">myFinalField</span> gets initialized the "vulnerability window for <span style="font-family:monospace">myFinalField</span>" and any access to <span style="font-family:monospace">myFinalField</span> within this window an "early access".</div><div><br></div><div>We want the compiler to generate a warning unless it can prove to itself that there are no early accesses to <span style="font-family:monospace">myFinalField<span style="font-family:arial,sans-serif">.</span></span></div></div><div><br></div><div>There are two cases: early accesses during the <span style="font-family:monospace">super()</span> call, and early accesses after the <span style="font-family:monospace">super()</span> call.<br></div><div><br></div><div>Case #1 - Early accesses during the <span style="font-family:monospace">super()</span> call</div><div><br></div><div><div>We can make a special case here for classes that extend <span style="font-family:monospace">java.lang.Object</span>, where we can assume the superclass constructor does nothing. Done.<br></div><div><br></div><div>Otherwise, we observe the following: a superclass constructor could downcast itself and, if <span style="font-family:monospace">myFinalField<span style="font-family:arial,sans-serif"></span></span> is public (or if in the same package, any non-private), access it directly. It could also access any public (or if in the same package, any non-private) method of <span style="font-family:monospace">MyClass</span> without necessarily having to downcast - and one of those methods is very likely to be able to access <span style="font-family:monospace">myFinalField<span style="font-family:arial,sans-serif"></span></span> because that's very common.<br></div><div><br></div><div>So it seems pretty hopeless trying to prove there are no possible early accesses to <span style="font-family:monospace">myFinalField</span> when the superclass is not <span style="font-family:monospace">Object</span> and we can't inspect its constructor. It might be possible if <span style="font-family:monospace">myFinalField<span style="font-family:arial,sans-serif"> </span></span>were private, and none of <span style="font-family:monospace">MyClass</span>'s non-private methods accessed it, even indirectly. Even then it would take work to prove, and this would only help in a few uncommon cases.<br></div></div><div><br></div><div>Hmm... moving on...</div><div><br></div><div>Case #2 - Early accesses after the <span style="font-family:monospace">super()</span> call</div><div><br></div><div>The only way <span style="font-family:monospace">myFinalField</span> could be accessed is if the constructor itself does so directly, passes off <span style="font-family:monospace">this</span> to somewhere we can't follow it, or if it (perhaps indirectly) invokes a non-static method of <span style="font-family:monospace">MyClass</span> (or a subclass of MyClass if <span style="font-family:monospace">myFinalField<span style="font-family:arial,sans-serif"></span></span> is not private) which contains an early access.</div><div><br></div><div>Here there may be a little more hope.</div><div><div><br></div><div><div><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Let's say an AST
subtree is "nested" if it appears inside a lambda or inner
class.<br></span></span></span></span></span></div><div><br><div>Let's define these fairly broad/conservative categories of AST subtress that could possibly lead to an early access of <span style="font-family:monospace">myFinalField</span>:<br></div><div><ol><li>Non-nested expressions of the form <span style="font-family:monospace">myFinalField</span>, <span style="font-family:monospace">this.<span style="font-family:monospace">myFinalField</span></span>, or <span style="font-family:monospace"><span style="font-family:monospace">MyClass</span>.this.<span style="font-family:monospace">myFinalField</span></span></li><li><span style="font-family:monospace"></span>Non-nested non-static method invocations explicitly targeting <span style="font-family:monospace">this</span>, <span style="font-family:monospace"><span style="font-family:monospace">MyClass</span>.this<span style="font-family:arial,sans-serif">, </span>super</span> or <span style="font-family:monospace">SomeType.super</span>, or implicitly targeting this instance, e.g., <span style="font-family:monospace">hashCode()</span></li><ol><li><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Except for</span></span></span></span></span></span> private or final methods declared in <span style="font-family:monospace">MyClass</span> that themselves have no early access</span></span></span></span></li></ol><li>Non-nested expressions <span style="font-family:monospace">this</span> or <span style="font-family:monospace"><span style="font-family:monospace">MyClass</span>.this<span style="font-family:monospace"><span style="font-family:arial,sans-serif"> that are not part of a field or method access (e.g., method parameters)<br></span></span></span></li><li><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Non-nested instantiations of any inner class for which <span style="font-family:monospace">MyClass</span> is the outer class</span></span></span></li><li><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">Non-nested lambda expressions<br></span></span></span></li></ol></div><div><div><div>Notes:</div><div><ul><li>We can define these modulo extraneous parentheses, so for example <span style="font-family:monospace">((this))<span style="font-family:monospace">.method()</span><span style="font-family:arial,sans-serif"> would be in category #2</span></span></li><li>The "Except for" methods in #2 are defined recursively, but they should still be well-defined.<span style="font-family:monospace"><span style="font-family:arial,sans-serif"></span></span></li><li>In constructors, category #1 is already taken care of for us - these expressions already generate a compiler error (once <a href="https://github.com/openjdk/jdk/pull/10956" target="_blank">pr#10956</a> is accepted :)</li><ul><li>But not in regular methods, so we'd need checks in that case<br></li></ul></ul></div></div><div>I *think* every early access after the <span style="font-family:monospace">super()</span> call would have to result from an AST matching one of the above categories.<br><div><br><div><div><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">I'll stop here for now... thoughts?</span></span></span></span></span></div><div><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><br></span></span></span></span></span></div><div><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:arial,sans-serif">-Archie<br></span></span></span></span></span></div><div><span style="font-family:monospace"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><span style="font-family:monospace"><span style="font-family:arial,sans-serif"><br></span></span></span></span></span></div></div></div>-- <br><div dir="ltr">Archie L. Cobbs<br></div></div></div></div></div></div></div>