RFR: 8015831: Add lint check for calling overridable methods from a constructor [v7]

Maurizio Cimadamore mcimadamore at openjdk.org
Thu Jan 12 12:31:39 UTC 2023


On Wed, 11 Jan 2023 03:30:03 GMT, Archie L. Cobbs <duke at openjdk.org> wrote:

>> This PR adds a new lint warning category `this-escape`.
>> 
>> It also adds `@SuppressWarnings` annotations as needed to the JDK itself to allow the JDK to continue to compile with `-Xlint:all`.
>> 
>> A 'this' escape warning is generated for a constructor `A()` in a class `A` when the compiler detects that the following situation is _in theory possible:_
>> * Some subclass `B extends A` exists, and `B` is defined in a separate source file (i.e., compilation unit)
>> * Some constructor `B()` of `B` invokes `A()` as its superclass constructor
>> * During the execution of `A()`, some non-static method of `B.foo()` could get invoked, perhaps indirectly
>> 
>> In the above scenario, `B.foo()` would execute before `A()` has returned and before `B()` has performed any initialization. To the extent `B.foo()` accesses any fields of `B` - all of which are still uninitialized - it is likely to function incorrectly.
>> 
>> Note, when determining if a 'this' escape is possible, the compiler makes no assumptions about code outside of the current compilation unit. It doesn't look outside of the current source file to see what might actually happen when a method is invoked. It does follow method and constructors within the current compilation unit, and applies a simplified union-of-all-possible-branches data flow analysis to see where 'this' could end up.
>> 
>> From my review, virtually all of the warnings generated in the various JDK modules are valid warnings in the sense that a 'this' escape, as defined above, is really and truly possible. However, that doesn't imply that any bugs were found within the JDK - only that the possibility of a certain type of bug exists if certain superclass constructors are used by someone, somewhere, someday.
>> 
>> For several "popular" classes, this PR also adds `@implNote`'s to the offending constructors so that subclass implementors are made aware of the threat. For one example, `TreeMap(Map)` invokes `putAll()` and `put()`.
>> 
>> More details and a couple of motivating examples are given in an included [doc file](https://github.com/archiecobbs/jdk/blob/ThisEscape/src/java.base/share/classes/java/lang/doc-files/ThisEscape.html) that these `@implNote`'s link to. See also the recent thread on `amber-dev` for some background.
>> 
>> Ideally, over time the owners of the various modules would review their `@SuppressWarnings("this-escape")` annotations and determine which other constructors also warranted such an `@implNote`.
>> 
>> Because of all the`@SuppressWarnings` annotations, this PR touches a bunch of different JDK modules. My apologies for that. Adding these annotations was determined to be the more conservative approach, as compared to just excepting `this-escape` from various module builds globally.
>> 
>> **Patch Navigation Guide**
>> 
>> * Non-trivial compiler changes:
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/code/Lint.java`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/code/Types.java`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Flow.java`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/tree/TreeInfo.java`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/resources/compiler.properties`
>>     * `src/jdk.compiler/share/classes/com/sun/tools/javac/resources/javac.properties`
>> 
>> * Javadoc additions of `@implNote`:
>> 
>>     * `src/java.base/share/classes/java/io/PipedReader.java`
>>     * `src/java.base/share/classes/java/io/PipedWriter.java`
>>     * `src/java.base/share/classes/java/lang/Throwable.java`
>>     * `src/java.base/share/classes/java/util/ArrayDeque.java`
>>     * `src/java.base/share/classes/java/util/EnumMap.java`
>>     * `src/java.base/share/classes/java/util/HashSet.java`
>>     * `src/java.base/share/classes/java/util/Hashtable.java`
>>     * `src/java.base/share/classes/java/util/LinkedList.java`
>>     * `src/java.base/share/classes/java/util/TreeMap.java`
>>     * `src/java.base/share/classes/java/util/TreeSet.java`
>> 
>> * New unit tests
>>     * `test/langtools/tools/javac/warnings/ThisEscape/*.java`
>> 
>> * **Everything else** is just adding `@SuppressWarnings("this-escape")`
>
> Archie L. Cobbs has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Use the more appropriate Type comparison method Types.isSameType().
>  - Add some more comments to clarify how the analysis works.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 163:

> 161:      *  invoked from the target constructor; if empty, we're still in the constructor.
> 162:      */
> 163:     private final ArrayDeque<DiagnosticPosition> callStack = new ArrayDeque<>();

There is a concept of push/popScope and then there's a separate concept of call stack (which is just a list of diagnostic position up to the point). I wonder if this could be better modeled by using a single class e.g. Scope/Frame which has a diagnostic position, plus other useful things. Perhaps it might even be helpful to have a ref set on each scope, so that you don't have to attach a "depth" to each ref - the depth of the ref would be determined by the "scope" in which it appears.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 175:

> 173:     private DiagnosticPosition[] pendingWarning;
> 174: 
> 175: // These fields are scoped to the CONSTRUCTOR OR INVOKED METHOD BEING ANALYZED

Watch out for "all caps"

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 218:

> 216:         new TreeScanner() {
> 217: 
> 218:             private Lint lint = ThisEscapeAnalyzer.this.lint;

On a first look I'm not sure about the granularity of suppression here. I believe that suppressing at the class level, or at the constructor level is enough. Allowing to further annotate var declarations and non-constructor methods, while doable, might actually be counter productive - in the sense that the annotation does not occur in the constructor (where you'd want to see it) but in some other method. I think the fact that a constructor is escaping (willingly) should be a visible thing. Something to consider.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 220:

> 218:             private Lint lint = ThisEscapeAnalyzer.this.lint;
> 219:             private JCClassDecl currentClass;
> 220:             private boolean privateOuter;

I don't think this is needed - see below

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 227:

> 225:                 final boolean privateOuterPrev = this.privateOuter;
> 226:                 final Lint lintPrev = this.lint;
> 227:                 this.lint = this.lint.augment(tree.sym);

general stylistic comment - I see here and everywhere in this class `this.xyz` - this is not the norm in the rest of the javac codebase, and it would be better to replace it with just `xyz`

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 230:

> 228:                 try {
> 229:                     this.currentClass = tree;
> 230:                     this.privateOuter |= tree.sym.isAnonymous();

These can be inlined in the check below

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 270:

> 268:                     final boolean analyzable = this.currentClassIsExternallyExtendable() &&
> 269:                         TreeInfo.isConstructor(tree) &&
> 270:                         !tree.sym.isPrivate() &&

Why aren't private constructors analyzed? If we have a class with a private constructor and public static factory invoking said constructor, and the constructor makes `this` escape, isn't that an issue we should detect?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 294:

> 292:                   !(this.currentClass.sym.isSealed() && this.currentClass.permitting.isEmpty()) &&
> 293:                   !(this.currentClass.sym.owner.kind == MTH) &&
> 294:                   !this.privateOuter;

Here, note that is the owner of the current class symbol is a method, that covers anonymous classes too, which is a part of `privateOuter`. So the only think we need to check here is whether "currentClass" is private, which is a simple predicate. No need to carry `privateOuter` I believe

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 304:

> 302: 
> 303:             // We are looking for analyzable constructors only
> 304:             final Symbol sym = entry.getKey();

This seems unused. And, if so, perhaps we only need a `Set<MethodInfo>`, not a map.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 348:

> 346:         final Comparator<DiagnosticPosition[]> ordering = (warning1, warning2) -> {
> 347:             for (int index1 = 0, index2 = 0; true; index1++, index2++) {
> 348:                 final boolean end1 = index1 >= warning1.length;

Another stylistic comment - the `final` here is not super helpful. The compiler performs effectively final analysis, so if your locals are only written once, you are good to go - and can even use them inside lambdas. From a documentation perspective it might carry a bit of value, but again, the rest of the javac code generally doesn't do that.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 411:

> 409:         final boolean referenceExpressionNode;
> 410:         switch (tree.getTag()) {
> 411:         case CASE:

surprised to see `CASE` here - as that's not an expression

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 444:

> 442:         if (referenceExpressionNode) {
> 443: 
> 444:             // We treat instance methods as having a "value" equal to their instance

The comment is slightly misleading - e.g. I'd suggest clarifying "having a "value" whose type is the same as that of the class in which the method is defined"

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 454:

> 452: 
> 453:             // If the expression type is incompatible with 'this', discard it
> 454:             if (type != null && !this.isSubtype(this.targetClass.sym.type, type))

Instead of adding the direct reference, and then having to check if the reference needs to be removed, would it be possible not to add the reference in the first place if the types mismatch?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 504:

> 502:         // Recurse on method expression
> 503:         this.scan(invoke.meth);
> 504:         final boolean direct = this.refs.remove(ExprRef.direct(this.depth));

Understanding checkpoint. Considering this code:


rec.m()

There are two cases:

* `rec` might be a direct reference to this (e.g. a local)
* `rec` might be an indirect reference to this (a lambda containing `this`)

So the receiver of the method might be direct or indirect. This will then determine how to interpret `this` in the context of that method analysis - e.g. when we see a `JCIdent` for `this`, we create a direct/indirect `ExprRef` based on what the receiver kind was. Correct?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 817:

> 815:         // Methods - the "value" of a non-static method is a reference to its instance
> 816:         final Symbol sym = tree.sym;
> 817:         if (sym.kind == MTH) {

This is perhaps where filtering based on the declaring class could make sense (to avoid having to filter later) ? Perhaps this could also be centralized - e.g. whenever you create an ExprRef you also pass the type for it, and if the type matches that for the current class you create it and add to the list, otherwise you skip it.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 875:

> 873:         // Reference to this?
> 874:         if (tree.name == names._this || tree.name == names._super) {
> 875:             if (this.refs.contains(ThisRef.direct()))

This idiom occurs quite a lot. If I'm correct, this basically amounts at asking as to whether the receiver of the method we're currently evaluating is direct or not (which is an invariant, given a method body - e.g. for a given method this "fact" should stay the same). If that's the case, perhaps capturing this in a flag could be better - then you could have just have a single method e.g. `XYZRef.create(boolean direct)`, and remove the branching (here and elsewhere).

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 900:

> 898:             final Type.ClassType currentClassType = (Type.ClassType)this.methodClass.sym.type;
> 899:             final Type methodOwnerType = sym.owner.type;
> 900:             if (this.isSubtype(currentClassType, methodOwnerType)) {

I believe what you need here is not subtyping but subclassing - see `Symbol.isSubclass`

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 909:

> 907: 
> 908:             // Check for implicit outer 'this' reference
> 909:             if (this.types.hasOuterClass(currentClassType, methodOwnerType)) {

Similarly here - look for `Symbol.isEnclosedBy`

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 1302:

> 1300:      *  In other words, a reference that's sitting on top of the stack.
> 1301:      */
> 1302:     private static class ExprRef extends Ref {

Maybe I'm wrong - but it seems that the only thing we need to track is whether the top of stack (e.g. last evaluated expression) has a direct reference or indirect reference (or none). Do we really need a set for this? (this seems the same as for ThisRef which is used inside method, which can be one of the same three options). While I do see value for tracking which variables are aliases for this (either direct or indirect), it seems like (at least on a superficial look) that expression refs might be replaced by a visitor field/parameter.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ThisEscapeAnalyzer.java line 1319:

> 1317:     /** A reference from the return value of the current method being "invoked".
> 1318:      */
> 1319:     private static class ReturnRef extends Ref {

Isn't this just an ExprRef? This might also be related with the fact that we deal with return values in different ways than with e.g. values returned from a nested scope (where we just pop, and then copy all pending expression to the outer depth).

-------------

PR: https://git.openjdk.org/jdk/pull/11874



More information about the build-dev mailing list