RFR: 8015831: Add lint check for calling overridable methods from a constructor [v5]

Wed Jan 11 21:54:17 UTC 2023

On Wed, 11 Jan 2023 21:45:20 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> Regarding the assignment graph approach, I think that would work if the references are bouncing around strictly between variables, but what if the chain includes any of the more complicated stuff that is currently being tracked, such as various Java expressions, method invocations, conditionals, etc.?
>> 
>> Consider this example:
>> 
>> public class ThisEscapeLoop { 
>>     
>>     public ThisEscapeLoop() { 
>>         ThisEscapeLoop ref1 = this;
>>         ThisEscapeLoop ref2 = null;
>>         ThisEscapeLoop ref3 = null;
>>         ThisEscapeLoop ref4 = null;
>>         for (int i = 0; i < 100; i++) {
>>             ref4 = this.returnMe(ref3);
>>             ref3 = ref2;
>>             ref2 = ref1;
>>             if (ref4 != null)
>>                 ref4.mightLeak();
>>         }
>>     }
>> 
>>     public <T> T returnMe(T x) {
>>         return x;
>>     }
>> 
>>     public void mightLeak() {
>>     }
>> }
>> 
>> If you are only looking at variable assignments with values that are other variables, then the chain "breaks" when `ref4` is assigned indirectly via `returnMe()`, and you miss the leak.
>
>> Good idea. Looks like the difference is in the noise, at least on my Macbook:
>> 
>> ```
>> Builds of master (jdk-21+3-69-gc6588d5bb3f)
>> ==================================
>> 
>> Build times:
>> 
>>     real	2m24.650s
>>     user	13m46.727s
>>     sys	2m33.554s
>> 
>>     real	2m27.224s
>>     user	13m43.464s
>>     sys	2m37.251s
>> 
>>     real	2m26.658s
>>     user	13m42.578s
>>     sys	2m36.133s
>> 
>> Builds of ThisEscape (jdk-21+3-125-g6e96a7d76f8)
>> ==================================
>> 
>> Modifications:
>> 
>>     - Reverted files in the make/ subdirectory to enable warning
>>     - Commented out lines 363-382 in ThisEscapeAnalyzer.java
>>       so no warnings are actually reported
>> 
>> Build times:
>> 
>>     real	2m25.912s
>>     user	13m45.860s
>>     sys	2m32.741s
>> 
>>     real	2m27.213s
>>     user	13m44.830s
>>     sys	2m36.596s
>> 
>>     real	2m25.756s
>>     user	13m42.889s
>>     sys	2m35.659s
>> ```
> 
> Thanks for trying it out - good to know that build time isn't affected.

> Regarding the assignment graph approach, I think that would work if the references are bouncing around strictly between variables, but what if the chain includes any of the more complicated stuff that is currently being tracked, such as various Java expressions, method invocations, conditionals, etc.?
> 
> Consider this example:
> 
> ```java
> public class ThisEscapeLoop { 
>     
>     public ThisEscapeLoop() { 
>         ThisEscapeLoop ref1 = this;
>         ThisEscapeLoop ref2 = null;
>         ThisEscapeLoop ref3 = null;
>         ThisEscapeLoop ref4 = null;
>         for (int i = 0; i < 100; i++) {
>             ref4 = this.returnMe(ref3);
>             ref3 = ref2;
>             ref2 = ref1;
>             if (ref4 != null)
>                 ref4.mightLeak();
>         }
>     }
> 
>     public <T> T returnMe(T x) {
>         return x;
>     }
> 
>     public void mightLeak() {
>     }
> }
> ```
> 
> If you are only looking at variable assignments with values that are other variables, then the chain "breaks" when `ref4` is assigned indirectly via `returnMe()`, and you miss the leak.

So, in this example though you are calling an instance method before the object is initialized, which would seem to me like a leak (leaving aside heroics to try and chase the method body and see what the body of the method actually does).

And, if the method is static, same story - you are passing `ref3` somewhere else, and `ref3` potentially contains `this`. 

While I know that this is not perfect, and bound to generate false positives, I get the spirit of my question is, really: is there a (much) simpler scheme we can get away with, which has bounded complexity, and which has the property we care about (which seems to be no false negative). I'm less worried about contrived cases emitting false positives, as it's an optional warning that can be shut down - but I'd like perhaps to move the discussion from trying to detect _precisely_ if the leak happens to try to detect if _potentially_ a leak can happen, and see if there's some simpler analysis that can be used to get there (e.g. one which doesn't require flooding). It's possible that you have already considered all these options and the analysis you have here is the best trade off between complexity and precision - but I'd like to have a better understanding of what the trade offs are, and, more importantly, what happens to real code when we tweak the analysis this or that way (as this is
  a problem where I feel it's easy to get in the land of diminishing returns).

-------------

PR: https://git.openjdk.org/jdk/pull/11874