Loosening requirements for super() invocation
Archie Cobbs
archie.cobbs at gmail.com
Thu Nov 3 22:50:42 UTC 2022
On Thu, Nov 3, 2022 at 8:07 AM Brian Goetz <brian.goetz at oracle.com> wrote:
> While we could do more warnings as ordinary RFEs, there's potentially
> still value in grouping these all under a JEP, for the reasons you hint at
> and more. Let's work through the design, figure out if we can actually
> provide any new safety guarantees (as opposed to just more warnings), and
> then figure out the best vehicle.
>
Sounds good.
I'll try to throw out a starting point...
Meta-comment: it's probably prudent to start with something relatively
narrow and well-defined. Once that idea is fully understood we can expand
from there.
OK let's define the problem. IMHO the most basic problem relates to final
fields. Obviously, Java is advertising them as special by guaranteeing
immutability and safe publication after construction. Yet it's still easy
for a final field to be read incorrectly before it's been properly
initialized. The FilteredSet example demonstrates this, and also shows how
non-obvious it can be. So having compiler automation that would help you
avoid this problem makes sense.
Ideally we would like to have some warning X for which we can say "If a
class A has a constructor that (a) compiles without generating any X
warnings, and (b) has no @SuppressWarnings("X") annotation, then the
compiler guarantees that no final field in class A can ever be read before
it's initialized".
Note the narrowness of this: we are only talking about final fields in
class A. Not fields in A's superclass or subclasses. We're not talking
about "object initialization", we're just talking about individual fields.
>From a developer point of view, this is a feature: you can assess (and
address) the situation one class and one field at a time.
Even so, from a practical point of view, solving this problem is likely to
cover the vast majority of the real-world cases where "this escape" causes
trouble, at least in my experience. The "this escape" problems I've seen
invariably can be solved by moving some final field initialization to
before the super() call (which will now be possible), as in the FilteredSet
example.
There is another subtlety behind this idea: the warning needs to be emitted
at the point of offense, so we need to define where that is.
Consider the FilteredSet example: which of FilteredSet, HashSet, or
AbstractCollection is "at fault"??
Is the problem the fault of the HashSet() constructor, for invoking addAll()?
That's a superclass method, not a subclass method, so that seems
reasonable. Is it the fault of AbstractCollection.addAll(), which calls
add(), which is very likely to be overridden in a subclass? That seems
perfectly normal - after all addAll() is a regular method, not a
constructor. Or is it the fault of FilteredSet for not initializing the
filter field prior to invoking super()?
My thought on this is sort of akin to the old saying that "good fences make
for good neighbors". In other words, don't rely on others to not overstep
when you have a simple way to prevent that yourself. Now that we're giving
classes the ability to protect themselves by doing final field
initializations prior to super() calls, let's put the burden of protection
on them, not on some unrelated superclass. In other words, FilteredSet is
"to blame" here.
OK so now let's analyze when a final field myFinalField in class MyClass
could possibly be read prior to initialization...
Consider any path of execution through a constructor. It will invoke either
this() or super() exactly once. If it invokes super(), it will also
initialize myFinalField at some point, either explicitly, or implicitly -
immediately after the super() call if the field has an initializer or is
assigned in an initialization block.
Observation #1: Prior to invoking this() or super(), it's impossible for
any field to be read (JVM guarantee), so clearly it's impossible for
myFinalField to be read prior to initialization. No need to check here.
Observation #2: If the constructor invokes this(), we assume by induction
that the other constructor has already been checked and any warnings
generated. Upon its return, myFinalField must be initialized already so
we're done here too.
So we only need to be concerned with the time during and after a super()
call.
If myFinalField is explicitly assigned prior to invoking super(), then
we're done - it's impossible for there to be any reference prior to
initialization by Observation #1.
Therefore, we only need to be concerned with the case where myFinalField is
initialized after the super() call.
Let's call the time from the super() invocation to the point where
myFinalField gets initialized the "vulnerability window for myFinalField"
and any access to myFinalField within this window an "early access".
We want the compiler to generate a warning unless it can prove to itself
that there are no early accesses to myFinalField.
There are two cases: early accesses during the super() call, and early
accesses after the super() call.
Case #1 - Early accesses during the super() call
We can make a special case here for classes that extend java.lang.Object,
where we can assume the superclass constructor does nothing. Done.
Otherwise, we observe the following: a superclass constructor could
downcast itself and, if myFinalField is public (or if in the same package,
any non-private), access it directly. It could also access any public (or
if in the same package, any non-private) method of MyClass without
necessarily having to downcast - and one of those methods is very likely to
be able to access myFinalField because that's very common.
So it seems pretty hopeless trying to prove there are no possible early
accesses to myFinalField when the superclass is not Object and we can't
inspect its constructor. It might be possible if myFinalField were private,
and none of MyClass's non-private methods accessed it, even indirectly.
Even then it would take work to prove, and this would only help in a few
uncommon cases.
Hmm... moving on...
Case #2 - Early accesses after the super() call
The only way myFinalField could be accessed is if the constructor itself
does so directly, passes off this to somewhere we can't follow it, or if it
(perhaps indirectly) invokes a non-static method of MyClass (or a subclass
of MyClass if myFinalField is not private) which contains an early access.
Here there may be a little more hope.
Let's say an AST subtree is "nested" if it appears inside a lambda or inner
class.
Let's define these fairly broad/conservative categories of AST subtress
that could possibly lead to an early access of myFinalField:
1. Non-nested expressions of the form myFinalField, this.myFinalField,
or MyClass.this.myFinalField
2. Non-nested non-static method invocations explicitly targeting this,
MyClass.this, super or SomeType.super, or implicitly targeting this
instance, e.g., hashCode()
1. Except for private or final methods declared in MyClass that
themselves have no early access
3. Non-nested expressions this or MyClass.this that are not part of a
field or method access (e.g., method parameters)
4. Non-nested instantiations of any inner class for which MyClass is the
outer class
5. Non-nested lambda expressions
Notes:
- We can define these modulo extraneous parentheses, so for example
((this)).method() would be in category #2
- The "Except for" methods in #2 are defined recursively, but they
should still be well-defined.
- In constructors, category #1 is already taken care of for us - these
expressions already generate a compiler error (once pr#10956
<https://github.com/openjdk/jdk/pull/10956> is accepted :)
- But not in regular methods, so we'd need checks in that case
I *think* every early access after the super() call would have to result
from an AST matching one of the above categories.
I'll stop here for now... thoughts?
-Archie
--
Archie L. Cobbs
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-dev/attachments/20221103/4ac9cf20/attachment-0001.htm>
More information about the amber-dev
mailing list