<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 12/06/2024 22:06, Archie Cobbs
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div>Hi Maurizio,</div>
<div><br>
</div>
<div>Thanks for looking into these issues. I agree this is
definitely one of those "we need to rethink and refactor"
opportunities. Please let me know if/how I can help. So far
I've done some complaining but not much else :)</div>
<div><br>
</div>
<div>I'm trying to wrap my head around what's needed and have
tried to write this out in detail mainly for my own
understanding. Please tell me if the below sounds correct to
you.<br>
</div>
<div><br>
</div>
<div>Observation: this issue doesn't really affect non-static
member inner classes: they can never be declared in an early
construction context of their directly enclosing class. So
they always have an immediate outer instance, and it's always
available and usable (or explicitly provided via qualified
new). So all the complexity comes with local and anonymous
classes, especially when they are declared inside early
construction contexts with various levels of nesting.
Fortunately, the JEP treats local and anonymous classes
basically the same way.<br>
</div>
</div>
</blockquote>
Correct. Btw, I think I like the term "Member inner classes" to mean
"non-static". We typically used "nested class" to mean the static
kind. In any case, these classes are *not* problematic. Local and
anonymous (where the latter is just a special case of the former)
are. And the first big issue implementation-wise is that javac is
attempting to translate _all_ kinds of inner classes (whether
member, or local) in the same way, which leads to issues.<br>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Some terminology (just placeholders for now, I'm mainly
trying to nail down the logic):<br>
</div>
<div>
<ol>
<li>Every class declaration C has zero or more lexically
enclosing class declarations; let's call them E₁, E₂,
E₃... ordered from the inside out.</li>
<li>For ease of terminology, define the degenerate case E₀ =
C<br>
</li>
<li>For each Eᵢ, there is possibly an <b>outer instance</b>
Oᵢ <b>defined</b> for C (see below)<br>
</li>
<li>For ease of terminology, define the degenerate case O₀ =
C.this.</li>
<li>When is Oᵢ <b>defined</b> for C?</li>
<ol>
<li>O₀ is always defined</li>
<li>Let k ≥ 0 be the smallest value such that Eₖ is static
(enum, record, declared in a static context, etc.)<br>
</li>
<li>For all i ≤ k, the outer instance Oᵢ is defined for C
and it has type Eᵢ</li>
<li>For all i > k there is no outer instance Oᵢ defined
for C</li>
</ol>
<li>If Oᵢ is defined, it may also be the case that Oᵢ is <b>accessible</b>
in C (in non-static contexts):</li>
<ol>
<li>Oᵢ is <b>accessible</b> in C if and only if C does
not appear in an early construction context of Eᵢ</li>
</ol>
</ol>
</div>
</div>
</blockquote>
Correct. The key observation here is that the set of lexically
enclosing classes does not map 1:1 with the set of outer instances.<br>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div>From the above, you can see that the outer classes for
which a corresponding outer instance is defined includes C and
is a contiguous range up until you hit the first static
class/context. However, the set of outer classes for which the
corresponding outer instance is both defined <i>and
accessible</i> is an arbitrary subset of that contiguous
range. In effect we create an "inaccessibility hole" at any
class Eₖ for which C is inside an early construction context
of Eₖ.</div>
</div>
</blockquote>
Yep, we sometimes refer to issues like these as "Swiss cheese"
problems :-)<br>
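<p>To make the "hole" concrete, here is a small toy model of the
definitions quoted above. It is only a sketch and not javac code: the
class name, the method names and the two boolean arrays are made up for
illustration. Index 0 stands for C itself and higher indices for the
enclosing classes from the inside out.</p>
<pre>
```java
// Toy model of the quoted definitions; all names here are hypothetical.
class OuterInstanceModel {

    // Number of levels with a defined outer instance: indices 0..k, where k
    // is the first static class, or every level if none is marked static.
    static int definedCount(boolean[] isStatic) {
        for (int i = 0; i < isStatic.length; i++) {
            if (isStatic[i]) {
                return i + 1;
            }
        }
        return isStatic.length;
    }

    // result[i] is true when O_i is defined and C does not appear in an
    // early construction context of E_i.
    static boolean[] accessible(boolean[] isStatic, boolean[] inEarlyContextOf) {
        boolean[] result = new boolean[isStatic.length];
        int defined = definedCount(isStatic);
        for (int i = 0; i < defined; i++) {
            result[i] = !inEarlyContextOf[i];
        }
        return result;
    }
}
```
</pre>
<p>With four levels where only the outermost class is static and C sits
in an early construction context of the class at index 2, all four outer
instances are defined but the one at index 2 is not accessible, while its
neighbours on both sides are.</p>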
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Observation: Suppose Oₖ of type Eₖ is defined for C. Then
for any h > k, Oₕ is defined (accessible) for Eₖ if and
only if it is defined (accessible) for C. In other words, if a
class F encloses both C and Eₖ, then C and Eₖ agree on whether
F has a defined and/or accessible outer instance. This is
simply because C and Eₖ are lexically in the "same place" with
respect to F.<br>
</div>
</div>
</blockquote>
Yes.<br>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Define the <b>compiler outer instance</b> for C to be that
Oₖ for which k > 0, Oₖ is both defined and accessible, and
where k is minimal (if such a thing exists).</div>
</div>
</blockquote>
Yes, that is a possible strategy to tackle that. There is also
another (see below).<br>
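<p>For reference, the definition quoted above can be written down as a
single loop. This is a sketch under the same assumptions as the
definition itself; the class, the method and the two flag arrays are
hypothetical and do not correspond to anything in javac.</p>
<pre>
```java
// Hypothetical helper: index 0 is C, higher indices are E1, E2, ...
class CompilerOuterInstance {

    // Returns the smallest k greater than zero for which O_k is both
    // defined and accessible, or -1 if there is no such k.
    static int find(boolean[] isStatic, boolean[] inEarlyContextOf) {
        for (int k = 0; k < isStatic.length; k++) {
            if (k > 0 && !inEarlyContextOf[k]) {
                return k;
            }
            if (isStatic[k]) {
                break; // nothing beyond the first static class is defined
            }
        }
        return -1;
    }
}
```
</pre>
<p>If C appears in an early construction context of E₁ only, the result
is 2. If C itself is static, the result is -1.</p>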
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr"><br>
<div>OK now given the above what does the compiler need to do?</div>
<div>
<div>
<div><br>
</div>
<div>Part 0: The compiler needs to be able to calculate,
given any class C, whether C has a compiler outer instance
(let's call it Oₖ) and its type (let's call it Eₖ)</div>
</div>
</div>
</div>
</blockquote>
Yup<br>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div>
<div><br>
</div>
</div>
<div>Part 1: When compiling some class C for which a compiler
outer instance exists:</div>
<div>
<ol>
<li>Calculate the type Eₖ</li>
<li>Add a synthetic constructor parameter Oₖ of type Eₖ to
each constructor of C<br>
</li>
<li>Add a synthetic field of type Eₖ in which to store Oₖ
("this$n")<br>
</li>
<li>Store Oₖ in that field in each constructor of C (very
first thing)<br>
</li>
</ol>
</div>
</div>
</blockquote>
Yes. Step (3) is done "on-demand", but the essence is the same.<br>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div>
<div>Part 2: Compiling the expression "Foo.this" in the
context of some class C:<br>
</div>
<ol>
<li>If C has no compiler outer instance, then error unless
Foo = C.</li>
<li>Otherwise let Oₖ with type Eₖ be the compiler outer
instance for C<br>
</li>
<li>For expressions like "Eₕ.this" where 0 < h < k:
generate an error ("no instance in scope")<br>
</li>
<li>For expressions like "Eₖ.this": evaluate to Oₖ as
follows:</li>
<ol>
<li>If expression occurs in an early construction context
of C, then read the synthetic constructor parameter of
type Eₖ (we can do this because we are necessarily in a
C constructor)</li>
<li>Otherwise, read the synthetic field "this$0"</li>
</ol>
<li>For expressions like "Eₕ.this" where h > k, evaluate as
follows:</li>
<ol>
<li>Evaluate "Eₖ.this" per previous steps to get Oₖ</li>
<li>Recurse, i.e., evaluate "Eₕ.this" in the context of
class Eₖ. This is valid due to the earlier observation.<br>
</li>
</ol>
</ol>
</div>
</div>
</blockquote>
Yes - note the slight complication in step 4 due to the fact that
accessing the field might not always be possible.<br>
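<p>The recursion in Part 2 can be sketched as a toy model. Everything
here is an assumption made for illustration: classes are numbered from
the inside out, <code>isStatic[j]</code> says that class j has no
enclosing instance, and <code>earlyInParent[j]</code> says that class j
is declared inside an early construction context of class j + 1. The
model ignores the parameter-versus-field distinction from step 4 and
only reports which hops would be taken.</p>
<pre>
```java
// Toy model of the "Foo.this" resolution steps; names are hypothetical.
class QualifiedThisModel {

    // Smallest enclosing index whose outer instance is defined and
    // accessible from class c, or -1 if there is none.
    static int compilerOuter(int c, boolean[] isStatic, boolean[] earlyInParent) {
        for (int e = c + 1; e < isStatic.length; e++) {
            if (isStatic[e - 1]) {
                return -1; // static boundary: nothing further out is defined
            }
            if (!earlyInParent[e - 1]) {
                return e;
            }
        }
        return -1;
    }

    // Returns the chain of hops from class c to class target, or null when
    // the model reports "no instance in scope".
    static String resolve(int c, int target, boolean[] isStatic, boolean[] earlyInParent) {
        if (target == c) {
            return "E" + c + ".this";
        }
        int k = compilerOuter(c, isStatic, earlyInParent);
        if (k == -1 || target < k) {
            return null;
        }
        String rest = resolve(k, target, isStatic, earlyInParent);
        return rest == null ? null : "E" + c + ".this -> " + rest;
    }
}
```
</pre>
<p>For a four-level chain where class 0 is declared in an early
construction context of class 1, resolving class 3 from class 0 hops over
class 1, and asking for class 1 itself yields no instance.</p>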
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div>Part 3: When compiling "new C()" (or "C::new") where C is
some class having a compiler outer instance:</div>
<div>
<ol>
<li>Calculate the type Eₖ - the type of the synthetic
parameter Oₖ we need to prepend to the C() constructor
invocation<br>
</li>
<li>Let's assume "new C()" is inside some method,
constructor, or initializer of some class T</li>
<li>Evaluate "Eₖ.this" in the context of class T (see Part
2)<br>
</li>
<li>Proceed with construction, prepending the synthetic Oₖ
parameter<br>
</li>
</ol>
</div>
<div>Hopefully this is close to capturing how it "should"
work...?<br>
</div>
</div>
</blockquote>
<p>Yes. This is one way to do it. Another way to do it would be to
say that local/anonymous classes do not get a blessed "compiler
outer instance", period. Instead, they just capture all the
enclosing "this" they need in order to get the job done, without
the expectation that one enclosing this will be reachable from the
other.</p>
<p>This is based on the observation that there are really two ways to
get to the same result:</p>
<p>* Lower has this mechanism, called "outerThisStack" which more or
less tracks what you have called "compiler outer instance". E.g.
at a given point in the code, what enclosing instances are available
to me?<br>
* But there's also another mechanism - that for captured values
(Lower calls this "proxies"). This is yet another set of values
that are accessible to the local/anon class via a field (or a
constructor parameter, if inside a constructor)</p>
<p>The approach you described tries to tweak "outerThisStack" so
that the stack contains the right bits (and skips over
inaccessible enclosing instances). The alternate approach is to
just ditch "outerThisStack" for local/anon classes, and switch to
a more capture-oriented translation. Note that lambda translation
(LambdaToMethod) is more in the latter camp - e.g. if a lambda
occurs in a pre-construction context, and needs enclosing
instances O1, O2, O3, then it will accept _all_ of them as
captured parameters.</p>
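<p>As a rough illustration of the difference between the two styles,
here is a hand-written analogy in plain Java. It is not what javac
emits: the class and field names are invented, and the "local class" is
written as an ordinary top-level class so that the links are visible.</p>
<pre>
```java
// Hand-written analogy only; javac's real output differs in naming and detail.
class Outer {
    final String name = "outer";
}

class Middle {
    final Outer outer; // stands in for Middle's own link to its Outer instance

    Middle(Outer outer) {
        this.outer = outer;
    }
}

// Chain style: keep one enclosing instance and hop outward through it.
class LocalChained {
    private final Middle middle;

    LocalChained(Middle middle) {
        this.middle = middle;
    }

    String outerName() {
        return middle.outer.name;
    }
}

// Capture style: take exactly the enclosing instances the body uses.
class LocalCaptured {
    private final Outer capturedOuter;

    LocalCaptured(Outer capturedOuter) {
        this.capturedOuter = capturedOuter;
    }

    String outerName() {
        return capturedOuter.name;
    }
}
```
</pre>
<p>The chained version needs a usable Middle instance to reach Outer.
The captured version does not, which is the property that matters when
the Middle instance is still under construction.</p>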
<p><br>
</p>
<p>There's also a second, more subtle, implementation issue, which
I'm in the process of evaluating as we speak. It seems to me that
the ordering of the compiler steps Lower and LambdaToMethod is
backwards: LambdaToMethod goes first, then we translate inner
classes with Lower. But this leads to issues: the lambda
translation code doesn't see the full picture as local classes
have not been translated yet. So it basically has to "guess" which
variables will need to be captured for local classes inside the
lambda to work. This is not great, and leads to bugs that are very
difficult to solve, such as this:</p>
<p><a class="moz-txt-link-freetext" href="https://bugs.openjdk.org/browse/JDK-8334037">https://bugs.openjdk.org/browse/JDK-8334037</a></p>
<p>I'm currently doing some experiments to see if we can move the
compiler phases the "right way" (but this is a big change, and not
something we can do for 23). My working theory is that if we (a)
fix translation of local classes as described above (using either
approach) AND (b) move LambdaToMethod _after_ Lower, then most
of the pesky translation issues with pre-construction contexts
should disappear.<br>
</p>
<p>Maurizio<br>
</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:CANSoFxtc_Q+Q2sc-5UAyjS03dzA23K+qSC+WRi5MkcLo_8wzGQ@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>-Archie<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2024 at
7:07 AM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
as we worked through the details of the new JLS rules in
JEP 482, we <br>
have found several issues where javac is emitting incorrect
code (e.g. <br>
attempting to capture enclosing instances that are still under
<br>
construction).<br>
<br>
In order to allow us to look at these issues more
holistically, I've <br>
created a new JBS label, namely "javac-pre-capture". Here's a
query to <br>
show them all:<br>
<br>
<a href="https://bugs.openjdk.org/issues/?jql=labels%20%3D%20javac-pre-capture" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://bugs.openjdk.org/issues/?jql=labels%20%3D%20javac-pre-capture</a><br>
<br>
As stated above, this label is meant for issues that <br>
have to do with a broken translation strategy for enclosing instance
references. <br>
Such issues can manifest themselves in different ways:<br>
<br>
* compiler crashes when compiling correct code (e.g. <br>
<a href="https://bugs.openjdk.org/browse/JDK-8334037" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://bugs.openjdk.org/browse/JDK-8334037</a>)<br>
* "late" compiler error (e.g. in Lower) as the compiler can't
resolve <br>
an enclosing instance (e.g. <a href="https://bugs.openjdk.org/browse/JDK-8334121" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://bugs.openjdk.org/browse/JDK-8334121</a>)<br>
* compiler succeeds when compiling a correct program, but the
compiled <br>
program doesn't verify (we don't have an instance of this,
yet, but it's <br>
something bound to occur)<br>
<br>
The scope of this label is somewhat (deliberately) narrow. As
such, the <br>
label does NOT cover other issues pertaining to JEP 482, such
as<br>
<br>
* compiler attempting to translate a bad program <br>
(<a href="https://bugs.openjdk.org/browse/JDK-8334043" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://bugs.openjdk.org/browse/JDK-8334043</a>)<br>
* compiler generating invalid code for reasons that have
nothing to do <br>
with capture (<a href="https://bugs.openjdk.org/browse/JDK-8332106" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://bugs.openjdk.org/browse/JDK-8332106</a>)<br>
<br>
(if we think that a more general label would be useful to mark
all <br>
issues that have to do with pre-construction context, we can
do that too).<br>
<br>
Cheers<br>
Maurizio<br>
</blockquote>
</div>
<br clear="all">
<br>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">Archie L. Cobbs<br>
</div>
</blockquote>
</body>
</html>