RFR: 8341782: Allow lambda capture of basic for() loop variables as with enhanced for()
Maurizio Cimadamore
mcimadamore at openjdk.org
Thu Oct 10 17:27:11 UTC 2024
On Thu, 10 Oct 2024 15:51:28 GMT, Archie Cobbs <acobbs at openjdk.org> wrote:
>>> Regardless, I think I found an issue with the current implementation when there are nested loops:
>>>
>>> ```
>>> void test() {
>>>     for (int i = 0; ; ) {
>>>         for (; ;) {
>>>             i = 42;
>>>             Runnable r = () -> System.out.println(i);
>>>         }
>>>     }
>>> }
>>> ```
>>
>> Issues aside, I think a related example:
>>
>>
>> void test() {
>>     for (int i ; ; ) {
>>         for (; ;) {
>>             i = 42;
>>             Runnable r = () -> System.out.println(i);
>>         }
>>     }
>> }
>>
>> Poses an interesting question: should `i` be capturable from the PoV of the inner loop? After all, in the inner loop we know there is one value for `i` on each iteration.
>>
>> Relatedly, what about this?
>>
>>
>> void test(int x) {
>>     int i = 42;
>>     if (x > 42) {
>>         i = 43;
>>     } else {
>>         Runnable r = () -> System.out.println(i);
>>     }
>> }
>>
>>
>> We know that `i` is not reassigned in the `else` branch. But the analysis we have now is not sophisticated enough to detect that.
>>
>> I think there seems to be some more general principle here that is struggling to get out: sometimes there might be a region of code where we know a variable can hold only one value (even if in other regions that variable might be reassigned multiple times). In such a case you can imagine taking the content of the variable and assigning it to a _new_ variable, local to that "well-behaved" region. That new variable could, of course, be captured. The loop case (handled by this PR) seems to be a special case of this situation.
>
>> Regardless, I think I found an issue with the current implementation when there are nested loops:
>
> Yep - thanks, very interesting. To clarify the intent, a variable is capturable if it is not **reassigned in the loop**, which is defined as:
>
> * Wherever the variable occurs as the left hand side in an assignment expression, it is definitely unassigned
> * The variable never occurs as the operand of a prefix or postfix increment or decrement operator
>
> In your example with `i = 42`, the variable `i` is not DU at that point (even though it takes two loops around the inner loop in `AssignAnalyzer` to figure that out). So according to the original intent, it should not be capturable.
>
> I'll work on fixing this (and add a new test case).
>
>> I think there seem to be some more general principle here that is struggling to get out...
>
> Agreed... I had a similar vague thought while wondering "What about `while { }` and `do { }` loops?" In other words, is there some coherent way that we could spread this new capturable goodness more widely?
>
> One simple idea for generalizing the current approach: Within any scope _S_, any normal variable _v_ that is visible in _S_ is capturable if it is not reassigned in _S_. I don't know if the JLS actually defines "scope" but you know the idea: a region of code in which variables can be declared and be visible; so `{ ... }` curly braces obviously, but also the body of a `for()` statement.
>
> An example that would then be allowed:
>
> int i = 0, j = 0;
> while (true) {
>     j += i;
>     {
>         Runnable action = () -> foo(i + j);
>         try {
>             this.executor.submit(action).get();
>         } catch (InterruptedException e) {
>             break;
>         }
>     }
>     i++;
> }
>
> Note the extra curly braces required to create the "well-behaved" region.
>
> It's almost as if the "well-behaved" region "captures" the variables that it uses and makes them temporarily effectively final... which then begs the question, why not just cut out the middle man? That would then put us in a situation where any variable could be captured anywhere (which some have argued for), i.e., everything is effectively final at the point of capture.
>
> So I'm not sure if a coherent middle ground exists or what it would be.
I tend to agree with this line of thinking. And this is, IMHO, a sign that the design has not yet fully settled. The main thing to notice is that there are two stable design points:
1. only capture effectively final vars (where the language is today);
2. capture every variable - with the understanding that what's being captured is the _current_ value (and not a "mutable box" which keeps changing in real time - that would be awful).
Having special rules for loop variables is sending us down the slippery slope of finding the next stable point between (1) and (2) - assuming one exists. The risk here is that once we start relaxing things in some areas (e.g. loops) you can quickly run into issues where other deficiencies in capture become even more apparent.
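
To make the two design points concrete, here is a minimal sketch that compiles under today's rules (1) while giving each lambda the "current value" semantics described in (2). The class name `CaptureDemo` and the method `capturedValues` are made up for illustration; they are not part of the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class CaptureDemo {

    // Builds one lambda per iteration of a basic for loop and returns the
    // values those lambdas report when run after the loop has finished.
    static List<Integer> capturedValues(int n) {
        List<IntSupplier> suppliers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // `i` is modified by i++, so it is not effectively final and
            // cannot be captured directly under today's rules (point 1).
            // Copying it into a fresh local hands the lambda the value `i`
            // holds right now, which is the semantics described in point 2.
            int current = i;
            suppliers.add(() -> current);
        }
        List<Integer> results = new ArrayList<>();
        for (IntSupplier s : suppliers) {
            results.add(s.getAsInt());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(capturedValues(3)); // prints [0, 1, 2]
    }
}
```

Under (2), the compiler would in effect introduce the `current` copy itself at the point of capture; under (1), the programmer has to write it by hand.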
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21415#discussion_r1795837784