capturing (or not) mutable local variables

Sun Nov 21 03:39:03 PST 2010

Excellent, Brian.

It would mean that a single compiler error would be shown when this is broken:

  - A variable cannot be changed after it has been used in a closure initialization.

Easy, clear and user friendly. No need for funky hacks in the non-mutable case.

This kind of capture will not be seen as being as limited as the current effectively-final definition, whose only real merit is that it saves six keystrokes.
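
For reference, here is a minimal sketch of the workaround that the strict effectively-final rule forces for the "i" example quoted further down. It assumes the lambda syntax that later shipped in Java 8, not the #{ ... } strawman used in this thread, and the class and method names are hypothetical:

```java
import java.util.function.IntSupplier;

class EffectivelyFinalSketch {
    // Hypothetical helper mirroring the "int i = 0; if (something) i = xxx;" example.
    static IntSupplier capture(boolean something, int xxx) {
        int i = 0;
        if (something) {
            i = xxx;
        }
        // 'i' is assigned more than once, so it is not effectively final and
        // writing "() -> i" here would be rejected by the compiler.
        // The usual workaround is to copy it into a second variable first.
        final int captured = i;
        return () -> captured;
    }
}
```

Under option 2) the copy into "captured" would be unnecessary, because "i" is never reassigned after the closure is created.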

And I think there's a big opportunity here as well. The spin doctor in me says that if Oracle carefully communicates that 2) is a very conscious decision, and that the way to capture mutable variables is to use an AtomicXxx, you can almost totally get away with it. You might even come out ahead of other languages, since this is very multithreading-aware from the start.
It is very important that the code examples for lambdas include at least one good use of AtomicXxx for mutable variables. Then this becomes the standard way of implementing mutability in parallel closures. There might even be room for some syntax sweetening of AtomicXxx usage in the future, which makes this even more compelling.
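
A minimal sketch of the kind of AtomicXxx example meant here. It assumes the lambda and stream syntax that later shipped in Java 8 rather than the #{ ... } strawman discussed in this thread, and the class and method names are hypothetical:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCaptureSketch {
    // Sums the values by updating a captured AtomicInteger from inside the lambda.
    static int sumWithAtomic(List<Integer> values) {
        // The local 'sum' is never reassigned; only the object it refers to changes.
        AtomicInteger sum = new AtomicInteger();
        // addAndGet is atomic, so concurrent updates from a parallel stream are not lost.
        values.parallelStream().forEach(x -> sum.addAndGet(x));
        return sum.get();
    }
}
```

Because the variable itself stays effectively final, this compiles under the strict rule as well as under option 2).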

Cheers,
Mikael

On Nov 20, 2010, at 21:41 PM, Brian Goetz wrote:

> (2) is an interesting idea and worth considering.  
> 
> On Nov 20, 2010, at 5:37 AM, Mikael Grev wrote:
> 
>> Please don't make it possible to mutate the variable from within the closure.
>> 
>> 
>> However, I wouldn't mind if "effectively-final" was extended so that either:
>> 
>> 1) The closure works on a copy, giving no restrictions on the initialization and reassignment of the captured variable
>> 2) Any variable is effectively-final as long as it isn't reassigned after the closure initialization. (reordering should be solvable)
>> 
>> The reason for this is that the following isn't that uncommon and "i" needs to be reassigned as another variable to be effectively-final (ugly):
>> 
>> int i = 0;
>> if (something)
>> 	i = xxx;
>> 
>> capture of i
>> 
>> Loosening the definition of effectively-final would effectively make the feature easier to use.
>> 
>> Cheers,
>> Mikael
>> 
>> 
>> On Nov 20, 2010, at 10:29 AM, Brian Goetz wrote:
>> 
>>>> Hi Brian, I came to you yesterday at devoxx to ask about the rationale
>>>> behind the decision to not capture mutable local variables.
>>> 
>>> I admire your passion and your persistence :)
>>> 
>>>> I understand that among several arguments, there is a strong one about
>>>> preventing people from doing unsafe things. you gave the analogy of
>>>> the kid, the fence and the pool.
>>> 
>>> Safety mechanisms do not need to be perfect to be useful.  Your arguments seem to center around "I can climb the fence, so there's no point in having it."  This argument is, to be blunt, silly.  The design principles of the Java language stress safety, even if there are cases where such safety guards can be "casted away" by a sufficiently motivated user.  
>>> 
>>> If you want to capture a not-effectively-final local variable in a closure, then one of the following two things has to happen:
>>> a) The lifetime of the variable must be extended to the lifetime of the closure
>>> b) The lifetime of the closure must be shortened to the lifetime of the variable.
>>> 
>>> In both cases, we would want to ensure the variable is only accessed from the thread capturing the closure.  In (a), we would be effectively creating a whole new class of variables (in addition to the seven (or eight, depending on how you count) already defined by the JLS); these are no longer local variables and should not look like them.   In both (a) and (b), we would be creating a kind of restricted closure, whose execution is restricted in space (confined to a specific thread) and/or time (confined to the lifetime of the scope in which the variable is declared.)  For concreteness let's call these confined variables and confined lambdas.  
>>> 
>>> It is possible to create mechanisms to reify and enforce these various types of confinement (we've discussed them extensively internally.)  However, they add nontrivial complexity to the language.  
>>> 
>>> In order to justify the complexity that these new features would generate, there needs to be a compelling use case.  When I explored this issue, I asked a number of people to write down a use case for this.  Every one wrote some form of:
>>> 
>>> int sum = 0;
>>> list.forEach( #{ x -> sum += x.foo() } )
>>> 
>>> In a parallel world, this idiom is irretrievably broken.  (See Guy Steele's presentation "Organizing Functional Code for Parallel Execution, or, foldl considered slightly harmful.")  People will do this, they will do this without thinking, and their code will be broken.  It is very hard, even for experts, to get this right.  While Java has plenty of other opportunities to create non-thread-safe code, I am not going to create an entirely new and nearly irresistible vector for doing so.  Iteration and side-effects are how we've been trained to do things in Java, but we have to learn to do better.  
>>> 
>>> In sum, adding this feature that you (and others) want so badly seems to add up to:
>>> - Lots of new complexity (e.g., confined variables and lambdas)
>>> - Nearly irresistible new areas for making errors
>>> - All to prop up a broken programming model (iterations + side effects)
>>> 
>>> To quote a past president: "Wouldn't be prudent."  At this point in time, it feels like the risk and complexity outweigh the benefit.  I would rather put the effort into supporting map/reduce-y idioms in the libraries.  The above block is much better as:
>>> 
>>> list.reduce( #{ x, y => x+y } )
>>> or
>>> list.reduce(Reducers.SUM)
>>> or
>>> list.sum()
>>> 
>>>> this thought leads to another one. in your example you are assuming
>>>> that the foreach method will get into multithreaded stuff.
>>> 
>>> Not quite: s/will/may/  
>>> 
>>>> your foreach impl could say:
>>>> 
>>>> public void forEach(@Multithreaded Foreachable block) {...}
>>> 
>>> We explored these issues in the JSR-166 expert group several years ago.  The basic problem is that we don't have a way of reliably expressing "reference to thread-safe object" in the type system (either statically or dynamically), so these things revert to being documentation rather than type assertions.  
>>> 
>>>> for that matter, you should probably name your loop method forEachMT.
>>> 
>>> Under consideration.  Though my intuition is that (a) such a convention will be hard to stick to and (b) in five years it will probably look silly.  
>> 
>>
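
The reduction idiom suggested in the quoted message as a replacement for the "sum += x.foo()" loop can be sketched as follows. This assumes the stream API that later shipped in Java 8, not the list.reduce / Reducers.SUM library shapes floated in the thread, and the class and method names are hypothetical:

```java
import java.util.List;

class ReduceSketch {
    // Sums by reduction: no shared mutable variable is captured, so the
    // result does not depend on how the work is split across threads.
    static int sum(List<Integer> values) {
        return values.parallelStream().reduce(0, Integer::sum);
    }

    // The same sum using the primitive-stream shortcut.
    static int sumViaIntStream(List<Integer> values) {
        return values.parallelStream().mapToInt(Integer::intValue).sum();
    }
}
```

Integer addition is associative and 0 is its identity, which is what makes the parallel reduction safe.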