Effectively final

Rémi Forax forax at univ-mlv.fr
Fri Jul 29 16:27:52 PDT 2011


Hi Tim,
see below

On 07/29/2011 08:42 PM, Tim Fox wrote:
> Brian,
>
> Thanks for your reply. Comments below
>
> On 29/07/2011 19:08, Brian Goetz wrote:
>> You are right to have these concerns.
>>
>> The canonical example is:
>>
>> int fooSum = 0;
>> list.forEach(#{ x ->  fooSum += x.getFoo() });
>>
>> You are correct that there are purely sequential use cases that
>> benefit from this approach, which are not subject to data races.  (On
>> the other hand, it is nearly impossible to write the above
>> primitive-using code so that it is not subject to data races in a
>> parallel environment.)  We have explored approaches of capturing this
>> constraint in the language, so that we could prevent or detect when
>> such a "thread-confined" lambda is used from the wrong thread.  While
>> these are likely feasible, they add complexity.
>>
>> Examples like the above have been around for 50+ years.  However, it
>> is worth noting that they became popular in the context of a
>> sequential, uni-processor world.  Rather than expend energy and
>> introduce additional complexity to prop up an aging and increasingly
>> irrelevant programming idiom,
> I'd have to disagree that this approach is aging; the success of node.js
> and Ruby's EventMachine is a good counter-example. They both use the
> reactor pattern (i.e. a single event loop which executes everything), so
> the developer does not have to worry about concurrency concerns. This is
> a huge win in terms of simplicity for the developer.
>
> Frameworks like node and eventmachine scale over cores by spinning up
> more processes, not threads (since there's only one event loop per
> process). This is less than ideal when you want to share state between
> event loops.
>
> New frameworks like node.x https://github.com/purplefox/node.x (which is
> what I am working on) allow multiple event loops per process (typically
> one event loop per core), and then partition objects so they are "owned"
> by one of the event loops. The framework will then guarantee that all
> callbacks on those objects are always executed by the same event loop.
>
> What you get out of this is the user can write all their code as single
> threaded, but the system as a whole scales well over available cores
> without having to spin up more processes.
>
> If the framework can guarantee this code is always executed by the same
> thread, it seems wrong to force users to use AtomicReferences (or
> whatever) to co-ordinate results from different callbacks.
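
A sketch of that ownership model (the names here are illustrative, not
node.x's actual API): one single-threaded executor per loop, and every
callback for a given owner object is dispatched to the same loop, so
handler code can use plain, unsynchronized mutable state.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of "one event loop per core, each object owned by one loop".
// EventLoopGroup and its methods are illustrative, not a real framework API.
public class EventLoopGroup {
    private final ExecutorService[] loops;

    public EventLoopGroup(int nLoops) {
        loops = new ExecutorService[nLoops];
        for (int i = 0; i < nLoops; i++) {
            // Each loop is a single thread, so tasks for one owner never race.
            loops[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Every callback for a given owner runs on the same loop, so the
    // handler code is effectively single-threaded for that object.
    public void execute(Object owner, Runnable callback) {
        int idx = Math.floorMod(System.identityHashCode(owner), loops.length);
        loops[idx].execute(callback);
    }

    public void shutdown() throws InterruptedException {
        for (ExecutorService loop : loops) {
            loop.shutdown();
            loop.awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}
```

With free-running threads the `counter[0]++` below would be a data race;
because every callback for `conn` is serialized on one loop, it is not.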

Then you will realize that if the computation you have to do dominates
the read and write phases, you will need worker threads (for example,
an HTTP parser, even an asynchronous one, can take more time than a
read or a write); otherwise your server will be stuck waiting, because
the particular thread you need is not available (it is busy parsing
another request).

So you will refine your model again and allow a context object (a
scope) that contains the results of each phase (read, decode, work,
encode, write, etc.) and is passed from thread to thread, so that only
one thread can access the context at a time.

This model restricts side effects to a single object, and
you will be glad that a Java lambda is not able to capture mutable local state :)

Note that you can still refine the model above by adding work stealing
among the reader (resp. writer) threads and fork/join among the worker
threads.
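
As a sketch (all class and method names here are illustrative, not a
real framework): per-request state lives in one Context that is handed
from phase to phase, so only one thread touches it at a time. The
hand-off through an executor establishes a happens-before edge, so the
Context needs no locks or atomics.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Context-object model: read -> decode -> work -> write, each phase
// filling one field of the same Context before handing it to the next
// stage. Only one thread ever accesses a given Context at a time.
public class Pipeline {
    static final class Context {
        byte[] raw;       // result of the read phase
        String request;   // result of the decode phase
        String response;  // result of the work phase
    }

    private final ExecutorService decoder = Executors.newSingleThreadExecutor();
    private final ExecutorService workers = Executors.newFixedThreadPool(4);
    private final ExecutorService writer  = Executors.newSingleThreadExecutor();

    void read(Context ctx, byte[] bytes, Runnable done) {
        ctx.raw = bytes;                        // read phase fills the context
        decoder.execute(() -> decode(ctx, done));
    }

    private void decode(Context ctx, Runnable done) {
        ctx.request = new String(ctx.raw);      // decode phase: parse the bytes
        workers.execute(() -> work(ctx, done));
    }

    private void work(Context ctx, Runnable done) {
        // Potentially CPU-heavy phase, run on a worker pool so it does
        // not stall the reader threads.
        ctx.response = "echo: " + ctx.request;
        writer.execute(done);                   // write phase would flush ctx.response
    }

    void shutdown() {
        decoder.shutdown();
        workers.shutdown();
        writer.shutdown();
    }
}
```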

>> we instead are directing our energies towards providing more modern,
>> parallel-friendly idioms, like:
>>
>>    int fooSum = list.reduce(0, Reducers.INT_SUM);
>>
>> which is more compact, more readable (once you learn what "reduce"
>> means), less error-prone, and can be parallelized by the library.
>>
>> You state that "most languages that support closures do allow capture
>> of mutable local variables."  However, most of them have
>> {little,no,bad} support for safe concurrency or parallelism.  If Java
>> were a strictly sequential language, the story might be different, but
>> that isn't the reality -- or desirable.
>>
>> We had three basic choices here:
>>   1.  Allow arbitrary mutable capture, and leave it to the user to not
>> create data races.
>>   2.  Complicate the language further by introducing a notion of
>> thread-confined lambdas, and allow mutable local capture by
>> thread-confined lambdas.
>>   3.  Prohibit mutable local capture entirely.
>>
>> To do (1) in 2011 would be foolish;
> I agree that 3) should be the default, since most people won't be using
> frameworks that guarantee access is always by the same thread. But I
> think it would be great if 2) were an option for those frameworks (and I
> think they will be a growing trend), which would benefit from the simpler
> programming style it would allow.
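
For reference, a sketch of the reduction idiom discussed above. The
draft `list.reduce(0, Reducers.INT_SUM)` API only existed in the
prototype, so the code below models it with an illustrative functional
interface; the point is that an associative reducer carries no shared
mutable state, so a library is free to split the list and combine
partial results in parallel.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in for the draft reduce API: a sequential reference
// implementation of reduction over an int-valued list. Because the
// reducer only combines an accumulator with the next element, there is
// no captured mutable local for threads to race on.
public class ReduceSketch {
    interface IntReducer { int apply(int acc, int next); }

    static int reduce(List<Integer> list, int identity, IntReducer r) {
        int acc = identity;
        for (int x : list) acc = r.apply(acc, x);   // sequential reference impl
        return acc;
    }
}
```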

Rémi



More information about the lambda-dev mailing list