Structured concurrency: TaskHandle

Fri May 12 15:01:21 UTC 2023

----- Original Message -----
> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "loom-dev" <loom-dev at openjdk.org>
> Sent: Friday, May 12, 2023 2:37:31 PM
> Subject: Re: Structured concurrency: TaskHandle

> On 12/05/2023 11:39, Remi Forax wrote:
>> I really like the fact that STS.fork() does not return a Future but a TaskHandle
>> because as the JEP [1] said, the idea is to only give access to the resulting
>> value (get()) or the exception once join() is called. But I think this approach
>> can be refined along several axes.
>>
>> 1. I believe the method state() should also work like get()/exception(), i.e. if
>> a user calls state() before calling join(), an ISE should be thrown.
>> Practically, it means that state() cannot return RUNNING so the enum State can
>> be simplified to only 3 states: SUCCESS, FAILED and CANCELLED.
> Having "get state" method throw ISE would be a bit strange, and I think
> reduce debugging to the toString method.

If the ISE message is something along the lines of "the computation is still running!", not much debugging is needed.
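
To make the proposal concrete, here is a minimal sketch of a handle whose state() refuses to answer before join(). GuardedHandle, markJoined() and the constructor are hypothetical names for illustration only; they are not part of the JDK API, and the outcome is supplied up front just to keep the sketch short.

```java
// Hypothetical sketch, not the JDK's TaskHandle.
final class GuardedHandle<T> {
    // No RUNNING constant: state() is only readable once join() has returned.
    enum State { SUCCESS, FAILED, CANCELLED }

    private final State state;
    private final T value;
    private volatile boolean joined;  // would be flipped by the owning scope after join()

    GuardedHandle(State state, T value) {
        this.state = state;
        this.value = value;
    }

    // Stand-in for what the scope would do when join() completes.
    void markJoined() {
        joined = true;
    }

    State state() {
        if (!joined) {
            throw new IllegalStateException("the computation is still running!");
        }
        return state;
    }

    T get() {
        // state() performs the join check, so get() inherits the same ISE.
        if (state() != State.SUCCESS) {
            throw new IllegalStateException("task did not succeed: " + state);
        }
        return value;
    }
}
```

With this shape, calling state() too early fails with a message that already says what went wrong.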

> 
> For the most part, you should find that you either don't need the return
> from fork, or you just treat it as a Supplier and get the result when
> you know the task has completed successfully.

I would put it the other way around: you need state() to know whether the task has completed successfully, and only then can you call get() or use the handle as a Supplier.
Technically, you do not need state() before calling join().
But more on that below.

> 
>> 2. I believe CANCELLED is a weird state. First, a lot of other asynchronous
>> libraries merge FAILED and CANCELLED. Then, when shutdown() is called, from a
>> user POV, the corresponding state can be either CANCELLED or FAILED(with an
>> exception InterruptedException). So as a user, getting CANCELLED as result of
>> state() is not enough to know if the task was shutdown or not. I propose to
>> remove the state CANCELLED and use FAILED + a newly created
>> InterruptedException instead.
> 
> I wouldn't expect a non-deterministic state with TaskHandle when you
> shutdown. Maybe that comment is about when the API used Future where
> this was the case?

The non-determinism comes from the way Thread.interrupt() works. If the thread is interrupted during a blocking call, or reaches a blocking call after being interrupted, an InterruptedException is thrown. If there is no blocking call, or Thread.currentThread().interrupt() is called, only the interrupt flag is set.
I propose that if the flag is set, the state of the task should be FAILED, and that exception() should produce an InterruptedException (one with no stack trace, so it can be shared).
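
The two outcomes can be observed with plain threads. The sketch below is only an illustration of Thread.interrupt() itself, not of StructuredTaskScope; InterruptDemo and observe() are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: shows the two ways an interrupt can surface.
final class InterruptDemo {
    /** Runs a task in a fresh thread, interrupts it, and reports what the task saw. */
    static String observe(boolean blocking) {
        String[] outcome = new String[1];
        CountDownLatch started = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            started.countDown();
            if (blocking) {
                try {
                    // Interrupted while sleeping, or already interrupted when
                    // reaching sleep(): both cases throw InterruptedException.
                    Thread.sleep(10_000);
                    outcome[0] = "completed";
                } catch (InterruptedException e) {
                    outcome[0] = "InterruptedException";
                }
            } else {
                // No blocking call: the task only ever sees the flag.
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.onSpinWait();
                }
                outcome[0] = "flag only";
            }
        });
        t.start();
        try {
            started.await();
            t.interrupt();
            t.join();  // join() makes the write to outcome[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo thread was itself interrupted", e);
        }
        return outcome[0];
    }
}
```

Here observe(true) reports the exception and observe(false) reports only the flag, which is why the same shutdown can look like either FAILED or CANCELLED from the user's point of view.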

> 
> There may be merit in dropping CANCELLED but it does create an
> inconsistency in that handleComplete is only called to handle tasks that
> complete successfully or fail. If the state were dropped then would
> you expect a storm of calls to handleComplete for failed-with-IE tasks?
> 

Taking a step back, there are two kinds of state: the state of the task and the state of the object passed to handleComplete.
The former can be RUNNING, SUCCESS, FAILED or CANCELLED; the latter is only SUCCESS or FAILED.
If we have two different objects, each one can have its own enum, TaskState and ResultState, and everything is fine.
If we use TaskHandle for both, switching on the state inside handleComplete has two unreachable states, but the compiler does not know that.

So perhaps the solution is to have two different states.
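
A minimal sketch of the difference, using the two enum names from above. CompletionHandler and its methods are hypothetical and only stand in for what a handleComplete implementation might switch on.

```java
// Hypothetical enums following the naming used above.
enum TaskState { RUNNING, SUCCESS, FAILED, CANCELLED }
enum ResultState { SUCCESS, FAILED }

final class CompletionHandler {
    // With a dedicated ResultState, the switch is exhaustive with two cases.
    static String onResult(ResultState state) {
        return switch (state) {
            case SUCCESS -> "record the value";
            case FAILED -> "record the exception";
        };
    }

    // With the shared TaskState, the compiler insists on the two cases
    // that can never reach a completion callback.
    static String onTask(TaskState state) {
        return switch (state) {
            case SUCCESS -> "record the value";
            case FAILED -> "record the exception";
            case RUNNING, CANCELLED ->
                throw new AssertionError("unreachable in a completion callback: " + state);
        };
    }
}
```

The second method is the price of reusing one enum for both roles: dead branches that exist only to satisfy the compiler.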

> 
>>
>> 3. TaskHandle is a mutable "active object", storing it (for example in a queue)
>> is a common error. TaskHandle should give access to an immutable result object,
>> a record-like object that represents SUCCESS (T value) | FAILED(Throwable
>> exception) instead of providing the methods get()/exception(). With that,
>> handleComplete() should take this result object as parameter instead of
>> TaskHandle so users have less chance to store TaskHandle in a collection.
> We've explored several APIs and shapes, including Supplier<Result> and
> variations that included ADT for the Result. 

ADTs in Java are not yet ready for that; the ADT we want needs to be able to denote the bottom type.
  sealed interface Result<T> permits Success, Failed {}
  record Success<T>(T value) implements Result<T> { }
  record Failed(Throwable exception) implements Result<Nothing> { }  // Nothing is the bottom type

We need the bottom type, otherwise the switch will never be exhaustive.
We may also need to say that the T of Result<T> is covariant, depending on how the type system around Nothing behaves.

Instead of using an ADT, a simpler solution that actually works is to declare an immutable class (the same way Optional is declared); once pattern methods are introduced in Java, one will be able to write
  switch(result) {
    case Result.success(T t) -> ...
    case Result.failed(Throwable exception) -> ... 
  } 
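
Until then, such a class could look roughly like the sketch below. It is only one possible shape: the factory names success()/failed() follow the switch above, while isSuccess(), get() and exception() are assumptions of mine, not an existing or proposed JDK API.

```java
import java.util.Objects;

// Hypothetical immutable result, declared the same way Optional is:
// final class, private constructor, static factories.
final class Result<T> {
    private final T value;              // meaningful only when exception == null
    private final Throwable exception;  // null for a successful result

    private Result(T value, Throwable exception) {
        this.value = value;
        this.exception = exception;
    }

    static <T> Result<T> success(T value) {
        return new Result<>(value, null);
    }

    static <T> Result<T> failed(Throwable exception) {
        return new Result<>(null, Objects.requireNonNull(exception));
    }

    boolean isSuccess() {
        return exception == null;
    }

    T get() {
        if (exception != null) {
            throw new IllegalStateException("result is a failure", exception);
        }
        return value;
    }

    Throwable exception() {
        if (exception == null) {
            throw new IllegalStateException("result is a success");
        }
        return exception;
    }
}
```

Because the object is immutable, storing it in a queue or a collection is harmless, unlike storing the handle itself.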

> When you work through
> examples with heterogenous and homogenous tasks then what we've got now
> isn't too bad.

For me, having two different objects, task and result, makes sense because their states are different (see above).

> 
> 
>> An interesting follow-up question is whether the TaskHandle objects should be
>> invalidated when close() is called. While it would be nice to have this
>> behavior, it means that it extends the lifetime of the TaskHandle objects so
>> I've ruled out that idea.
> close does a shutdown so it will "cancel" any tasks that have not
> complete.  In any case, I don't think we can anticipate all the possible
> usages and I could imagine some assigning to a TaskHandle that is
> outside the scope for some reason or another.

Yes.

Another question: should the TaskHandle API be available only to the owner thread of the scope?

> 
> -Alan

Rémi