Yield considered harmful

Wed Jul 28 15:24:46 PDT 2010

 Hi!

On 23/07/10 08:49, Alessio Stalla wrote:
> On Fri, Jul 23, 2010 at 1:28 AM, Steven Simpson <ss at comp.lancs.ac.uk> wrote:
>> static<T>  BlockExit forEach(Iterable<T>  coll, #LoopExit(T) block);
>>
>> ...where unlabelled break/continue apply to the forEach loop, whereas:
>>
>> static BlockExit withLock(Lock lock, #BlockExit() block);
>>
>> ...where break/continue would apply to a containing loop, or would be
>> illegal.
> I really like this proposal! It allows to have both functions and
> blocks but it doesn't need two different syntaxes for each of them, it
> doesn't introduce new keywords nor it changes the meaning of existing
> ones, it doesn't need to change the type system... it would also help
> wrt. exception transparency:
>
> public class BlockExit<throws E> { ... }
>
> static <throws E> BlockExit<E> withLock(Lock lock, #BlockExit<E>() block);

It shouldn't actually be necessary to expose E on withLock.  By
accepting a BlockExit-returning object, withLock also accepts that the
block in its original form could throw anything, and that the compiler
has provided a block object whose method wraps anything thrown into
the returned BlockExit.

However, having seen a few more examples of what people want from
closures, I see that this scheme wouldn't allow cases where a method
could accept a function type permitting both local and lexical returns. 
While BlockExit in its current form can afford to have no special type
safety in providing lexical returns (because the compiler hides all
type-specific uses of BlockExit during a lexical return), you'd want to
have visible type safety on extra methods provided to perform local
returns, e.g. BlockExit_I for int, BlockExit_F for float,
BlockExit_O<T>, etc.  That makes it messy, and it gets messier still
when the methods accepting blocks want to return non-void, as the
control-abstraction syntax can't be used anymore to hide the BlockExit
return value.

It might be better to throw BlockExits and LoopExits as unchecked
exceptions, rather than try to translate return types transparently.  A
forEach implementation might be:

static <T> void forEach(Iterable<T> coll, #for void(T) block) {
  for (Iterator<T> iter = coll.iterator(); iter.hasNext(); ) {
    try {
      block.invoke(iter.next());
    } catch (LoopExit exit) {
      if (exit.breaksLocal())
        break;
      if (exit.continuesLocal())
        continue;
      throw exit;
    }
  }
}

The compiler could generate that try block itself if the syntax
block.(iter.next()) were used instead of block.invoke(iter.next()), so
the programmer would only break into the exit abstraction when
necessary.
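
For anyone who wants to play with the idea today, here is a minimal
sketch of that forEach in plain Java.  It is only an approximation:
java.util.function.Consumer stands in for the proposed #for void(T)
type, the LoopExit class and its method names are my guesses at a
shape rather than a settled API, and the calls to LoopExit.breaking()
and LoopExit.continuing() are written by hand where the compiler would
have translated break and continue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical LoopExit: an unchecked exception recording whether the
// block asked for a break or a continue.
final class LoopExit extends RuntimeException {
    private final boolean isBreak;

    private LoopExit(boolean isBreak) {
        this.isBreak = isBreak;
    }

    // Stand-in for what a compiler would emit for an unlabelled 'break;'.
    static void breaking() { throw new LoopExit(true); }

    // Stand-in for what a compiler would emit for an unlabelled 'continue;'.
    static void continuing() { throw new LoopExit(false); }

    boolean breaksLocal() { return isBreak; }
    boolean continuesLocal() { return !isBreak; }
}

final class ForEachSketch {
    // Consumer<T> is an assumption standing in for '#for void(T)'.
    static <T> void forEach(Iterable<T> coll, Consumer<T> block) {
        for (T item : coll) {
            try {
                block.accept(item);
            } catch (LoopExit exit) {
                if (exit.breaksLocal())
                    break;
                if (exit.continuesLocal())
                    continue;
                throw exit;
            }
        }
    }

    // Collects words up to (not including) "stop", skipping empty strings.
    static List<String> wordsBeforeStop(List<String> words) {
        List<String> seen = new ArrayList<>();
        forEach(words, s -> {
            if (s.isEmpty())
                LoopExit.continuing(); // was continue;
            if (s.equals("stop"))
                LoopExit.breaking();   // was break;
            seen.add(s);
        });
        return seen;
    }

    public static void main(String[] args) {
        // prints [a, b]
        System.out.println(wordsBeforeStop(Arrays.asList("a", "", "b", "stop", "c")));
    }
}
```

Note that forEach is the only place that knows how a LoopExit maps onto
a real break or continue; the block itself just throws.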

A call site might expand to:

private static final Label _label_outer = ...;

int findStuff(BufferedReader in)
  throws IOException, InterruptedException {
outer:
  for (String line; (line = in.readLine()) != null; ) {
    List<String> list = parseLine(line);
    int count = 0;
   inner:
    java.lang.function.for_VO<String> _block =
      new java.lang.function.for_VO<String>() {
        public void invoke(String s) {
          try {
            doStuff(s);
            if (umph(s))
              LoopExit.breaking(); // was break;
            if (oof(s))
              LoopExit.continuing(); // was continue;
            if (umph2(s))
              _label_outer.breaking(); // was break outer;
            if (oof2(s))
              _label_outer.continuing(); // was continue outer;
            if (argh(s))
              BlockExit.returningInt(count); // was (lexical) return count;
            if (flip(s))
              throw new EOFException();
            count++;
          } catch (IOException|InterruptedException ex) {
            BlockExit.throwing(ex);
          }
        }
      };
    try {
      try {
        forEach(list, _block);
      } catch (BlockExit _exit) {
        try {
          _exit.doThrow();
        } catch (IOException|InterruptedException|RuntimeException|Error _ex) {
          throw _ex;
        } catch (Throwable _ex) {
          throw new SomeError(...);
        }
        if (_exit.breaksTo(_label_outer))
          break outer;
        if (_exit.continuesTo(_label_outer))
          continue outer;
        if (_exit.returns())
          return _exit.intValue();
        throw new SomeError(...);
      }
    } finally {
      // The original block goes out of scope here, so there's an
      // opportunity to mark it invalid here.
      _block.dispose();
    }
  }
  return -1;
}

BlockExit and LoopExit are no longer mentioned in a method's signature,
so you can't use them to tell whether returns are local or lexical, or
whether breaks/continues are permitted, etc.  So I started to consider
different suites of function types.  I've used the syntax #for above to
tell the compiler that:

    * Unlabelled breaks/continues are permitted:
          o They are translated to LoopExit.breaking/continuing(),
            causing a LoopExit to be thrown.
          o block.(val) is automatically wrapped in a try block to catch
            those LoopExits.
    * Labelled breaks/continues are permitted:
          o break/continue label; are translated into
            Label#breaking/continuing() at the site of the call to the
            abstraction method, causing a BlockExit to be thrown.
          o At the site of the call to the abstraction method, BlockExit
            is caught, leading to a real "break/continue label;" being
            executed.
    * Lexical returns are permitted:
          o They are translated into BlockExit.returning*(value),
            causing a BlockExit to be thrown.
          o At the site of the call to the abstraction method, BlockExit
            is caught, leading to a real "return value;" being executed.
    * Exceptions permitted in the scope enclosing the block can be
      thrown from within the block, without having to declare the
      exceptions as part of the function type.
          o The entire block is wrapped in a 'try' which calls
            BlockExit.throwing(ex), causing a BlockExit to be thrown.
          o At the site of the call to the abstraction method, BlockExit
            is caught, leading to a real "throw ex;" being executed.
    * The block becomes invalid outside of its declaration scope.
          o This makes it possible to do things like not having to
            declare the exceptions in the function's type, as the
            exception list is statically available to the compiler.  If
            such a block were to be passed out of its scope, its
            validity in throwing exceptions permitted in that scope
            would end.
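
The lexical-return and exception-transparency points above can be
tried out by hand in ordinary Java.  The following is a rough sketch
under several assumptions: Consumer stands in for the function type,
BlockExit is a hypothetical class whose method names merely echo the
expansion earlier in this message, and labelled breaks/continues and
block disposal are left out to keep it short.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical BlockExit: an unchecked carrier for either a lexical
// return value or an exception thrown inside the block.
final class BlockExit extends RuntimeException {
    private final boolean returns;
    private final int value;
    private final Throwable thrown;

    private BlockExit(boolean returns, int value, Throwable thrown) {
        this.returns = returns;
        this.value = value;
        this.thrown = thrown;
    }

    static void returningInt(int value) { throw new BlockExit(true, value, null); }
    static void throwing(Throwable ex) { throw new BlockExit(false, 0, ex); }

    boolean returns() { return returns; }
    int intValue() { return value; }

    // Rethrows the carried exception, if there is one.
    void doThrow() throws Throwable {
        if (thrown != null)
            throw thrown;
    }
}

final class LexicalExitSketch {
    // The abstraction method knows nothing about BlockExit or IOException.
    static <T> void forEach(Iterable<T> coll, Consumer<T> block) {
        for (T item : coll)
            block.accept(item);
    }

    // Intent: return the length of the first line starting with '#',
    // throw IOException on an empty line, return -1 otherwise.
    static int firstCommentLength(List<String> lines) throws IOException {
        Consumer<String> block = s -> {
            try {
                if (s.isEmpty())
                    throw new IOException("empty line");
                if (s.startsWith("#"))
                    BlockExit.returningInt(s.length()); // was return s.length();
            } catch (IOException ex) {
                BlockExit.throwing(ex);
            }
        };
        try {
            forEach(lines, block);
        } catch (BlockExit exit) {
            try {
                exit.doThrow();
            } catch (IOException | RuntimeException | Error ex) {
                throw ex; // a real "throw ex;" in the enclosing scope
            } catch (Throwable ex) {
                throw new AssertionError(ex);
            }
            if (exit.returns())
                return exit.intValue(); // a real "return value;"
            throw new AssertionError("unexpected exit");
        }
        return -1;
    }
}
```

The point to notice is that forEach declares neither IOException nor
BlockExit, yet firstCommentLength still sees a properly typed
IOException at its own call site.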

A plain old # function type (with no following keyword) would promise
the same things, except for the first point about unlabelled
breaks/continues.  While #for would map to a java.lang.function.for_*
type, # would map to something else (say, block_*).  Each block_* type
could extend its corresponding for_* type, since an object of # type is
able to meet all of the requirements that an abstraction method would
place on a #for object.  You wouldn't be able to go the other way,
though, as a method expecting a # object could not cope with it
attempting a local break/continue.
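
To illustrate the direction of that subtyping, here is a small sketch
using ordinary interfaces.  ForBlock and PlainBlock are hypothetical
names standing in for one for_* type and its block_* counterpart; the
real types would presumably be generated per signature.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a for_* type: the receiving method promises
// to handle unlabelled break/continue around each invocation.
interface ForBlock<T> {
    void invoke(T value);
}

// Hypothetical stand-in for a block_* type: same shape, but the block
// promises never to attempt an unlabelled break/continue.
interface PlainBlock<T> extends ForBlock<T> {
}

final class SuiteSketch {
    // Expects a '#for' block (loop-exit handling omitted for brevity).
    static <T> void forEach(List<T> coll, ForBlock<T> block) {
        for (T item : coll)
            block.invoke(item);
    }

    // Expects a plain '#' block; has no loop-exit handling at all.
    static <T> void once(T value, PlainBlock<T> block) {
        block.invoke(value);
    }

    static String demo() {
        StringBuilder out = new StringBuilder();
        PlainBlock<String> plain = s -> out.append(s);
        once("x", plain);                       // exact type
        forEach(Arrays.asList("a", "b"), plain); // widening to ForBlock is fine
        // ForBlock<String> loop = s -> out.append(s);
        // once("y", loop);  // would not compile: ForBlock is not a PlainBlock
        return out.toString();
    }
}
```

A plain block is accepted wherever a loop block is expected, and the
commented-out lines show the reverse being rejected at compile time.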

Then you could have a #throw suite of types (extending plain # types). 
They would promise exception transparency, but not lexical returns,
breaks or continues.  (Not sure if there would be much call for these.)

Finally, you could have a #class suite of types.  These would provide
little more than a shorthand for anonymous classes.  Exceptions thrown
from the block would have to be explicitly declared.  Returns would
always be local.  Breaks and continues would not be permitted.  #class
types would not extend the other types, but you could permit assignment
of a #class object to one of the other types, yielding a new object
which wraps up the checked exceptions into BlockExits.
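
That assignment could be approximated by a wrapper like the one below.
This is only a sketch: ClassBlock, TransparentBlock, wrap and this
cut-down BlockExit are all names I've made up for illustration, and in
the proposal the wrapping would be inserted by the compiler rather than
called explicitly.

```java
import java.io.IOException;

// Hypothetical '#class'-style type: checked exceptions are declared.
interface ClassBlock<T, E extends Exception> {
    void invoke(T value) throws E;
}

// Hypothetical transparent type: declares no checked exceptions.
interface TransparentBlock<T> {
    void invoke(T value);
}

// Cut-down hypothetical BlockExit that only carries an exception.
final class BlockExit extends RuntimeException {
    BlockExit(Throwable cause) {
        super(cause);
    }
}

final class ClassBlockSketch {
    // Yields a new object that wraps checked exceptions into BlockExits.
    static <T, E extends Exception> TransparentBlock<T> wrap(ClassBlock<T, E> inner) {
        return value -> {
            try {
                inner.invoke(value);
            } catch (RuntimeException ex) {
                throw ex; // unchecked exceptions pass through untouched
            } catch (Exception ex) {
                throw new BlockExit(ex);
            }
        };
    }

    static String demo() {
        ClassBlock<String, IOException> strict = s -> {
            if (s.isEmpty())
                throw new IOException("empty");
        };
        TransparentBlock<String> loose = wrap(strict);
        loose.invoke("ok"); // completes normally
        try {
            loose.invoke("");
            return "no exit";
        } catch (BlockExit exit) {
            return exit.getCause().getClass().getSimpleName();
        }
    }
}
```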

Block/LoopExits could still have the same lifetimes as before (static or
thread-local), but if they had to be derived from Throwable, you'd have
to ignore their meaningless stack traces.  Also, there's probably no
longer any benefit to having two distinct types here, as they don't
identify whether you're dealing with an iteration block or a plain block
any more.
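
On the stack-trace point, one way to sidestep the cost is to suppress
trace capture altogether.  The sketch below relies on the four-argument
RuntimeException constructor, which exists from Java 7 onwards; the
CheapExit name and the single shared instance are assumptions made for
the example.

```java
// Hypothetical exit type that never records a stack trace.
final class CheapExit extends RuntimeException {
    // One shared instance, matching the "static lifetime" option.
    static final CheapExit BREAK = new CheapExit("break");

    private CheapExit(String kind) {
        // message, cause, enableSuppression=false, writableStackTrace=false
        super(kind, null, false, false);
    }
}

final class CheapExitSketch {
    // Throws and catches the shared exit, then reports its trace depth.
    static int traceDepth() {
        try {
            throw CheapExit.BREAK;
        } catch (CheapExit exit) {
            return exit.getStackTrace().length;
        }
    }
}
```

With writableStackTrace set to false, getStackTrace() returns an empty
array, so there is no meaningless trace left to ignore.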

Sorry if I'm muddying the waters, or I'm going over old ground, or I've
missed the boat - just treat it as brain vomit if you like.  Anyway,
getting back to the point of this thread, you may as well use the two
syntaxes (invocation argument or control abstraction) to distinguish
between local and lexical returns, as Stephen Colebourne mentioned in
the first place.  Oh well.

Cheers,

Steven