thinking about proper implementation of tail calls
Remi Forax
forax at univ-mlv.fr
Sat Sep 1 06:49:58 PDT 2012
On 09/01/2012 01:26 PM, Florian Weimer wrote:
> * John Rose:
>
>> As I recall, Doug Lea noted at the 2010 JVM Language Summit that
>> tail calls would seem to allow work-stealing algorithms to be
>> implemented somewhat more cleanly or efficiently. (How's that for
>> tentative?) A worker thread goes from task to task in a data-driven
>> way, similar to the examples given (e.g., in Guy's article) of
>> linked list traversal. Currently you need an outer loop to "pump"
>> the task sequence, and each task must return to the loop to allow
>> the next task to run. If you had tail calls, you would not need to
>> force the task sequence into a form that the outer loop can
>> understand, which would allow more flexibility in organizing the
>> task sequence. More freedom in shaping the code can translate into
>> better-tuned algorithms.
> I would be surprised if tail calls give more flexibility. Usually
> it's the opposite, the mini-interpreter wins in terms of flexibility:
>
> <http://www.enyo.de/fw/notes/tail-calls.html>
This trick can work with a lexer because usually you don't keep states
between each recognized tokens, it's harder to use the same trick with
a parser or an interpreter.
> The mini-interpreter turns direct calls into difficult-to-predict
> indirect calls. This could be a drawback. But optimizing that is
> probably not much more difficult than a general implementation of hard
> tail calls.
no, I don't think so because as you said direct calls are not hard to
predict.
>
>> I have been looking for a "killer application" for tail calls on the
>> JVM. Doug's example may turn out to one. Here's another candidate:
>> Overcoming class file size limits.
> I doubt it.
>
>> Step 3b is non-trivial, and probably requires boxing on the heap, in
>> general. (Remember that you can have up to 2^16-1 locals, but only
>> 2^8-1 parameters.) In the case where there are only a few locals,
>> they can just be passed in registers as parameters.
> It also needs multiple return values or unboxed tuples, so that
> updated state can be carried around without boxing.
>
>> Step 3a would seem to require tail calls. Specifically, the act of
>> leaving one mini-method M1 to go to a block in another mini-method
>> M2 requires a hard tail call from M1 to M2, passing any live data
>> values that the M2 block requires.
> Only if you can have closures which share mutable variables (sometimes
> called "upvalues"). Otherwise, you still have marshal state updates
> to the canonical locals, using a mini-interpreter, and the individual
> state machine entries would have different function signatures.
>
> Or you have to go full CPS and have an endless parameter list, but
> that's going to run into the parameter count limit. Copying this much
> state around isn't going to be free, either. And each entry in the
> state machine would have to explicitly thread its full state, so it's
> most useful for automatically generated code.
Rémi
More information about the mlvm-dev
mailing list