Coro in Java 8

Thu Apr 28 09:41:54 PDT 2011

On 04/28/2011 04:33 PM, Charles Oliver Nutter wrote:
> On Thu, Apr 28, 2011 at 5:03 AM, Lukas Stadler <lukas.stadler at jku.at> wrote:
>> I totally agree as well, naturally :-)
>>
>> And I do plan on spending more time on the coroutine patch again, I
>> didn't do much coding in the last months because I was occupied at
>> university.
>> The result of this is my thesis on coroutines:
>> http://ssw.jku.at/Research/Papers/Stadler11Master/
> How would you feel about leading a JSR? :)

Another idea would be to reuse the JSR 292 vehicle, the way Doug Lea
reused JSR 166 to add, for example, fork/join.
I don't know if John (Rose) will agree, but in that case you already have
a spec lead with JCP ninja skills.

From a spec point of view, coroutines are not that big;
the question for me is more whether Oracle and IBM are OK with
implementing coroutines.

Also, because coroutines can be used to implement a concurrent web server,
there is also a need for a memory model/concurrency guy.

> Seriously though, that may be the most important thing we need to do
> to get this ball rolling. I could act as interim JSR lead, but this is
> obviously not my area of expertise (and I don't know much about the
> JSR process). Eventually someone that either knows the implications of
> JVM coroutines and/or the vagaries of the JSR process would want to
> lead it.
>
>> The coroutines in the current state actually provide full stack
>> reification and can simulate continuations, albeit with suboptimal
>> performance.
>> But this is confined to two concrete methods (serialize/deserialize),
>> and can thus easily be restricted with permissions.
> Man, you need to get that patch working so I can play with it again :)
> JRuby 1.7 is going to be my "experimenting with Java 7 and beyond"
> release, so I'm keen to get back into coro experimentation (and tailc,
> if that patch could get updated too).
>
>> I would be really interested to hear what you guys think are the most
>> important characteristics for coroutines... since an implementation is
>> always a tradeoff:
>> * fast creation
>> * fast 1st activation (the first time a coroutine gets to run)
>> * fast switching
>> * fast migration from thread to thread
>> * many coroutines (~100)
>> * lots of coroutines (~ 10000)
>> * lots and lots of coroutines (~ 1000000)
> tl;dr: lots of coroutines and fast switching
>
> For me, the pain of non-coro coroutines (i.e. using Threads) is in two places:
>
> 1. The cost of spinning up and managing Thread-based coroutines (and
> coping with the inevitable GC rooting that happens)
> 2. The cost of cross-thread signaling and context-switching
>
> For 1: coro is already *vastly* cheaper than spinning up threads, so
> you've got that problem nailed. I don't think I've ever asked about GC
> rooting...if I walk away from a live-but-not-running coroutine, does
> it GC? Currently JRuby's Enumerator#next logic (which uses
> Thread-based coroutines) has to do a complicated
> WeakReference/finalization dance to safely shut down the coroutine
> thread when the Enumerator object goes away. Fibers don't even do that
> much; if a Fiber doesn't complete, it will live forever. Ouch. I think
> coroutines must be GCable and not act as a GC root.
>
> It's also already possible to spin up *way* more coroutines than
> Threads. I have heard tales of people using tens or hundreds of
> thousands of Fibers in Ruby, but it's uncommon; usually we're talking
> about one or a handful of Fibers per web request (for a Fiber-driven
> server, say). Given that, the ability to do "lots and lots" probably
> should be there, but I'm not sure even in the weirdest cases that it
> will be common.
>
> On the other hand, I talked with Jim Baker (Jython) about the
> potential for using "lots and lots" of coroutines, one per
> (language-level, not Java-level) method invocation, to do a poor-man's
> stackless language implementation. In that case we'd need as many
> coroutines as there are active method invocations on the stack, which
> could easily be tens of thousands. This is, again, a rather weird and
> specialized case.
>
> For 2: As you've seen in your benchmarks comparing JRuby's
> Thread-based Fiber with your coro-based Fiber, the signaling and
> switching mechanism we use currently is *very* slow. The best I've
> come up with is currently in JRuby, and uses two SynchronousQueue
> objects (input and output) to signal and communicate. I had a separate
> implementation that used a single SynchronousQueue and it was faster,
> but it failed to terminate gracefully (there was always either a
> sender or receiver waiting on the queue at the end).
>
> I suspect the overhead is largely due to depending on OS thread
> scheduling to (hopefully) schedule the Fiber's thread.
>
> coro, on the other hand, does its switch in-place, without any
> dependency on OS-level thread-scheduling semantics. For my money, coro
> is already so much faster than Thread-to-Thread communication that any
> improvement you make would just be gravy.
>
> Now for the other features you offered:
>
> * fast creation
>
> coro is already so much faster than Thread, I'm quite satisfied here
> already. But if we get into "lots and lots" we'd definitely see the
> impact of creation (presumably "lots and lots" would mean many
> short-lived coroutines).
>
> * fast 1st activation (the first time a coroutine gets to run)
>
> How much of a difference are we talking in the best and worst cases?
> From what I've seen, there are two use cases for Ruby Fibers: as
> one-shot (or "few-shot") continuations to do cross-call flow control;
> and as generators. The first case would see a lot of benefit from
> faster 1st invocation; the second would see less.
>
> * fast switching
>
> Again, already so much faster than Thread-to-Thread that I'm happy.
> And again, "lots and lots" implies there would be a lot of
> context-switching too...
>
> * fast migration from thread to thread
>
> Currently Ruby's Fiber is not migratable across threads. In JRuby's
> current impl, we could easily migrate Fibers across threads with no
> additional cost (i.e. it would remain as slow as it is regardless of
> migration). If it's possible to make migration across threads fast
> without sacrificing any same-thread performance, I'd say do it. But I
> see this as the *least* important feature on the list...same-thread
> "microthreading" is going to be *far* more common.
>
> Now this *would* play into the "functional concurrency" model I talked
> about earlier...ideally we'd be able to pass coroutines to worker
> threads to simulate an M:N threading model. That would allow us "lots
> and lots" of active calculations to be spread across a few threads,
> rather than the current complicated gymnastics required to make
> threads gracefully chew on a dataset. I picture an ideal future where
> it's so cheap to spin up and run coroutines that you'd spin up one for
> every (nontrivial) calculation you want to do in parallel, and then
> dump them into an Executor to be run. Because they're coroutines, the
> Executor could at will suspend any of them, or just run each to
> completion before grabbing another. Call it the "coroutine as
> pausable/resumable job" model.
>
> * many coroutines (~100)
> * lots of coroutines (~ 10000)
> * lots and lots of coroutines (~ 1000000)
>
> The cases for "lots and lots" start to sound very domain-specific,
> don't they? If supporting "lots and lots" degrades any other features,
> it should be a configurable option, e.g. -XX:+LotsAndLotsOfCoroutines.
>
>> I'll soon fork off a version that only supports lots of coroutines,
>> because this makes a lot of things much easier (it doesn't require the
>> whole copy to/from stack logic).
>>
>> And I also think that a jsr as a rallying point would help a lot - I'll
>> see what it will take to create one.
> I've been a poor Java contributor since I never got on the 292 JSR
> after leaving Sun (I couldn't be on it while there), but I promise if
> any other language/VM-related JSRs come up I will leap right on them.
> And I officially offer my help in making a JSR happen (as much as my
> time permits, of course).
>
> - Charlie

I remember that at the last JVM Summit everybody, Josh and Neal included,
agreed that coroutines should be in Java.

By coroutine, I mean thread-sticky coroutines; thread migration can be
added later. The most important thing is to have lots of coroutines,
i.e. pausable/resumable tasks with fast micro-scheduling.
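
To make the "pausable/resumable task" idea a bit more concrete, here is a
minimal sketch in plain Java of the model Charlie describes above: jobs that
can be suspended and resumed, dumped into an Executor and spread over a few
worker threads. Everything in it is an assumption made for illustration:
ResumableJob, SumJob and runAll are hypothetical names, and since no coroutine
API is available here, each job is a hand-written state machine that returns
from runSlice() where a real coroutine would simply yield.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch only: ResumableJob is a hypothetical stand-in for a coroutine.
final class ResumableJobs {

    /** Runs one slice of work; returns true if the job wants to be resumed later. */
    interface ResumableJob {
        boolean runSlice();
    }

    /** Example job: sums 1..limit, pausing after at most 1000 additions. */
    static final class SumJob implements ResumableJob {
        private final long limit;
        private long next = 1;
        long total;

        SumJob(long limit) { this.limit = limit; }

        @Override
        public boolean runSlice() {
            long end = Math.min(limit, next + 999);
            for (; next <= end; next++) {
                total += next;
            }
            return next <= limit; // true: suspended, false: finished
        }
    }

    /** Spreads the jobs over a few worker threads until all of them have finished. */
    static void runAll(List<? extends ResumableJob> jobs, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch done = new CountDownLatch(jobs.size());
        try {
            for (ResumableJob job : jobs) {
                schedule(pool, job, done);
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    private static void schedule(ExecutorService pool, ResumableJob job, CountDownLatch done) {
        pool.execute(() -> {
            if (job.runSlice()) {
                schedule(pool, job, done); // suspended: re-queue behind the other jobs
            } else {
                done.countDown();          // finished
            }
        });
    }

    public static void main(String[] args) {
        SumJob a = new SumJob(10_000);
        SumJob b = new SumJob(100);
        runAll(Arrays.asList(a, b), 2);
        System.out.println(a.total + " " + b.total); // prints "50005000 5050"
    }
}
```

The job has to keep its own loop state in fields so that it can return and be
called again; with real coroutines that bookkeeping would live on the
coroutine's stack, which is the whole appeal.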

Lukas, if you refresh your patch, we can come to the next JVM Summit in
July with a web server that uses coroutines with nio/asyncio, plus some
examples of yield/iterator/generator.
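
As a baseline for the generator examples, this is roughly what has to be done
today without coroutines: a generator backed by a full Thread, using two
SynchronousQueue objects (one per direction) as Charlie describes above. It is
a sketch under my own assumptions, not JRuby's actual code; ThreadGenerator
and the "emit" callback are hypothetical names.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.function.Consumer;

// Sketch only: a generator backed by a full Thread, with one queue per direction.
final class ThreadGenerator<T> {
    private static final Object SIGNAL = new Object(); // resume token and end marker

    private final SynchronousQueue<Object> toGenerator = new SynchronousQueue<>();
    private final SynchronousQueue<Object> toCaller = new SynchronousQueue<>();
    private boolean finished;

    /** The body receives an "emit" callback; emitted values must be non-null. */
    ThreadGenerator(Consumer<Consumer<T>> body) {
        Thread thread = new Thread(() -> {
            try {
                toGenerator.take();               // wait for the first next()
                body.accept(value -> {
                    try {
                        toCaller.put(value);      // hand the value to the caller
                        toGenerator.take();       // park until resumed
                    } catch (InterruptedException e) {
                        throw new IllegalStateException(e);
                    }
                });
                toCaller.put(SIGNAL);             // tell the caller the body returned
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        thread.setDaemon(true); // an abandoned generator must not block JVM exit
        thread.start();
    }

    /** Returns the next emitted value, or null once the body has returned. */
    @SuppressWarnings("unchecked")
    T next() {
        if (finished) {
            return null;
        }
        try {
            toGenerator.put(SIGNAL);              // resume the generator thread
            Object value = toCaller.take();       // wait for its answer
            if (value == SIGNAL) {
                finished = true;
                return null;
            }
            return (T) value;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ThreadGenerator<String> gen = new ThreadGenerator<String>(emit -> {
            emit.accept("a");
            emit.accept("b");
        });
        for (String s = gen.next(); s != null; s = gen.next()) {
            System.out.println(s); // prints "a" then "b"
        }
    }
}
```

Every next() costs two cross-thread handoffs, and a generator that is
abandoned before its body returns leaves its thread parked forever. Those are
the two costs (switching and GC rooting) that Charlie lists above.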

Rémi