Proposal for Hybrid Threading Model and simpler Async IO

Wed May 7 17:54:20 UTC 2014

Our kernel hackers implemented the thread switching, so I can't give you
too many details.  But, in principle, it isn't complicated - you have a
stack pointer, a bunch of registers and a program counter associated with
any given thread, so to do a context switch semi-manually, all you have to
do is change them to the appropriate value for another thread.  This is
drastically oversimplified, of course.
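
As a rough illustration of the idea, here is a minimal user-space sketch
using the glibc <ucontext.h> calls (getcontext, makecontext, swapcontext).
It is not the kernel-assisted switch described above, and the names
worker, run_switch_demo and trace are hypothetical.  Each swapcontext call
saves the current registers, stack pointer and program counter, then
loads the ones stored for the other context.

```c
#include <assert.h>
#include <ucontext.h>

/* Hypothetical names for illustration; assumes glibc on Linux. */
static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* the worker's private stack */
static int trace[4];
static int trace_len;

static void worker(void)
{
    trace[trace_len++] = 1;               /* first slice on the worker stack */
    swapcontext(&worker_ctx, &main_ctx);  /* save our state, resume main */
    trace[trace_len++] = 3;               /* resumed right after the switch */
}                                         /* returning resumes uc_link */

/* Interleaves main and worker; returns the recorded order as digits. */
int run_switch_demo(void)
{
    trace_len = 0;
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;       /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);  /* run worker up to its switch */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &worker_ctx);  /* let worker finish */
    trace[trace_len++] = 4;

    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

Calling run_switch_demo() records the slices in the order worker, main,
worker, main, which shows that each side resumes exactly where it left
off.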

One big win in our scenario is to add the ability to state which thread you
want to switch to.  We can context switch to the thread that *should* wake
up next (e.g., the next thread to acquire a lock, or the thread that will
respond to an IO event) very quickly.  This is drastically oversimplified,
of course.
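
A sketch of what "state which thread you want to switch to" could look
like, again using glibc <ucontext.h> in user space rather than the kernel
support described above.  The function switch_to, the successor table and
the three-thread setup are all assumptions made for illustration: each
user-level thread finishes its "critical section" and hands control
straight to the thread that should run next, with no general-purpose
scheduler choosing in between.

```c
#include <assert.h>
#include <ucontext.h>

#define NTHREADS 3
#define STACK_BYTES (64 * 1024)

/* Hypothetical names for illustration; assumes glibc on Linux. */
static ucontext_t sched_ctx;
static ucontext_t ctx[NTHREADS];
static char stacks[NTHREADS][STACK_BYTES];
static int current;
static int order[NTHREADS];
static int order_len;

/* Save the running user-level thread and resume the one the caller names. */
static void switch_to(int target)
{
    int self = current;
    current = target;
    swapcontext(&ctx[self], &ctx[target]);
}

static void body(int id, int successor)
{
    order[order_len++] = id + 1;   /* stand-in for the critical section */
    if (successor >= 0)
        switch_to(successor);      /* hand off directly to the next owner */
}

/* Runs the hand-off chain 0 -> 2 -> 1 and returns the order as digits. */
int run_handoff_demo(void)
{
    static const int successor[NTHREADS] = { 2, -1, 1 };

    order_len = 0;
    for (int i = 0; i < NTHREADS; i++) {
        getcontext(&ctx[i]);
        ctx[i].uc_stack.ss_sp = stacks[i];
        ctx[i].uc_stack.ss_size = STACK_BYTES;
        ctx[i].uc_link = &sched_ctx;   /* resumed when a thread returns */
        makecontext(&ctx[i], (void (*)(void))body, 2, i, successor[i]);
    }
    current = 0;
    swapcontext(&sched_ctx, &ctx[0]);
    /* Threads 0 and 2 stay suspended inside switch_to; that is fine for
     * a demo but a real scheduler would resume or reclaim them. */
    return order[0] * 100 + order[1] * 10 + order[2];
}
```

The design point is that the caller, not a scheduler, decides the target,
so the thread that is next in line for a lock or an IO event runs
immediately.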

We already have this in C++.  Programmers love it, because they don't have
to write asynchronous code, and it scales very well.

To support it in Java, I'm basically making an API that does some of the
user-level scheduling, and intercepting anywhere that the JVM blocks (for
example, calls to pthread_cond_wait).  This involves no actual changes to
the JDK - you just have to write a jsig-like interposer for pthreads and
things like epoll.  I also have to write a JNI blob of my own and some
supporting APIs for the user-level scheduling, but again, those things
don't require JDK support.
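
To show the interposition mechanism only (not the actual interposer), here
is a sketch that wraps epoll_wait.  It assumes glibc on Linux with dynamic
linking; the counter intercepted_calls and the function interposer_demo
are hypothetical.  The wrapper has the same name and signature as the libc
function, looks up the real one with dlsym(RTLD_NEXT, ...), and forwards
to it.  A real version would usually be built as a shared library and
loaded with LD_PRELOAD, and older glibc versions need -ldl at link time.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1l)   /* glibc's value; fallback assumption */
#endif

typedef int (*epoll_wait_fn)(int, struct epoll_event *, int, int);

static int intercepted_calls;      /* hypothetical bookkeeping */

/* Defined with libc's name, so callers bind to this definition first. */
int epoll_wait(int epfd, struct epoll_event *events, int maxevents,
               int timeout)
{
    static epoll_wait_fn real_epoll_wait;

    if (real_epoll_wait == NULL)
        real_epoll_wait = (epoll_wait_fn)dlsym(RTLD_NEXT, "epoll_wait");
    if (real_epoll_wait == NULL) {
        errno = ENOSYS;
        return -1;
    }

    intercepted_calls++;
    /* A user-level scheduler would switch to another stack here instead
     * of blocking; this sketch simply forwards to libc. */
    return real_epoll_wait(epfd, events, maxevents, timeout);
}

/* Returns 1 if a call went through the wrapper and reached libc. */
int interposer_demo(void)
{
    struct epoll_event ev;
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    int before = intercepted_calls;
    int ready = epoll_wait(epfd, &ev, 1, 0);   /* timeout 0: no blocking */
    close(epfd);
    return ready == 0 && intercepted_calls == before + 1;
}
```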

I have the basic functionality in place, but I have to do some hardening /
additional testing.  Two examples of the major performance / scalability
concerns: first, the default thread stack size is 1M, which is enormous,
so the big win of being able to have lots and lots of threads might not be
there because of RAM limitations.  Second, the JDK does a lot of spinning
before acquiring locks, which is completely unnecessary when you can
switch directly and cheaply to a thread when it is that thread's turn to
acquire the lock, so I'll have to take it out.
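
To put numbers on the stack-size concern, here is a small sketch at the
pthreads level.  The helper names run_with_stack and
stack_reservation_mib are hypothetical, and the 256K and 64K figures are
arbitrary examples, not a recommendation.  It starts a thread with an
explicitly smaller stack via pthread_attr_setstacksize and computes how
much memory the stacks alone would reserve for a given thread count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static void *noop(void *arg)
{
    return arg;
}

/* Starts and joins one thread with the requested stack size.
 * Returns 0 on success, or the error number from pthreads. */
int run_with_stack(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        rc = pthread_create(&tid, &attr, noop, NULL);
        if (rc == 0)
            rc = pthread_join(tid, NULL);
    }
    pthread_attr_destroy(&attr);
    return rc;
}

/* Memory reserved for stacks alone, in MiB. */
unsigned long long stack_reservation_mib(unsigned long long threads,
                                         unsigned long long stack_bytes)
{
    return threads * stack_bytes / (1024ULL * 1024ULL);
}
```

With 100,000 threads, 1M stacks reserve 100,000 MiB, while 64K stacks
reserve 6,250 MiB.  On the JVM side, the per-thread stack size is
controlled with the -Xss option.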

I haven't done a lot of testing, partially because my managerial
responsibilities have increased dramatically recently, and partially
because priorities shifted a bit.  I'll probably circle back to it towards
the end of the year.

Since these are changes to libc or the kernel, or additions of non-JDK
libraries, it doesn't make much sense to contribute them to OpenJDK.  The
couple of times I've brought up these topics with JDK hackers, I've gotten
the sense that there is less than full enthusiasm for them (basically, I
get responses like David's), so I haven't pushed it.

A note on doing this in the JVM: it's much harder!  My colleague Hiroshi
Yamauchi tried to add continuations in 2010 (or so):
<http://hiroshiyamauchi.blogspot.com/2012_10_01_archive.html>.  He gave a
presentation at the JVM Language Summit on it.  There is a lot of
user-controlled thread-local state to deal with in the JVM, and fixing
that up is hard.  Plus, you have to make two compilers and an interpreter
aware of it.  By contrast, all of the thread-local state in libc is stored
at (or reached from) the bottom of the stack, and it is compiler agnostic.
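
To make the thread-local issue concrete at the native level, here is a
small sketch, assuming GCC or Clang on Linux for the __thread keyword; the
names tls_slot and tls_differs_across_threads are hypothetical.  It shows
that the same thread-local variable lives at a different address in each
native thread, which is why code resumed on another thread would see
different thread-local state unless the switch also fixes that up.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static __thread int tls_slot;   /* one copy per native thread */

static void *report_tls_address(void *out)
{
    *(uintptr_t *)out = (uintptr_t)&tls_slot;
    return NULL;
}

/* Returns 1 if two native threads see different addresses for the same
 * thread-local variable, 0 if not, -1 if the thread could not start. */
int tls_differs_across_threads(void)
{
    pthread_t tid;
    uintptr_t other = 0;
    uintptr_t mine = (uintptr_t)&tls_slot;

    if (pthread_create(&tid, NULL, report_tls_address, &other) != 0)
        return -1;
    pthread_join(tid, NULL);
    return mine != other;
}
```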

Jeremy

On Wed, May 7, 2014 at 12:17 AM, Joel Richard <ri.joel at gmail.com> wrote:

> Hi Jeremy,
>
> Thank you very much for sharing this with us. Would you mind elaborating
> on your work a little bit further? I am particularly interested in answers
> to the following questions:
>
> How do you save the stack state before executing a pausing call and then
> resume the same program flow on another thread? What changes
> have you applied to JNI? Can you already tell us something about the
> performance and scalability characteristics? Are there already any plans to
> contribute your work to OpenJDK, or will that remain an internal project?
>
> Thanks, Joel
>
>
> On Tue, May 6, 2014 at 11:33 PM, Jeremy Manson <jeremymanson at google.com> wrote:
>
>> FWIW, we're looking at doing this internally by implementing the whole
>> kit and kaboodle in native code.  We're going to intercept all low-level
>> blocking calls to libc so that they just result in a change to a different
>> stack.  Requires approximately no Java-level changes (unless you want
>> control over which stack you switch to, which is an easy addition).
>>
>> I spent a fair bit of time working on this last year, but had to
>> back-burner it for a while in favor of some other work.  There is a *lot*
>> of demand from our server developers, who all loathe the existing async IO
>> APIs.  We'll probably circle back to it by the end of the year.
>>
>> Jeremy
>>
>>
>> On Tue, May 6, 2014 at 6:15 AM, Florian Weimer <fweimer at redhat.com> wrote:
>>
>>> On 05/04/2014 02:19 PM, Joel Richard wrote:
>>>
>>>> Right now, InputStream.read(byte b[]) is a blocking method. Hence the
>>>> native thread waits until the byte array is filled. With my proposal,
>>>> the underlying blocking native method (for example
>>>> java.net.SocketInputStream#socketRead0) would not block the native
>>>> thread anymore. Instead, it would call the C function with the _async
>>>> suffix and continue to process another task. As soon as the async
>>>> operation has completed, it can then continue with the first task
>>>> (maybe even in another native thread).
>>>>
>>>
>>> You cannot do this without completely redesigning JNI.  You cannot
>>> resume native code on a different native thread than the one it initially
>>> ran on because that would change the set of thread-local variables, and
>>> native code is compiled with the assumption that references to thread-local
>>> variables are stable (relative to the current stack).
>>>
>>> On the Java side, you'd likely have to duplicate thread local variables,
>>> to preserve the current semantics.  Better support for coroutines would be
>>> nice, but I don't think it's prudent to attempt to provide this
>>> functionality purely at the JVM layer because of the resulting
>>> interoperability issues.
>>>
>>>
>>> --
>>> Florian Weimer / Red Hat Product Security Team
>>>
>>
>>
>