Comment on fibers and async library interfaces

Fri May 24 18:29:01 UTC 2019

After thinking about how it has to be implemented under the hood, I too
figured that the implementation has to be async/non-blocking (otherwise it
would have to spawn a thread to do I/O...).
This is also documented on the wiki:

> The implementation of the networking APIs in the java.net and
> java.nio.channels packages has been updated so that fibers doing blocking
> I/O operations park, rather than block in a system call, when a socket is
> not ready for I/O. When a socket is not ready for I/O it is registered
> with a background multiplexer thread. The fiber is then unparked when the
> socket is ready for I/O. These same blocking I/O operations are also
> updated to support cancellation. If a fiber is cancelled while in a
> blocking I/O operation then it will abort with an IOException.

"In what cases will fibers not suffice?"

One probably said the same about threads back in the day too :)  The
unknown unknowns thing.

Async message passing can be one such case. What I mean by a pure async
server (which I explained very poorly) is, for example, a telecom server
where everything is based solely on messages passed asynchronously between
state machines. No blocking I/O is invoked at all. I wrote such a thing
many years ago (in C++, though):

 - a single main loop using select() or epoll()
 - every implemented telco protocol adapter registered with the loop, similar
to non-blocking NIO selectors, but with a method to detect a complete-length
inbound protocol message and a method to decode such a message
 - all communication in the business logic (like call menus, "press 1 for
xyz") on non-blocking message queues
 - no blocking receive either (unlike Erlang)
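
A minimal Java sketch of the main loop and adapter registration described in
the list above. It is not the original C++ code. The names ProtocolAdapter,
LengthPrefixedAdapter and MainLoop are made up for illustration, and the
4-byte length prefix is an assumed framing, not a real telco protocol. A
java.nio Pipe stands in for a socket so the example has no network
dependency.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical adapter contract: one implementation per protocol. */
interface ProtocolAdapter {
    /** Length of the first complete message in buf, or -1 if more bytes are needed. */
    int completeLength(ByteBuffer buf);

    /** Decodes one complete message of the given length, consuming it from buf. */
    String decode(ByteBuffer buf, int length);
}

/** Assumed framing: 4-byte big-endian payload length, then a UTF-8 payload. */
final class LengthPrefixedAdapter implements ProtocolAdapter {
    public int completeLength(ByteBuffer buf) {
        if (buf.remaining() < 4) return -1;
        int payload = buf.getInt(buf.position()); // absolute read, position unchanged
        return buf.remaining() >= 4 + payload ? 4 + payload : -1;
    }

    public String decode(ByteBuffer buf, int length) {
        buf.getInt(); // skip the length prefix
        byte[] payload = new byte[length - 4];
        buf.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }
}

/** Single-threaded loop: every registered channel is served by the calling thread. */
final class MainLoop {
    private static final class Registration {
        final ProtocolAdapter adapter;
        final ByteBuffer inbound = ByteBuffer.allocate(4096);
        Registration(ProtocolAdapter adapter) { this.adapter = adapter; }
    }

    private final Selector selector;

    MainLoop() throws IOException { selector = Selector.open(); }

    <C extends SelectableChannel & ReadableByteChannel> void register(C channel, ProtocolAdapter adapter)
            throws IOException {
        channel.configureBlocking(false);
        channel.register(selector, SelectionKey.OP_READ, new Registration(adapter));
    }

    /** Runs one loop iteration and returns the messages that became complete. */
    List<String> runOnce(long timeoutMillis) throws IOException {
        List<String> messages = new ArrayList<>();
        selector.select(timeoutMillis);
        for (SelectionKey key : selector.selectedKeys()) {
            Registration reg = (Registration) key.attachment();
            int n = ((ReadableByteChannel) key.channel()).read(reg.inbound);
            if (n < 0) { key.cancel(); continue; } // peer closed the channel
            reg.inbound.flip();
            int len;
            while ((len = reg.adapter.completeLength(reg.inbound)) >= 0) {
                messages.add(reg.adapter.decode(reg.inbound, len));
            }
            reg.inbound.compact(); // keep any partial message for the next iteration
        }
        selector.selectedKeys().clear();
        return messages;
    }
}
```

A real server would call runOnce() in a loop and hand the decoded messages to
the state machines through non-blocking queues. Nothing in the loop itself
blocks on a single connection.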

The nature of async I/O allowed the whole system to do call setup,
teardown, business logic and media conferencing on one single thread, with
smooth media mixing. That shows the power of async processing.
One could still configure on the side how many threads to use, which
protocol ran in which thread, etc. This separates processing completely
from the threading model.

It is not a bad programming model. State machines do well for reasoning
about concurrent events from multiple sources, as long as one avoids the
"state explosion" problem. Fibers admittedly could help avoid that, but
purely reactive state machines are easy to reason about because all state
transitions are expressed in the code as state tables, so it is easy to see
if one missed a state/signal combination.
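
To illustrate the state table idea, here is a small sketch in Java. The
states, signals and the StateTable class are invented for this example and
are not taken from the original server. The point is only that, with the
transitions held as data, missing state/signal combinations can be listed
mechanically.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical call states and signals, for illustration only.
enum CallState { IDLE, RINGING, CONNECTED }
enum CallSignal { SETUP, ANSWER, RELEASE }

final class StateTable {
    private final Map<CallState, Map<CallSignal, CallState>> table = new EnumMap<>(CallState.class);

    /** Adds one row: in state 'from', signal 'signal' leads to state 'to'. */
    StateTable on(CallState from, CallSignal signal, CallState to) {
        table.computeIfAbsent(from, s -> new EnumMap<CallSignal, CallState>(CallSignal.class))
             .put(signal, to);
        return this;
    }

    /** Looks up the next state, failing loudly on an undefined combination. */
    CallState next(CallState current, CallSignal signal) {
        Map<CallSignal, CallState> row = table.get(current);
        CallState to = row == null ? null : row.get(signal);
        if (to == null) {
            throw new IllegalStateException("No transition for " + current + "/" + signal);
        }
        return to;
    }

    /** Lists every state/signal combination that has no transition defined. */
    List<String> missing() {
        List<String> gaps = new ArrayList<>();
        for (CallState s : CallState.values()) {
            for (CallSignal sig : CallSignal.values()) {
                Map<CallSignal, CallState> row = table.get(s);
                if (row == null || !row.containsKey(sig)) gaps.add(s + "/" + sig);
            }
        }
        return gaps;
    }
}
```

Calling missing() at startup or in a unit test is one way to catch a
forgotten combination before it shows up in production.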

*Now at long last to the issue*

In this case, I'd probably want to integrate any protocol as completely
async.
If this were Java and I wanted to integrate, say, JDBC into this, I might
want an async JDBC driver from which I could get the selector, add it to
the main loop of the server and then perform queries asynchronously from
the state machines, all of it happening on the same thread. That means
setting the state machine to a QUERYING state, and then to QUERIED or
similar once the response has been received.
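
A rough sketch of that QUERYING/QUERIED flow in Java. AsyncDb is a
hypothetical callback-style driver interface (JDBC defines no such API), and
FakeAsyncDb is a stand-in that queues completions so the loop thread can
deliver them, simulating a response arriving on a selected socket.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical async driver API; not part of JDBC. */
interface AsyncDb {
    void query(String sql, Consumer<String> onResult);
}

/** Stand-in driver: completions are queued and delivered by the loop thread. */
final class FakeAsyncDb implements AsyncDb {
    private final Queue<Runnable> completions = new ArrayDeque<>();

    public void query(String sql, Consumer<String> onResult) {
        completions.add(() -> onResult.accept("rows for: " + sql));
    }

    /** Called from the main loop when the (simulated) response has arrived. */
    boolean deliverOne() {
        Runnable completion = completions.poll();
        if (completion == null) return false;
        completion.run();
        return true;
    }
}

enum QueryState { IDLE, QUERYING, QUERIED }

final class QueryMachine {
    QueryState state = QueryState.IDLE;
    String result;
    int otherEventsHandled;
    private final AsyncDb db;

    QueryMachine(AsyncDb db) { this.db = db; }

    /** Starts the query and returns immediately; the callback moves us to QUERIED. */
    void startQuery(String sql) {
        state = QueryState.QUERYING;
        db.query(sql, rows -> { result = rows; state = QueryState.QUERIED; });
    }

    /** Any other signal can still be handled while the query is outstanding. */
    void onOtherEvent() { otherEventsHandled++; }
}
```

Because startQuery() returns at once, the same thread can keep feeding other
events to the machine while it sits in QUERYING.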

The issue with using fibers here and calling a blocking API is that the
state machine would not be able to process other events while the query to
the DB was in progress. That call chain would be swapped out until the DB
returned a response, no matter what.

Now one could perhaps work around it by spawning threads instead, but I am
not quite sure how it would work out. A model where all processing is done
by message-passing state machines or protocol adapters is very simple and
clean, once one is used to the idea.

There are probably other such cases that one has not thought about, they
tend to turn up after a while...

The point here is that if the main network entry points (e.g. JDBC drivers)
are implemented async, then one can easily implement reactive/state machine
patterns on top of them while also easily adding a synchronous API. The
other way around would require spawning threads, which seems needless given
that one could have the driver be async directly.

Thus, as I see it now at least, one way (sync driver) risks shutting out,
or making more difficult, implementation patterns that are perhaps rare but
quite doable, while the other allows for both.
If the driver already does async I/O, the fiber has to do nothing in
particular; for the sync API it just yields when it hits the Object.wait()
that waits for completion of the async JDBC call (if I have understood
fibers correctly).
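
As a sketch of the "sync API on top of an async driver" direction, a generic
blocking wrapper could look roughly like this. AsyncQueryDriver and
BlockingQueryDriver are hypothetical names, not an existing API. Whether a
fiber parks cheaply at the wait() is the assumption stated above, and this
code does not demonstrate it.

```java
import java.util.function.Consumer;

/** Hypothetical callback-style driver; stands in for an async JDBC driver. */
interface AsyncQueryDriver {
    void query(String sql, Consumer<String> onResult);
}

/** Generic synchronous facade built purely on the async driver. */
final class BlockingQueryDriver {
    private final AsyncQueryDriver async;

    BlockingQueryDriver(AsyncQueryDriver async) { this.async = async; }

    String query(String sql) throws InterruptedException {
        final Object lock = new Object();
        final String[] holder = new String[1];
        final boolean[] done = new boolean[1];

        // The callback may run on another thread, or even before we start waiting.
        async.query(sql, rows -> {
            synchronized (lock) {
                holder[0] = rows;
                done[0] = true;
                lock.notifyAll();
            }
        });

        synchronized (lock) {
            while (!done[0]) { // guards against spurious wakeups and early completion
                lock.wait();
            }
            return holder[0];
        }
    }
}
```

The wrapper needs no thread of its own. The only thread that waits is the
caller that chose the synchronous API.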

Regards
Nils Henrik Lorentzen

On Fri, May 24, 2019 at 6:56 PM Ron Pressler <ron.pressler at oracle.com>
wrote:

> There are two different issues here: async IO and async APIs (or
> programming style).
>
> Fibers do async IO automatically given a synchronous API. I don’t think I
> understand your concern about deemphasizing async APIs.
> Is a server that uses blocking APIs with fibers considered a “pure async
> server” or not? If not, why not?
> Under the covers, only async IO is used. This is the same as in Erlang and
> Go.
>
> You also write that "The opposite requires threads to be spawned and that
> defeats the purpose of async for scalability/throughput",
> but the whole point of lightweight threads is that spawning (and blocking)
> them is cheap so that it does not harm scalability/throughput.
>
> In what cases will fibers not suffice?
>
> Ron
>
>
> On May 24, 2019 at 10:11:32 AM, Nils Henrik Lorentzen (
> nils.lorentzen at gmail.com) wrote:
>
> Hi,
>
>
> I am a longtime Java developer just becoming aware of project Loom and its
> lightweight threads. It seems like an idea well worth implementing in the
> core JVM/libraries for easier making scalable server applications.
>
>
>
> From reading the proposal at
> https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html, I do have
> a few concerns though, if I have understood fibers correctly.
>
>
> Not subscribed to the list so sending this as food for thought.
>
>
> These are mainly related to the fact that there will probably be corner
> cases where one has to write async code, and fibers just will not suffice.
> These cases might not be known yet but will surface in the future.
>
>
> What is of concern is the statement "In addition to making concurrent
> applications simpler and/or more scalable, this will make life easier for
> library authors, as there will no longer be a need to provide both
> synchronous and asynchronous APIs for a different simplicity/performance
> tradeoff."
>
>
> I understand this is written with the best intentions (who wouldn't want
> to make life easier for library writers?), and I have no intention to
> criticise the author on this.
>
> What I am wary of here is that this might discourage providing async APIs,
> even at the low level, which will then make it way more difficult to write
> pure async servers if need be. Or one just prefers that way of programming
> (better logs solve much of the no-proper-stacktrace issues, and better log
> capabilities are also a plus in production deployed systems)
>
>
> Consider JDBC as an example, one is now at long last working on providing
> async JDBC drivers that can be useful for high throughput processing and
> reactive/async apps.
>
>
> When it comes to network communication, similarly to what the proposal
> states that async/await can be easily implemented by continuations, so can
> a synchronous network driver API easily be made on top of an asynchronous
> driver. The opposite requires threads to be spawned and that defeats the
> purpose of async for scalability/throughput.
>
> Keep in mind that even for request/response protocols, the base
> communication is always async by nature. There is no blocking operation on
> an ethernet card :) Thus async operation on top of a sync driver means
> async network => sync API => thread to simulate asynchronicity => async
> application, which is a long chain for something that was asynchronous in
> the first place.
>
>
> I would argue that for essential drivers (especially proprietary ones like
> JDBC), one should always implement an async API at the base using NIO and
> then just have a generic sync wrapper on top.
>
>
> Async driver at the core does imply either spawning a thread in the driver
> for its own select() mainloop or an API for integrating NIO Selectors into
> another mainloop (eg. of an application server) but should be manageable.
>
>
> An example of this architecture is Erlang. From what I can tell, socket
> communication is non-blocking and done via message passing between
> processes. The trick (and elegance) of Erlang is that it has a "selective
> receive for messages" and from what I can tell, 'receive' is pretty much
> the only place in all of Erlang that it would suspend lightweight threads
> (probably a setjmp()/longjmp() libc call at that place in its VM).
>
>
> For an async network driver in Java, the blocking API would do
> Object.wait()/notify() for threads. For suspending fibers, the underlying
> sync/async wrapper implementation could continue the fiber when there is
> input (or on writability for writes).
>
>
>
> Just raising a flag here a bit because, even if it is not so now, it
> could become a classic case of groupthink where async becomes discouraged,
> and then at some point one figures one needs it anyway. Except that by
> then all APIs have adopted synchronous functioning, and it would be even
> more difficult to convince someone to provide async network drivers, as
> they would argue that fibers should solve it, so there is no need for it.
>
> Lightweight threads have a bright future but hopefully not at the expense
> of tried and proven patterns for high throughput servers :)
>
>
> Kind regards,
>
> Nils Henrik Lorentzen
>
>