Performance Questions and Poller Implementation in Project Loom

robert engels rengels at ix.netcom.com
Fri Nov 10 00:12:51 UTC 2023


Btw, I had a chance to check out Helidon 4.0 and the results are mixed. I really like the programming model.

In a very simple (and probably not very meaningful or scientific) test, github.com/robaho/httpserver outperforms it by 2x. I couldn’t get Helidon to work on Linux (see https://github.com/helidon-io/helidon/issues/7983), which is strange given that the two have very similar non-async designs.

The robaho httpserver achieves more than 6 GB/sec on Linux under the same conditions reported in the issue. Hopefully the Helidon team checks it out and makes some recommendations, or at least provides some reasoning around the performance issues.
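
As a rough illustration only (not the exact setup from the test above), the kind of trivial plain-text endpoint such a comparison uses might look like the sketch below. It is written against the JDK’s built-in com.sun.net.httpserver API, which robaho/httpserver mirrors, and runs each request on its own virtual thread. The port, path, and payload are placeholder assumptions.

import com.sun.net.httpserver.HttpServer;

import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;

public class PlainTextServer {
    public static void main(String[] args) throws Exception {
        byte[] body = "Hello, World!".getBytes(StandardCharsets.UTF_8);

        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/plaintext", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "text/plain");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        // Handle each request on its own virtual thread; blocking I/O in the
        // handler parks the virtual thread rather than tying up a platform thread.
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
        server.start();
    }
}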

Hopefully projects like this signal that the days of Java async are over, and that we are finally back to the model that made Java so successful in the first place.

Awesome work, Loom team!

> On Nov 2, 2023, at 4:05 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> 
> On 01/11/2023 20:13, Ilya Starchenko wrote:
>> :
>> 
>> Firstly, with the number of ReadPollers and WritePollers now being based on the number of hardware threads, could you please clarify whether the MasterPoller is also there to keep the polling mechanism non-blocking? (When there are no events, do we simply park?)
>> 
>> Secondly, from what I can gather, the Poller itself is used only for polling events (obviously) and for parking/unparking virtual threads. However, the actual reading from and writing to the OS buffers is performed by the virtual threads themselves, which can be carried by any platform thread. Isn’t there a performance cost in this case, because we lose data locality (due to virtual threads being carried by any platform thread, without affinity)? Have you considered moving all network operations onto platform threads and reserving virtual threads solely for executing user code when data is ready?
>> 
>> I apologize if I've misunderstood any aspects of the architecture, and I'm genuinely trying to gain a better understanding of how it all works.
>> 
> In this mode, the read and write pollers do not block waiting for events. If there are no events then a poller will register with the master poller and park. So think of the master poller as the lender of last resort.
> 
> I think you may have missed that unparking queues the virtual thread to continue on the same carrier that the poller is mounted on. The poller yields, so it queues itself to continue on the same carrier. In a busy system this works quite well; less so when the system is less busy, as work stealing may mean the virtual thread and the poller move to other carriers.
> 
> -Alan
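
To illustrate what Alan describes above from the application side (a rough sketch of usage, not of Loom’s internals): a virtual-thread-per-connection echo server doing plain blocking reads. When read() would otherwise block, the JDK registers the socket with a read poller and parks the virtual thread; the poller later unparks it, and in a busy system the continuation is queued on the same carrier the poller is mounted on. The port and buffer size below are arbitrary.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EchoServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket listener = new ServerSocket(9000);
             ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            while (true) {
                Socket socket = listener.accept();
                pool.submit(() -> echo(socket));   // one virtual thread per connection
            }
        }
    }

    static void echo(Socket socket) {
        try (socket;
             InputStream in = socket.getInputStream();
             OutputStream out = socket.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            // read() blocks only the virtual thread; while it is parked waiting on
            // the read poller, the carrier thread is free to run other work.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (Exception e) {
            // connection closed or reset; nothing else to do in this sketch
        }
    }
}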
