Data server design questions

Tom Lasseter t.lasseter at
Tue Sep 4 21:20:25 PDT 2012

It would be interesting to see Paul's code to get a better understanding of
the actual data flow and see if your concerns are justified.  I would hope
the test was more realistic and useful than you imply, but it's hard for me
to judge without seeing the code.  Paul seems to be saying that now that we
have hardware with more threads available and more efficient context
switching, it's difficult for asynchronous IO to provide a better software
solution than what can be achieved with hardware.  It would be nice to come
up with a test that you were satisfied is testing the various approaches on
the same hardware configuration.  It wouldn't surprise me if, as you imply in
your note, the results depend very much on the data sizes and the data
operations that are performed.  It might be that I will have to test these
two very different approaches to determine which is optimal for my case, and
learn that there is no generalization that can be or should be made....



-----Original Message-----
From: Zhong Yu [mailto:zhong.j.yu at] 
Sent: Tuesday, September 04, 2012 6:03 PM
To: Tom Lasseter
Cc: nio-dev at
Subject: Re: Data server design questions

About the claim that traditional socket IO is 30% faster than Selector-based
NIO: it's unclear how they did the benchmark. At that time I also did some
benchmarking myself and got the same 30% number. However, the test case was
too simplistic. Basically, on the same machine, some clients write() data to
the server and the server does read(). Measuring the throughput, the blocking
version is 30% higher than the non-blocking version.

There's a critical flaw though. The server calls read(byte[]) or
read(ByteBuffer), but doesn't do anything with the data received.
That's an extremely high-throughput but terribly useless server! As soon as
the server does the most basic thing with the data - reading every byte once
- the throughput drops significantly, and there's not much difference
between blocking and non-blocking any more.

If the 30% number they got is also from such extreme and unrealistic setups,
it has little value for evaluating real server performance.
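
A minimal sketch of the kind of test described above, for illustration only.
This is not the benchmark code from either test, which was not posted. It
covers only the blocking side: one client write()s a fixed number of bytes over
loopback, and the server read()s them either without looking at the data or
while touching every byte once. The class name ReadBenchSketch, the Result
type, the 64 KB buffer size and the 256 MB payload are all assumptions made for
this sketch. A fair comparison would also need a Selector-based variant run on
the same hardware.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

class ReadBenchSketch {

    /** Outcome of one run: bytes received, sum of the bytes seen, elapsed time. */
    static final class Result {
        final long bytes;
        final long checksum;
        final long nanos;

        Result(long bytes, long checksum, long nanos) {
            this.bytes = bytes;
            this.checksum = checksum;
            this.nanos = nanos;
        }
    }

    /**
     * Sends totalBytes from a client thread to a blocking server on loopback.
     * If touchBytes is true, the server reads every received byte once
     * (a trivial checksum); otherwise it discards the data unread.
     */
    static Result run(long totalBytes, boolean touchBytes) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            int port = server.getLocalPort(); // ephemeral port chosen by the OS

            Thread client = new Thread(() -> {
                try (Socket s = new Socket(loopback, port);
                     OutputStream out = s.getOutputStream()) {
                    byte[] chunk = new byte[64 * 1024];
                    Arrays.fill(chunk, (byte) 1);
                    long left = totalBytes;
                    while (left > 0) {
                        int n = (int) Math.min(chunk.length, left);
                        out.write(chunk, 0, n);
                        left -= n;
                    }
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            client.start();

            long received = 0;
            long checksum = 0;
            long start = System.nanoTime();
            try (Socket s = server.accept();
                 InputStream in = s.getInputStream()) {
                byte[] buf = new byte[64 * 1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    received += n;
                    if (touchBytes) {
                        // The "most basic thing": read every byte once.
                        for (int i = 0; i < n; i++) {
                            checksum += buf[i];
                        }
                    }
                }
            }
            long elapsed = System.nanoTime() - start;
            client.join();
            return new Result(received, checksum, elapsed);
        }
    }

    public static void main(String[] args) throws Exception {
        long total = 256L * 1024 * 1024; // assumed payload size
        for (boolean touch : new boolean[] {false, true}) {
            Result r = run(total, touch);
            double mbPerSec = (r.bytes / 1e6) / (r.nanos / 1e9);
            System.out.printf("touchBytes=%b: %.1f MB/s%n", touch, mbPerSec);
        }
    }
}
```

Running main prints one throughput figure per mode. The numbers will vary with
the machine, the JVM and its warm-up, so a single run should not be treated as
a measurement.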

Zhong Yu

On Tue, Sep 4, 2012 at 3:04 PM, Tom Lasseter <t.lasseter at> wrote:
> I am developing a high-performance cloud data server primarily for the 
> display of large engineering and scientific datasets.  The client-side 
> is a rich Java application though a client browser is possible if the 
> GUI were limited.
> The challenge is the diverse libraries and technologies that are
> I have spent weeks analyzing and prototyping.  The decision on what 
> path to take is a complex one requiring knowledge of not just the 
> detailed software architecture of the various design possibilities, 
> but projections/predictions of how effectively they will function in a 
> heavy-load environment.
> Despite the fact that most questions on this open-jdk list are 
> low-level, I believe this is the right venue for my questions as the 
> experts who understand the code well enough to design and improve it 
> are the only ones who can expertly answer the questions I'm posing.
> A data server is pretty general, so I will be more explicit about the 
> requirements for this one.
> The data consists of ASCII files which are JSON-like key-value files 
> each of which describes an object.  Associated with each of these 
> object files are a series of binary files which contain the bulk of 
> the data.  These data objects fall into two extreme groups: 1) large 
> 3D/4D volumes from which the user wishes to extract a subset of the 
> data; 2) smaller objects of which the user may wish to completely display
> many thousands.
> For the large volume files, the user is often moving a cursor through 
> the volume, so the server will need to be constantly delivering new 
> segments selected.  It makes sense to have a server thread monitoring 
> requests for this volume which may be coming simultaneously from 
> multiple users and starting worker threads to deliver the requests.  
> Client-server sockets should stay alive for some period of time and 
> the associated server work thread might be kept alive as well until the
> user ceases activity.
> For the large sets of smaller object files, the options are to make 
> individual requests for each file and have the client process and 
> display the data as received, or to make a request for all the objects 
> and have the server send them sequentially on a single socket.  The 
> optimization here is a bit tricky:  the IO can be running on several 
> sockets or sequentially on a single socket, and the processing on both 
> server and client ends could be run on single or multiple threads in
> either case.
> QUESTION 1: Looking at these requirements and scenarios, how would you 
> design this client-server system in general terms?
> An interesting analysis and presentation was put together by Paul Tyma:
> "Thousands of Threads and Blocking I/O: The old way to write Java 
> Servers is New again (and way better)"
> where he makes the case that asynchronous IO was developed because of 
> performance issues in creating and switching threads, but that 
> hardware improvements have since significantly reduced these problems.  
> His comparison shows that the multi-thread blocking IO is simpler and 
> more efficient than asynchronous IO.  QUESTION 2:  What are your 
> comments on Paul's analysis and has it changed since his analysis and 
> presentation (2008)?
> Looking at the technologies out there is quite confusing.  Here are 
> the main ones I've studied in detail.
> 1)      Java 7 with its outstanding NIO.2 API;  Anghel Leonard's book
> describing many of its features is amazing: there are dozens of 
> examples which can be rapidly loaded and tested and they all work (!); 
> the problem is that no one seems to be adopting the Java 7 socket 
> APIs; many such as the Netty group say it's somewhat incompatible and 
> will be adopted at a low level under the existing Netty API;  Jetty is 
> not supporting asynchronous IO at all (I don't believe); QUESTION 3:  
> What are the issues with Java 7 socket APIs that are inhibiting their
> 2)      Apache MINA; a lightweight server with an FTP server implemented;
> uses blocking IO;
> 3)      Netty asynchronous NIO; excellent design for an event-driven
> client-server system; does not have a TCP file-server implementation 
> comparable to FTP;
> 4)      Kaazing socket gateway server; closed source, not asynchronous;
> 5)      Waarp FTP and R66 file servers; heavy-duty file server built on
> Netty by the French government; impressive and complex.
> QUESTION 4:  Are there other projects and technologies I should be aware
> It seems to me that the best solution is to use Java 7 and use 
> simple blocking IO for the application I've described.  The only issue 
> is reliability under heavy-load on a cloud server system.
> QUESTION 5: Can I rely on the cloud server itself (such as Amazon EC2) 
> to provide me all the bells-and-whistles needed such as 
> authentication, load-balancing, etc and do so reliably with the simple 
> server I'm envisioning? QUESTION 6:  Do any of the other technologies 
> discussed above provide capabilities which cannot be easily built from
> Thank you very much for any and all input!
> Tom Lasseter
> Email:  tom at
> Website:
