7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events

Alan Bateman Alan.Bateman at oracle.com
Sat May 19 03:26:29 PDT 2012


Some recent changes in the Solaris kernel have exposed performance 
issues that are caused by the way that the /dev/poll based Selector 
uses the driver. These issues have always been there, but were 
completely invisible until now (thanks to several people in the 
Solaris kernel and performance teams for diagnosing the issues).

One issue is with the batch updates to the driver. If the update contains 
more than one pollfd entry for a file descriptor then the events are 
OR'ed. To work around this, the original Selector implementation 
inserts a POLLREMOVE event before each update. This just happens to 
work, but isn't guaranteed. The impact is that the driver spends a 
lot of time scanning for entries that are not present. A second issue is 
that the deregistration step in the Selector writes the POLLREMOVE after 
the file descriptor has been closed, again leading to additional 
overhead in the driver. A third issue relates to file descriptors that 
are registered with an event mask of 0, again leading to more 
performance issues.
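
To make the OR'ing problem concrete, here is a toy model in Java. It is 
not the driver or the Selector code; the class and method names are made 
up for illustration, and the constants are the usual poll.h values. It 
only shows what happens when one batch carries two entries for the same 
file descriptor:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BatchOrDemo {
    // Usual values from poll.h; treated as illustrative here.
    public static final int POLLIN = 0x0001;
    public static final int POLLOUT = 0x0004;

    // Toy model of one batched write: entries for the same fd are OR'ed.
    // Each batch entry is {fd, events}.
    public static Map<Integer, Integer> effectiveEvents(int[][] batch) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int[] entry : batch) {
            result.merge(entry[0], entry[1], (a, b) -> a | b);
        }
        return result;
    }

    public static void main(String[] args) {
        // The application changed its interest from POLLIN to POLLOUT
        // between two selects, so both entries land in the same batch.
        int[][] batch = { {5, POLLIN}, {5, POLLOUT} };
        System.out.println(effectiveEvents(batch)); // prints {5=5}
    }
}
```

The intent was POLLOUT only (4), but the effective mask is 
POLLIN|POLLOUT (5). That is the effect the inserted POLLREMOVE was 
trying to avoid.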

The webrev with a patch to fix these issues is here:

http://cr.openjdk.java.net/~alanb/7169050/webrev/

In summary:

1. Pending updates are queued as before except that at most one update 
is pending for a file descriptor (to eliminate the effects of OR'ing). 
A bit set is used to indicate if a file descriptor is in the update 
list, so it should only very rarely be necessary to scan the update 
list for the file descriptor.

2. The release (deregistration) no longer queues a POLLREMOVE but 
instead writes the POLLREMOVE immediately (after dropping any pending 
update for the file descriptor). This means the descriptor is removed 
from the driver before the file descriptor is closed. This is similar to 
how we did this in the epoll based Selector.

3. The batch update no longer inserts a POLLREMOVE before each update 
(not needed now). Additionally it removes the file descriptor when the 
event mask is changed to 0 (this is also something we do in the epoll 
Selector).
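
The three points above can be sketched roughly as follows. This is a 
simplified illustration of the description, not the code in the webrev: 
the class name, method names and the driverWrites list (a stand-in for 
writes to /dev/poll) are all hypothetical, and it does not track whether 
a descriptor was ever registered with the driver.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class UpdateQueueSketch {
    // Value as defined on Solaris in sys/poll.h; illustrative here.
    public static final int POLLREMOVE = 0x0800;

    // Hypothetical stand-in for writes to the driver: {fd, events}.
    public final List<int[]> driverWrites = new ArrayList<>();

    private final List<int[]> pending = new ArrayList<>(); // {fd, events}
    private final BitSet inUpdateList = new BitSet();

    // (1) Queue an interest change, at most one pending entry per fd.
    public void setInterest(int fd, int events) {
        if (inUpdateList.get(fd)) {
            // Rare path: find the existing entry and overwrite it.
            for (int[] u : pending) {
                if (u[0] == fd) {
                    u[1] = events;
                    return;
                }
            }
        }
        pending.add(new int[] {fd, events});
        inUpdateList.set(fd);
    }

    // (2) Deregister: drop any pending update, then write POLLREMOVE
    // immediately, before the caller closes the descriptor.
    public void release(int fd) {
        if (inUpdateList.get(fd)) {
            pending.removeIf(u -> u[0] == fd);
            inUpdateList.clear(fd);
        }
        driverWrites.add(new int[] {fd, POLLREMOVE});
    }

    // (3) Flush the batch: no POLLREMOVE is inserted before each
    // update, and an event mask of 0 is written as a removal.
    public void flush() {
        for (int[] u : pending) {
            int events = (u[1] == 0) ? POLLREMOVE : u[1];
            driverWrites.add(new int[] {u[0], events});
        }
        pending.clear();
        inUpdateList.clear();
    }
}
```

For example, calling setInterest(5, POLLIN) and then setInterest(5, 
POLLOUT) before a flush leaves a single pending entry for descriptor 5, 
so the batch never contains duplicates for the driver to OR together.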

That's mostly it. I should explain that I went through a couple of 
iterations on this (and thanks to Joy Xiong from the Solaris Performance 
team for running benchmarks on several test builds). Previous iterations 
included writing single updates rather than in batches, and changing the 
update list into a Map that is keyed on the file descriptor. There may 
be further tuning later but for now this addresses the major issues.

-Alan

