7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
Alan Bateman
Alan.Bateman at oracle.com
Sat May 19 03:26:29 PDT 2012
Some recent changes in the Solaris kernel have exposed performance
issues that are caused by the way that the /dev/poll based Selector
uses the driver. These issues have always been there, but were
completely invisible until now (thanks to several people in the
Solaris kernel and performance teams for helping to diagnose them).
One issue is the batch updates to the driver. If the update contains
more than one pollfd entry for a file descriptor then the events are
OR'ed. To work around this, the original Selector implementation
inserts a POLLREMOVE event before each update. This just happens to
work, but isn't guaranteed. The impact is that the driver spends a
lot of time scanning for entries that are not present. A second issue is
that the deregistration step in the Selector writes the POLLREMOVE after
the file descriptor has been closed, again leading to additional
overhead in the driver. A third issue relates to file descriptors that
are registered with an event mask of 0, again leading to more
performance issues.
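To make the first issue concrete, here is a minimal sketch of the OR'ing
effect. It is a toy model, not the driver: the class and method names are
made up for illustration, and the constants are the values I believe
Solaris <sys/poll.h> uses, so treat them as assumptions. It only shows
what happens to duplicate entries for one file descriptor within a
single batch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the OR'ing of duplicate pollfd entries within one
// batch, as described above. It does not talk to /dev/poll.
final class BatchOrDemo {
    // Assumed values (believed to match Solaris <sys/poll.h>); illustrative.
    static final int POLLIN = 0x0001;
    static final int POLLOUT = 0x0004;

    /**
     * Returns the effective event mask per file descriptor for one batch,
     * given parallel arrays of descriptors and requested events.
     */
    static Map<Integer, Integer> oredPerFd(int[] fds, int[] events) {
        Map<Integer, Integer> effective = new HashMap<>();
        for (int i = 0; i < fds.length; i++) {
            // Duplicate entries for the same fd are OR'ed, not replaced.
            effective.merge(fds[i], events[i], (a, b) -> a | b);
        }
        return effective;
    }
}
```

With this model, a batch that first asks for POLLIN on descriptor 5 and
then for POLLOUT on descriptor 5 ends up with POLLIN|POLLOUT for that
descriptor, whereas the Selector intended the second entry to replace
the first.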
The webrev with a patch to fix these issues is here:
http://cr.openjdk.java.net/~alanb/7169050/webrev/
In summary:
1. Pending updates are queued as before except that at most one update
is pending for a file descriptor (to eliminate the effects of OR'ing).
A bit set is used to indicate if a file descriptor is in the update
list. Only very rarely should it need to scan the update list for the
file descriptor.
2. The release (deregistration) no longer queues a POLLREMOVE but
instead writes the POLLREMOVE immediately (after dropping any pending
update for the file descriptor). This means the descriptor is removed
from the driver before the file descriptor is closed. This is similar to
how we did this in the epoll based Selector.
3. The batch update no longer inserts a POLLREMOVE before each update
(not needed now). Additionally it removes the file descriptor when the
event mask is changed to 0 (this is also something we do in the epoll
Selector).
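The three points above can be sketched roughly as follows. This is not
the code in the webrev; it is a simplified model under stated
assumptions: the class name, the method names, and the "written" list
(which stands in for writes to the driver) are all hypothetical, and
the POLLREMOVE value is the one I believe Solaris uses.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Simplified model of the update handling summarised above. Hypothetical
// names throughout; "written" records what would be written to the driver.
final class PendingUpdates {
    static final int POLLREMOVE = 0x0800; // assumed value, illustrative

    private final List<int[]> updates = new ArrayList<>(); // {fd, events}
    private final BitSet queued = new BitSet();            // fd is in updates?
    private final List<int[]> written = new ArrayList<>(); // stand-in for driver

    // Point 1: at most one pending update per file descriptor.
    void setInterest(int fd, int events) {
        if (queued.get(fd)) {
            // Rare path: the fd already has a pending update, so replace it.
            for (int[] u : updates) {
                if (u[0] == fd) { u[1] = events; return; }
            }
        }
        updates.add(new int[] {fd, events});
        queued.set(fd);
    }

    // Point 2: drop any pending update, then write POLLREMOVE immediately,
    // so the caller can close the descriptor afterwards.
    void release(int fd) {
        if (queued.get(fd)) {
            updates.removeIf(u -> u[0] == fd);
            queued.clear(fd);
        }
        written.add(new int[] {fd, POLLREMOVE});
    }

    // Point 3: no POLLREMOVE before each update; an event mask of 0 is
    // turned into a removal instead of being registered.
    void flush() {
        for (int[] u : updates) {
            int events = (u[1] == 0) ? POLLREMOVE : u[1];
            written.add(new int[] {u[0], events});
            queued.clear(u[0]);
        }
        updates.clear();
    }

    int pendingCount() { return updates.size(); }
    List<int[]> written() { return written; }
}
```

In this model, changing the interest set of the same descriptor twice
before a flush leaves a single pending entry, so the batch never
contains duplicates for the driver to OR together.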
That's mostly it. I should explain that I went through a couple of
iterations on this (and thanks to Joy Xiong from the Solaris Performance
team for running benchmarks on several test builds). Previous iterations
included writing single updates rather than in batches, and changing the
update list into a Map that is keyed on the file descriptor. There may
be further tuning later but for now this addresses the major issues.
-Alan