Bug in sun.nio.ch.SolarisEventPort#port_dissociate
David M. Lloyd
david.lloyd at redhat.com
Wed Jun 14 14:32:51 UTC 2017
On 06/14/2017 09:29 AM, Alan Bateman wrote:
>
>
> On 14/06/2017 14:35, David M. Lloyd wrote:
>> There's a bug in sun.nio.ch.SolarisEventPort#port_dissociate which
>> manifests as an IOException like this:
>>
>> Exception in thread "default I/O-30" java.lang.InternalError: java.io.IOException: File descriptor in bad state
>> at sun.nio.ch.EventPortWrapper.release(EventPortWrapper.java:235)
>> at sun.nio.ch.EventPortSelectorImpl.implDereg(EventPortSelectorImpl.java:144)
>> at sun.nio.ch.SelectorImpl.processDeregisterQueue(SelectorImpl.java:149)
>> at sun.nio.ch.EventPortSelectorImpl.doSelect(EventPortSelectorImpl.java:75)
>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>> at org.xnio.nio.WorkerThread.run(WorkerThread.java:528)
>> Caused by: java.io.IOException: File descriptor in bad state
>> at sun.nio.ch.SolarisEventPort.port_dissociate(Native Method)
>> at sun.nio.ch.EventPortWrapper.release(EventPortWrapper.java:233)
>> ... 6 more
>>
>> The problem was observed in:
>>
>> java version "1.8.0_121"
>> Solaris versions 10 and 11
>>
>> But I think it also exists in 9. The problem appears to be that the
>> Java_sun_nio_ch_SolarisEventPort_port_1dissociate function in
>> SolarisEventPort.c is checking for ENOENT but not EBADFD. I'm not
>> sure if the ENOENT check is needed (the associate variant does not
>> check for it even though it's specified to be a possible return), but
>> empirically I conclude that the EBADFD check is.
>>
> The timing here is good, as I think the JDK should move to using the
> port based Selector as the default on Solaris (it still uses the
> /dev/poll based Selector by default). Maybe JDK 10 is the right thing to
> attempt this switch.
>
> Can you say anything on when this happens? Can you distill it down to a
> small test case?

It's coming from a user, so my information is limited, but I can establish
that it is happening under load, and I think it corresponds to an open
socket being abruptly closed in another thread.

I am not sure whether I can get it down to a test case, though. I'll see
if I can get access to a Solaris system for testing.

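For what it's worth, the errno handling described in the quoted report could be sketched roughly as below. This is only an illustration, not the actual SolarisEventPort.c code: classify_dissociate and the dissociate_outcome enum are names made up for this example, and treating EBADFD the same way as ENOENT is an assumption about the desired behavior, not something verified against the JDK sources or on a Solaris system.

```c
#include <errno.h>

/* Hypothetical names for illustration only; nothing here is JDK API. */
enum dissociate_outcome {
    DISSOCIATED,     /* the call succeeded */
    NOT_ASSOCIATED,  /* nothing to undo, not worth an exception */
    FAILED           /* a real error the caller should report */
};

/*
 * Classify the result of a port_dissociate()-style call.
 * res is the call's return value, err is the errno saved right after it.
 */
enum dissociate_outcome classify_dissociate(int res, int err)
{
    if (res != -1) {
        return DISSOCIATED;
    }
    if (err == ENOENT) {
        /* The object was not associated with the port. */
        return NOT_ASSOCIATED;
    }
#ifdef EBADFD
    /* EBADFD is not defined on every platform, hence the guard. */
    if (err == EBADFD) {
        /* Assumption: the descriptor was already closed, for example
         * by another thread, so there is no association left. */
        return NOT_ASSOCIATED;
    }
#endif
    return FAILED;
}
```

A native wrapper would pass in the return value of port_dissociate() together with the saved errno, and raise an IOException only for FAILED.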
--
- DML