RFR: 8293067: (fs) Implement WatchService using system library (macOS)

Michael Hall mik3hall at gmail.com
Sat Nov 19 13:13:40 UTC 2022



> On Nov 16, 2022, at 8:49 PM, Michael Hall <mik3hall at gmail.com> wrote:
> 
> 
> 
>> On Nov 16, 2022, at 8:44 PM, Michael Hall <mik3hall at gmail.com> wrote:
>> 
>> 
>>> 
>>> @AlanBateman still can't reproduce any of that on my hardware, 
>>> 
>> 
>> Same here using ‘make test’, as well as with my own modified test.
>> 
>> Passed: java/nio/file/WatchService/LotsOfEvents.java
>> 
>>> I've been backed up with other things and am finally getting to look at this more closely. For testing, I tried both release and debug builds across a range of macOS releases on both x64 and aarch64. Unfortunately there are a lot of timeouts and intermittent failures across quite a range of macOS releases (from 10.15 to 12.2).
>> 
>> Doesn’t this sort of sound like a threading/deadlock type issue? Intermittent - timeouts. Maybe your runloop concerns were well founded.
> 
> Or does this possibly go back to the earlier discussed file descriptors running out issues?
> 
> FSEvents API leaks file descriptors (KQUEUE)
> https://stackoverflow.com/questions/20311184/fsevents-api-leaks-file-descriptors-kqueue

I changed my modified LotsOfEvents to run 20 times and, checking with lsof, saw no indication that file descriptors were being leaked.

All 20 runs succeeded. Are there any suggestions as to what is being done differently in the setups that see frequent errors, as opposed to those that see no failures?
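
For anyone who wants to repeat the descriptor check without lsof, here is a minimal sketch. It is not my modified LotsOfEvents; the class name FdLeakCheck and its methods are made up for illustration. It assumes a Unix-like JDK where the platform bean is a com.sun.management.UnixOperatingSystemMXBean, and it only opens and closes a WatchService in a loop rather than generating events.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchService;

// Hypothetical helper, not part of the JDK tests.
public class FdLeakCheck {

    // Open descriptor count for this process, or -1 if the platform does not expose it.
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    // Opens, registers and closes a WatchService 'iterations' times,
    // recording the descriptor count after each close.
    static long[] sample(int iterations) throws IOException {
        Path dir = Files.createTempDirectory("fdcheck");
        long[] counts = new long[iterations];
        try {
            for (int i = 0; i < iterations; i++) {
                try (WatchService ws = dir.getFileSystem().newWatchService()) {
                    dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE);
                }
                counts[i] = openFds();
            }
        } finally {
            Files.deleteIfExists(dir);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        long[] counts = sample(20);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("run " + (i + 1) + ": " + counts[i] + " open descriptors");
        }
    }
}
```

A count that keeps climbing from run to run would point at a leak; a flat count after the first iteration or two (class loading can open files early on) would not.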

Again,
WatchService: Run loop 600003a5c500 - waiting for event source...
----------System.err:(12/655)----------
java.lang.RuntimeException: Key not signalled (unexpected)
	at LotsOfEvents.drainAndCheckOverflowEvents(LotsOfEvents.java:112)
	at LotsOfEvents.testOverflowEvent(LotsOfEvents.java:84)
	at LotsOfEvents.main(LotsOfEvents.java:51)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
	at java.base/java.lang.Thread.run(Thread.java:1591)

The run loop appears to go into a wait for something that never happens. I think I saw that LotsOfEvents may have different timeout limits than the test framework.
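
To illustrate how a test-level timeout differs from the framework's, here is a rough sketch of a timed poll that gives up with the same message as in the trace above. This is not the LotsOfEvents source; KeyTimeoutDemo, awaitKey and the timeout values are assumptions chosen for illustration. The only JDK behaviour relied on is that WatchService.poll(timeout, unit) returns null when no key is signalled in time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not taken from the JDK test suite.
public class KeyTimeoutDemo {

    // Waits up to timeoutMillis for a key; fails the way a test would if none arrives.
    static WatchKey awaitKey(WatchService watcher, long timeoutMillis)
            throws InterruptedException {
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            throw new RuntimeException("Key not signalled (unexpected)");
        }
        return key;
    }

    // Registers a directory, generates no events, and reports whether the wait timed out.
    static boolean timesOutWithoutEvents(long timeoutMillis)
            throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("keytimeout");
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            try {
                awaitKey(watcher, timeoutMillis);
                return false; // a key arrived, which is not expected here
            } catch (RuntimeException expected) {
                return true;  // poll returned null: the test-level timeout fired
            }
        } finally {
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timed out: " + timesOutWithoutEvents(200));
    }
}
```

A failure of this kind is raised by the test itself once its own wait expires, which is separate from the test framework killing an agent after its much longer limit.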

WatchService: Run loop 60000
result: Error. Agent error: java.lang.Exception: Agent 8 timed out with a timeout of 480 seconds; check console log for any additional details

This one appears to be the test framework issuing the timeout. Eight minutes seems more like a deadlock. Again, it follows a run loop trace message.

It seems like the run loop part of the code would be worth looking into. I never had a good grasp of run loops.
I’m hoping that some change eliminates the errors, but being unable to reproduce them, I don’t know that I would want to try this myself.





More information about the nio-dev mailing list