WatchService questions

Alan Bateman Alan.Bateman at Sun.COM
Sat Dec 5 12:48:53 PST 2009


John Hendrikx wrote:
> Hi, I'm making good use of the WatchService for a file manager type 
> program, and I'm finding that I have some questions that I cannot 
> glean from the documentation or examples.
It's good for us to get feedback from folks that are using this API in 
anger.

>
> Let me give some quick background.  I'm using the WatchService to 
> monitor a directory.  This directory sometimes is heavily modified 
> (like deleting 7000 files, copying lots of files etc).  I've been 
> having some trouble during testing with high CPU use resulting from 
> lots and lots of updates from the WatchService, which led me to these 
> questions.
>
> 1) How should OVERFLOW be handled?  If it occurs, will it be the first 
> event returned from pollEvents, and if so, can I safely reset the key 
> after handling it?  If not the first event, would it be wise to scan 
> for an OVERFLOW and then just reset the key?
Are you seeing OVERFLOW events?  Our implementations are reasonably 
efficient at draining the events from the kernel so I would expect it to 
be very rare that the overflow is caused by reaching the kernel limits. 
There is a second limit in the watch service that is another source of 
OVERFLOW events. If the application doesn't retrieve the events in a 
timely manner then events will accumulate. If the same files are 
modified many times then it's not a problem because it just increments 
the modification count. Create/delete, on the other hand,  are distinct 
events and so will queue up. Our implementation limits the number of 
distinct events pending per directory to 512. That limit was chosen 
arbitrarily and unfortunately isn't yet configurable. I can't say if 
this is what you are running into but if you are seeing a lot of 
OVERFLOW events then I'll bet that is the issue. Looking at the code 
snippet below it appears that the thread that waits for keys to be 
signalled also waits for the AWT event thread to retrieve and process 
the events. Could this be changed to retrieve the events and process 
them asynchronously on the AWT event thread (with invokeLater)?
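
A minimal sketch of that hand-off, assuming a dedicated watcher thread. The names AsyncWatcher, drain, uiExecutor and handler are hypothetical and not part of the API; the Executor stands in for the AWT event thread so that the watcher thread never blocks on UI work.

```java
import java.nio.file.ClosedWatchServiceException;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical helper: drains keys on its own thread and hands each batch
// of events to the UI thread without waiting for it to be processed.
final class AsyncWatcher implements Runnable {
    private final WatchService watcher;
    private final Executor uiExecutor;                    // e.g. SwingUtilities::invokeLater
    private final Consumer<List<WatchEvent<?>>> handler;  // runs on the UI thread

    AsyncWatcher(WatchService watcher, Executor uiExecutor,
                 Consumer<List<WatchEvent<?>>> handler) {
        this.watcher = watcher;
        this.uiExecutor = uiExecutor;
        this.handler = handler;
    }

    // Retrieves the pending events, resets the key so it can be signalled
    // again, then queues the batch for the UI. Returns false if the key is
    // no longer valid (for example, the directory was deleted).
    static boolean drain(WatchKey key, Executor uiExecutor,
                         Consumer<List<WatchEvent<?>>> handler) {
        List<WatchEvent<?>> events = key.pollEvents();
        boolean valid = key.reset();
        uiExecutor.execute(() -> handler.accept(events));
        return valid;
    }

    @Override
    public void run() {
        try {
            while (true) {
                WatchKey key = watcher.take();   // blocks until a key is signalled
                drain(key, uiExecutor, handler);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt and exit
        } catch (ClosedWatchServiceException e) {
            // The watch service was closed; nothing more to do.
        }
    }
}
```

In a Swing application this could be started with something like new Thread(new AsyncWatcher(watcher, SwingUtilities::invokeLater, myHandler)).start(), where myHandler is whatever code updates the view.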

As regards dealing with the OVERFLOW event, it is simply an 
indication that events have potentially been lost, so you should refresh 
your view of the directory and simply continue processing events. It may 
not be the first event (it can appear anywhere in the list). The key 
remains valid and so can be reset.
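
One way to sketch this, with hypothetical names (OverflowHandling, refreshView, applyEvent). It assumes that when a batch contains OVERFLOW, a full refresh can replace the whole batch, because every event already retrieved predates the refresh. That is a design choice for this sketch, not something the API mandates.

```java
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.util.List;
import java.util.function.Consumer;

final class OverflowHandling {
    // True if OVERFLOW appears anywhere in the batch, not only first.
    static boolean containsOverflow(List<WatchEvent<?>> events) {
        for (WatchEvent<?> event : events) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                return true;
            }
        }
        return false;
    }

    // Assumption: on OVERFLOW the view is rebuilt from the directory and
    // the rest of the batch is skipped; otherwise each event is applied.
    static void process(List<WatchEvent<?>> events, Runnable refreshView,
                        Consumer<WatchEvent<?>> applyEvent) {
        if (containsOverflow(events)) {
            refreshView.run();
            return;
        }
        for (WatchEvent<?> event : events) {
            applyEvent.accept(event);
        }
    }

    // Drains a signalled key and resets it so later events are delivered.
    // Returns false only if the key has become invalid for another reason.
    static boolean handle(WatchKey key, Runnable refreshView,
                          Consumer<WatchEvent<?>> applyEvent) {
        process(key.pollEvents(), refreshView, applyEvent);
        return key.reset();
    }
}
```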

>
> 2) Is it possible to keep a directory sync'd using the WatchService or 
> are there races/timing issues that may cause updates to be lost?  Like 
> for example, when I handle OVERFLOW (by re-reading the directory), 
> what would be the correct way of making sure I'm not missing any 
> updates?  Currently I re-read the directory before resetting the key 
> again, is this sufficient?  In other words, can I rely on the 
> WatchService to keep my directory up-to-date or should I periodically 
> re-read the directory?
Yes, the watch service will keep your view of the directory in sync and I 
can't think of any issues or bugs that would cause it not to be in sync. 
In other words, it shouldn't be necessary to periodically iterate over 
the directory to refresh your view. There are of course timing issues 
where your view is temporarily out of sync but that shouldn't be a 
problem. When you receive an OVERFLOW, you refresh your view and 
process any subsequent events as normal. I should say that after you 
refresh you may process a number of pending events that you will likely 
ignore -- for example, suppose an ENTRY_CREATE event is queued after an 
OVERFLOW event. When you refresh, your view will see the new file and 
so the subsequent ENTRY_CREATE event will not update your view. Does 
that make sense?
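
The redundant-event case is harmless as long as applying an event is idempotent. A minimal sketch, assuming the view is just a set of file names; DirectoryView and its methods are hypothetical names for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory view of one directory, keyed by file name.
final class DirectoryView {
    private final Path dir;
    private final Set<Path> names = new HashSet<>();

    DirectoryView(Path dir) {
        this.dir = dir;
    }

    // Rebuilds the view from the directory, e.g. after an OVERFLOW event.
    void refresh() throws IOException {
        names.clear();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName());
            }
        }
    }

    // Applies a single event. Adding a name that is already present (or
    // removing one that is absent) leaves the set unchanged, so an event
    // that the refresh already accounted for has no effect.
    void apply(WatchEvent.Kind<?> kind, Path name) {
        if (kind == StandardWatchEventKinds.ENTRY_CREATE) {
            names.add(name);
        } else if (kind == StandardWatchEventKinds.ENTRY_DELETE) {
            names.remove(name);
        }
    }

    Set<Path> names() {
        return Collections.unmodifiableSet(names);
    }
}
```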

>
> 3) Is there any way to throttle the amount of updates?  Currently I 
> just delay for 1 second in the loop that keeps an eye on the 
> WatchService -- without this, the handling of all the events (during 
> Copy or Deletes) consumes so much CPU that some threads (Swing 
> updates) are starved for CPU time.  This may also be my own slow 
> handling of the incoming events, but I did notice that ENTRY_MODIFY 
> can be sent a lot during copying (once for every few kB copied it 
> seems, which makes for a fun live update display...).  Are 
> ENTRY_MODIFY events consolidated if I simply do not poll as often?  
> Could I throttle only ENTRY_MODIFY?
I'm interested to know which threads are busy. Also, I'm interested to 
know if you've looked at the -verbose:gc output in case there is 
something else going on. It's platform dependent, but on the platforms 
that I think you are on there is a background thread draining the 
kernel buffers, and it should be barely noticeable (even under load).
:
>
>
> I suspect my problem is that I'm handling most of the events on the 
> Swing dispatch thread, and I'm underestimating how much time that is 
> taking me (and how many events are getting generated).  It functions 
> as intended when copying large files, but when many small files are 
> involved, the load is so great that Swing can't do its regular 
> updates any more.
The kernel will generate a lot of events during file copying but that case 
will often just cause the count for the last modification event to be 
incremented. Out of curiosity, do you actually need modification events? 
I probably don't have the full context here but if you just want to 
maintain a list of the files in a directory then the ENTRY_CREATE and 
ENTRY_DELETE events should be sufficient.
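
For illustration, registering without modification events could look like the following sketch. CreateDeleteOnly is a hypothetical helper name; the register call and event kinds are the standard ones from java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

final class CreateDeleteOnly {
    // Registers dir for create and delete events only, so the frequent
    // ENTRY_MODIFY events produced while files are copied are not queued.
    // OVERFLOW may still be delivered and should still be handled.
    static WatchKey register(Path dir, WatchService watcher) throws IOException {
        return dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_DELETE);
    }
}
```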

-Alan.

