WatchService questions

Sun Dec 6 03:53:12 PST 2009

>>
>> Let me give some quick background.  I'm using the WatchService to 
>> monitor a directory.  This directory sometimes is heavily modified 
>> (like deleting 7000 files, copying lots of files etc).  I've been 
>> having some trouble during testing with high CPU use resulting from 
>> lots and lots of updates from the WatchService, which led me to 
>> these questions.
>>
>> 1) How should OVERFLOW be handled?  If it occurs, will it be the 
>> first event returned from pollEvents, and if so, can I safely reset 
>> the key after handling it?  If not the first event, would it be wise 
>> to scan for an OVERFLOW and then just reset the key?
> Are you seeing OVERFLOW events?  Our implementations are reasonably 
> efficient at draining the events from the kernel so I would expect it 
> to be very rare that the overflow is caused by reaching the kernel 
> limits. There is a second limit in the watch service that is another 
> source of OVERFLOW events. If the application doesn't retrieve the 
> events in a timely manner then events will accumulate. If the same 
> files are modified many times then it's not a problem because it just 
> increments the modification count. Create/delete, on the other hand,  
> are distinct events and so will queue up. Our implementation limits 
> the number of distinct events pending per directory to 512. That limit 
> was chosen arbitrarily and unfortunately isn't yet configurable. I 
> can't say if this is what you are running into but if you are seeing a 
> lot of OVERFLOW events then I'll bet that is the issue. Looking at the 
> code snippet below it appears that the thread that waits for keys to 
> be signalled also waits for the AWT event thread to retrieve and 
> process the events. Could this be changed to retrieve the events and 
> process them asynchronously on the AWT event thread (with 
> invokeLater)?
Yes, I'm seeing OVERFLOW events.  I'm not processing them fast enough as 
I'm updating the UI, which admittedly isn't highly optimized.  Buffering 
them is an option, although even such a buffer could overflow.  It would 
however give me the opportunity to deal with a huge burst of events in 
my own manner: for example, clear the buffer if OVERFLOW is received and 
then insert the OVERFLOW event, or even clear the buffer when the events 
exceed a certain maximum and insert my own OVERFLOW.

Anyway, I'm pretty sure the OVERFLOW events are caused by slow retrieval 
as the UI processing involves updating a JTable, so I do not think I'm 
hitting the kernel limit here.  The throttling of events I'm doing is 
also playing a big part in this.
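
A minimal sketch of the application-side buffer described above, assuming a hypothetical class named BoundedEventBuffer (it is not part of java.nio.file). When an OVERFLOW arrives, or when the buffer is already at its chosen capacity, the buffered events are discarded and replaced with a single OVERFLOW marker, so the consumer knows it has to re-read the directory:

```java
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical application-side buffer; not part of the WatchService API. */
final class BoundedEventBuffer {
    /** One buffered change; path is null for the OVERFLOW marker. */
    static final class Change {
        final WatchEvent.Kind<?> kind;
        final Path path;

        Change(WatchEvent.Kind<?> kind, Path path) {
            this.kind = kind;
            this.path = path;
        }
    }

    private final int capacity;
    private final List<Change> pending = new ArrayList<>();
    private boolean overflowed;

    BoundedEventBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Called by the thread that drains the WatchService. */
    synchronized void add(WatchEvent.Kind<?> kind, Path path) {
        if (overflowed) {
            return; // already collapsed; the consumer must re-read anyway
        }
        if (kind == StandardWatchEventKinds.OVERFLOW || pending.size() >= capacity) {
            // Buffered events are useless once anything has been lost.
            pending.clear();
            pending.add(new Change(StandardWatchEventKinds.OVERFLOW, null));
            overflowed = true;
            return;
        }
        pending.add(new Change(kind, path));
    }

    /** Called by the consumer (for example the UI); returns and clears the backlog. */
    synchronized List<Change> drain() {
        List<Change> out = new ArrayList<>(pending);
        pending.clear();
        overflowed = false;
        return out;
    }
}
```

The consumer would call drain(), and if the only element is the OVERFLOW marker, re-read the directory before applying anything added afterwards. The capacity is an arbitrary choice made by the application.
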
>> 2) Is it possible to keep a directory sync'd using the WatchService 
>> or are there races/timing issues that may cause updates to be lost?  
>> Like for example, when I handle OVERFLOW (by re-reading the 
>> directory), what would be the correct way of making sure I'm not 
>> missing any updates?  Currently I re-read the directory before 
>> resetting the key again, is this sufficient?  In other words, can I 
>> rely on the WatchService to keep my directory up-to-date or should I 
>> periodically re-read the directory?
>
> Yes, the watch service will keep your view of directory in sync and I 
> can't think of any issues or bugs that would cause it not to be in 
> sync. In other words, it shouldn't be necessary to periodically 
> iterate over the directory to refresh your view. There are of course 
> timing issues where your view is temporarily out of sync but that 
> shouldn't be a problem. When you receive an OVERFLOW then you refresh 
> your view and process any subsequent events as normal. I should say 
> that after you refresh you may process a number of pending events that 
> you will likely ignore -- for example, suppose an ENTRY_CREATE event is 
> queued after an OVERFLOW event. When you refresh, your view will 
> see the new file and so the subsequent ENTRY_CREATE event will not 
> update your view. Does that make sense?
Yes, I think I understand -- that's also what I've been seeing and I had 
to adapt my event handling to take into account events that were already 
"handled" as part of re-reading the directory -- so it is also important 
to first initialize a WatchKey, then read the directory, not the other 
way around :)  Now that I think about it, this makes a lot of sense and 
allows implementations to keep a directory sync'd pretty easily.
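
The ordering above (register first, then read the directory) can be sketched as follows. The class and method names are hypothetical. The point is that events are applied idempotently, so an ENTRY_CREATE for a file the listing already saw changes nothing:

```java
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Set;

/** Hypothetical helper that keeps a set of file names in sync with a directory. */
final class DirectoryView {
    /** Registers the directory BEFORE listing it, so no change falls in between. */
    static WatchKey registerThenList(Path dir, WatchService watcher, Set<Path> names)
            throws IOException {
        WatchKey key = dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY);
        list(dir, names);
        return key;
    }

    /** Re-reads the directory; also usable when an OVERFLOW is received. */
    static void list(Path dir, Set<Path> names) throws IOException {
        names.clear();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName());
            }
        }
    }

    /** Applies one event; safe to call for changes the listing already reflected. */
    static void apply(Set<Path> names, WatchEvent.Kind<?> kind, Path name) {
        if (kind == ENTRY_CREATE) {
            names.add(name);      // no-op if the listing already saw the file
        } else if (kind == ENTRY_DELETE) {
            names.remove(name);   // no-op if the file was never listed
        }
        // ENTRY_MODIFY does not change the set of names.
    }
}
```
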

What I hadn't realized is that after you receive an OVERFLOW, the 
subsequent events will form a cohesive stream of events again which can 
be used to update a directory after re-reading it.  It does make sense 
to me though to "scan" for the overflow so to speak -- either that or I 
would probably expect an implementation of WatchService to clear all 
pending events before adding the OVERFLOW.  Not only are the pending 
events useless if an overflow occurs, but clearing them would make room 
for subsequent events again (if not, you'd be in a permanent state of 
overflow...?)
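
Scanning a batch for OVERFLOW might look like the sketch below. The class name and the two callbacks (rereadDirectory, applyEvent) are assumptions made for illustration. If any event in the batch is an OVERFLOW, the whole batch is dropped in favour of one re-read. That is safe because the re-read happens after the batch was retrieved:

```java
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper; the callbacks are supplied by the application. */
final class OverflowScan {
    static boolean containsOverflow(List<WatchEvent<?>> events) {
        for (WatchEvent<?> event : events) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the batch was replaced by a single re-read. */
    static boolean process(List<WatchEvent<?>> events,
                           Runnable rereadDirectory,
                           Consumer<WatchEvent<?>> applyEvent) {
        if (containsOverflow(events)) {
            rereadDirectory.run(); // the other events in the batch are redundant
            return true;
        }
        for (WatchEvent<?> event : events) {
            applyEvent.accept(event);
        }
        return false;
    }

    /** Drains one signalled key and resets it; returns false if the key is no longer valid. */
    static boolean handle(WatchKey key,
                          Runnable rereadDirectory,
                          Consumer<WatchEvent<?>> applyEvent) {
        process(key.pollEvents(), rereadDirectory, applyEvent);
        return key.reset();
    }
}
```
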
>> 3) Is there any way to throttle the amount of updates?  Currently I 
>> just delay for 1 second in the loop that keeps an eye on the 
>> WatchService -- without this, the handling of all the events (during 
>> Copy or Deletes) consumes so much CPU that some threads (Swing 
>> updates) are starved for CPU time.  This may also be my own slow 
>> handling of the incoming events, but I did notice that ENTRY_MODIFY 
>> can be sent a lot during copying (once for every few kB copied it 
>> seems, which makes for a fun live update display...).  Are 
>> ENTRY_MODIFY events consolidated if I simply do not poll as often?  
>> Could I throttle only ENTRY_MODIFY?
> I'm interested to know which threads are busy. Also, I'm interested to 
> know if you've looked at the -verbose:gc output in case there is 
> something else going on. It's platform dependent, but on the platforms 
> that I think you are using there is a background thread draining the 
> kernel buffers, and it should be barely noticeable (even under load).
My original assessment was wrong. The Swing Event thread was not starved 
for CPU, but instead was busy handling the WatchService events in the 
invokeAndWait part.  This made it seem that the Event thread was starved 
for CPU (as other parts of the UI weren't updating anymore) while it 
really was consuming the CPU all by itself.

In effect, in the time it took me to handle 60 WatchService events, 513 
(512 + Overflow?) more had accumulated, resulting in seconds-long delays 
for the UI to become responsive again.
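
A rough sketch of the asynchronous hand-off suggested earlier in the thread, assuming a hypothetical WatchPump class. The watcher thread copies the pending events, resets the key straight away, and passes the batch to an Executor without waiting for it. In a Swing application the executor could be SwingUtilities::invokeLater. It is left as a parameter here so that the sketch does not depend on a running event thread:

```java
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical helper; ui and handler are supplied by the application. */
final class WatchPump {
    /**
     * Drains one signalled key without blocking on the UI.
     * Returns false if the key is no longer valid (for example, directory gone).
     */
    static boolean pumpOnce(WatchKey key,
                            Executor ui,
                            Consumer<List<WatchEvent<?>>> handler) {
        // Copy the events so the batch is owned by the UI task.
        List<WatchEvent<?>> batch = new ArrayList<>(key.pollEvents());
        // Reset before the UI work runs, so the key can be signalled again.
        boolean valid = key.reset();
        if (!batch.isEmpty()) {
            ui.execute(() -> handler.accept(batch)); // e.g. SwingUtilities::invokeLater
        }
        return valid;
    }
}
```

The watcher thread would call this in a loop around WatchService.take(). Batches then queue up on the UI side instead, and that queue is unbounded, so some cap on the backlog may still be wanted.
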
> The kernel will generate a lot of events during file copying but that case 
> will often just cause the count for the last modification event to be 
> incremented. Out of curiosity, do you actually need modification 
> events? I probably don't have the full context here but if you just 
> want to maintain a list of the files in a directory then the 
> ENTRY_CREATE and ENTRY_DELETE events should be sufficient.
I do need them, as I need to know changes in file size as well.  Most of 
the problems I've been having are related to underestimating the number 
of events that could be generated.  Now that I have a clear picture of 
how to deal with overflows and where the bottleneck is I should be able 
to solve the problems I've been having.

Thanks for your help!
--John