[crac] RFR: Linux file system watcher support [v2]
Radim Vansa
duke at openjdk.org
Wed May 17 12:48:14 UTC 2023
On Wed, 17 May 2023 12:30:17 GMT, joeylee <duke at openjdk.org> wrote:
>> src/java.base/linux/classes/sun/nio/fs/LinuxWatchService.java line 367:
>>
>>> 365: // wait for close or inotify event
>>> 366: try {
>>> 367: do {
>>
>> When the `poll` is followed by a C/R we call it a second time, ignoring the old value. Will this forget any events?
>> Obviously all the events happening when the application is in snapshot will be lost, but I wonder whether we should queue and replay anything that's already recorded. In the future (not necessarily in this PR) it would be nice to detect if anything happened when the application was in snapshot and generate events to keep its view up to date.
>
> I think it's OK to ignore the previous poll result in this case:
> 1. wakeup() is called by another thread, and the polling thread receives the notification.
> 2. The process begins the checkpoint and blocks at processCheckpointRestore, forgetting the previous notification.
> 3. The process is restored and proceeds with no notification.
>
> My previous design was to automatically save and reopen the inotify and the socketpair, and let the user control the watch keys. The only place where the notification is used is during a request, and after restore all keys should be re-registered by the user, so I thought it's OK to drop the notification.
>
> `I wonder if you have an example of code where it's useful to automatically suspend the service but leave WatchKey management up to the user.`
> I thought the watch key path depends on the running environment, so I couldn't take over during restore: at that point the path might not exist or might have changed. That is why I am leaving watch key management to users.
Let's explain the design first, please; then we can think about the lost wakeup. I actually wrote the comment before I realized that watch keys are not handled.
I understand the hesitation to manage watch keys automatically. What I am missing, though, is an example of idiomatic code where the user actually manages the watch keys during C/R: you only provide a test that checks that C/R fails when a key is left open. I would like that example to demonstrate that closing the notify service automatically simplifies the code, rather than just adding a close/reopen step to the watch key management.
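To make the question concrete, here is a minimal sketch of what user-side watch key management around C/R might look like. It is an illustration under stated assumptions, not code from this PR: `CheckpointHooks` is a hypothetical, simplified stand-in for a CRaC resource callback pair (not the real `jdk.crac` API), and `UserManagedWatcher` is a made-up helper. Only the `java.nio.file` calls are real. Note that in this form the user closes and reopens the whole service by hand, which is the "just adding a close/reopen" pattern mentioned above.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a checkpoint/restore callback pair; NOT the real jdk.crac API.
interface CheckpointHooks {
    void beforeCheckpoint() throws IOException;
    void afterRestore() throws IOException;
}

// Hypothetical user-side helper that owns its watch keys across a checkpoint/restore.
class UserManagedWatcher implements CheckpointHooks {
    private final List<Path> watchedDirs = new ArrayList<>();
    private final List<WatchKey> keys = new ArrayList<>();
    private WatchService service;

    UserManagedWatcher() throws IOException {
        service = FileSystems.getDefault().newWatchService();
    }

    // Remember the directory so it can be re-registered after restore.
    void watch(Path dir) throws IOException {
        watchedDirs.add(dir);
        keys.add(dir.register(service, StandardWatchEventKinds.ENTRY_CREATE));
    }

    @Override
    public void beforeCheckpoint() throws IOException {
        // Cancel every key, then close the service so nothing stays open.
        for (WatchKey key : keys) {
            key.cancel();
        }
        keys.clear();
        service.close();
    }

    @Override
    public void afterRestore() throws IOException {
        // Reopen the service and re-register only the paths that still exist,
        // since the restored environment may differ from the checkpointed one.
        service = FileSystems.getDefault().newWatchService();
        for (Path dir : watchedDirs) {
            if (Files.isDirectory(dir)) {
                keys.add(dir.register(service, StandardWatchEventKinds.ENTRY_CREATE));
            }
        }
    }

    // Number of keys that are currently valid.
    int activeKeys() {
        int n = 0;
        for (WatchKey key : keys) {
            if (key.isValid()) {
                n++;
            }
        }
        return n;
    }
}
```

If the service were suspended and resumed automatically, the open question is how much of `beforeCheckpoint` and `afterRestore` above would still be needed in user code.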
-------------
PR Review Comment: https://git.openjdk.org/crac/pull/72#discussion_r1196459627