Recent Java 9 commit (e5b66323ae45) breaks fsync on directory

Uwe Schindler uschindler at apache.org
Tue May 12 18:59:21 UTC 2015


Hello Alan,

I wanted to come back to this issue, because there has been no further communication recently about the Java 9 behavior when opening a FileChannel on a directory in order to fsync the directory metadata. Unfortunately, the change breaks the improved data safety after commits to Apache Lucene indexes. It affects many applications, such as Apache Solr and Elasticsearch, that rely on fsyncing directory metadata on UNIX systems (Linux, Solaris, MacOSX). Elasticsearch recently started to use the same approach for its transaction log, too! Because we (Apache Lucene) use atomic rename to "publish" commits, losing the directory metadata after a power failure loses all data in the commit made before the failure. With Java 7 and Java 8 we already ran extensive tests, using remote-controlled power plugs to switch a test machine off and on and then validating that the index was intact. This no longer works with Java 9 because of the change.

Our question now: the earlier discussion suggested adding another OpenOption for this kind of special use, which matters for other databases, too (I assume Apache Derby, HSQLDB, or other databases written in Java would like to do similar things). Is there anything we can do to propose a new API, such as starting a JEP or opening a bug report? I would take the opportunity to get involved in the OpenJDK project to help bring this forward.

Maybe instead of complex open options, we should simply add a new method to the Files class, e.g. Files.force(Path fileOrDir, boolean metadata) or Files.fsync(Path fileOrDir, boolean metadata), that does the right thing depending on the file type and operating system?

The Java 7 / Java 8 approach we use at the moment is already a bit of an undocumented hack (guarded by a try/catch), because some systems like Windows do not allow fsync on directories (Windows already ensures that the metadata is written correctly after an atomic rename). On the other hand, MacOSX appears to ignore fsync requests completely, even on files, unless you use a special fcntl. So an API that works around these operating system differences would be very good.
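
For readers who have not seen the hack, here is a minimal sketch of the try/catch-guarded approach described above. It is not the actual Lucene code: the class name DirSync, the boolean return value, and the method shape (borrowed from the Files.fsync(Path, boolean) idea) are assumptions made for illustration. It only uses FileChannel.open() and FileChannel.force(), and it relies on the unspecified behavior that a directory can be opened with READ on some platforms.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, named for illustration only; not a JDK or Lucene API.
final class DirSync {
    private DirSync() {}

    /**
     * Tries to fsync a regular file or a directory.
     *
     * @return true if force() succeeded; false if the path is a directory
     *         and the platform refused to open or sync it (e.g. Windows)
     */
    static boolean fsync(Path path, boolean metadata) throws IOException {
        boolean isDir = Files.isDirectory(path);
        // Directories can only be opened with READ; regular files need WRITE
        // so that force() is guaranteed to flush pending changes.
        StandardOpenOption mode = isDir ? StandardOpenOption.READ
                                        : StandardOpenOption.WRITE;
        try (FileChannel channel = FileChannel.open(path, mode)) {
            channel.force(metadata);
            return true;
        } catch (IOException e) {
            if (isDir) {
                // Assumption: a failure on a directory means the platform
                // does not support this, so it is swallowed.
                return false;
            }
            throw e; // failures on regular files are real errors
        }
    }
}
```

After an atomic rename that publishes a commit, a caller would invoke DirSync.fsync(indexDir, true) on the parent directory. Swallowing the exception for directories hides real I/O errors as well as "unsupported" cases, which is one reason a properly specified API would be preferable.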

Uwe

-----
Uwe Schindler
uschindler at apache.org 
ASF Member, Apache Lucene PMC / Committer
Bremen, Germany
http://lucene.apache.org/

> -----Original Message-----
> From: nio-dev [mailto:nio-dev-bounces at openjdk.java.net] On Behalf Of
> Uwe Schindler
> Sent: Friday, January 09, 2015 7:56 PM
> To: 'Alan Bateman'; nio-dev at openjdk.java.net
> Cc: rory.odonnell at oracle.com; 'Balchandra Vaidya'
> Subject: RE: Recent Java 9 commit (e5b66323ae45) breaks fsync on directory
> 
> Hi Alan,
> 
> Thank you for the quick response!
> 
> The requirement to fsync a directory from Java has come up quite often
> on the web (also before the Java 7 release, although before Java 7 it was
> impossible to do from Java code).
> 
> This is one example from before Java 7:
> http://www.quora.com/Is-there-any-way-to-fsync%28%29-a-directory-from-Java
> 
> Stackoverflow has some questions about this, too. A famous one (ranked #1
> at Google) is this one:
> http://stackoverflow.com/questions/7694307/using-filechannel-to-fsync-a-directory-with-nio-2
> 
> In fact this is exactly what we do in Lucene. The question here "Can I count
> on this working on all Unix platforms, in future versions of Java, and in non-
> Oracle JVMs?" is now answered -> NO.
> 
> Fsyncing a directory is in most cases not needed for "standard Java
> programs", but for those who really want to do this (like Lucene or Hadoop),
> a separate OpenOption might be a good idea! In Lucene code (which has
> Java 7 as its minimum requirement) we can look up the new OpenOption with
> reflection and pass it. Unfortunately, people using currently released
> Lucene/Solr/Elasticsearch versions can no longer be sure that their index
> survives power outages if they run it with Java 9. If we can step in early
> and test the new API, we can already release artifacts which at least "try"
> to use the new OpenOption (if available) and fall back to Java 7/Java 8
> semantics otherwise.
> 
> Personally, I would prefer to just document that opening a file channel is
> only guaranteed to work with regular files, but may fail with other types of
> files (think of directories, or /dev/xxx devices). The code as it is now
> worked fine for 2 major Java releases, so why change the semantics? If
> somebody accidentally opens a directory for reading, it is perfectly fine if
> they get an IOException a bit delayed. If one opens a block device and
> writes a non-block-aligned chunk of data, it will fail, too. Your patch does
> not handle this case, it only tests for directories. So I think we should
> leave it up to the operating system what you can do with a "file".
> 
> About Windows: In fact, you can also open a directory with CreateFile() [1],
> but with standard flags this fails with access denied (this is what we see in
> Java 7 and Java 8). You have to pass FILE_FLAG_BACKUP_SEMANTICS as
> flags, then you can do it [2]. But FlushFileBuffers does not work, because [2]
> does not list it as a "valid call" for directory file handles (because it's not
> needed for Windows, in contrast to POSIX). So FileChannel#force() would
> still fail in that case. For Linux it is important that opening directories
> only works with READ, but not with WRITE, but this is obvious.
> 
> You may also want to read Mike McCandless' blog post about testing this
> with his setup, which uses remote power switches on a test machine to
> check durability. With the current Java 7 code in Lucene he got no failures:
> http://blog.mikemccandless.com/2014/04/testing-lucenes-index-durability-after.html
> 
> Uwe
> 
> [1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
> [2] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365258(v=vs.85).aspx
> 
> > > We really would like to keep the possibility to fsync a directory on
> > > supported operating systems. We hope that the above commit will not be
> > > backported into the 8u40 and 7u80 releases! In Java 9 we can discuss
> > > other solutions for how to handle this:
> > > - Keep the current semantics as of Java 7 and Java 8 and just fail if
> > >   you really want to READ/WRITE from this FileChannel? This is how the
> > >   underlying operating system and libc handle this. You can open a file
> > >   descriptor on anything, file/directory/device/..., but not all
> > >   operations work on this descriptor; some of them throw an exception
> > >   or return an error.
> > > - Add a new API for fsyncing a directory (maybe for any file type),
> > >   like Files.fsync(Path)? On Windows this could just be a no-op for
> > >   directories? Basically something like our IOUtils.fsync() from the
> > >   link above.
> > >
> > > What's your opinion and how should we proceed?
> > >
> > This use-case may need a new API, one possibility is a new OpenOption
> > (like NOFOLLOW_LINKS) for opening directories. This would allow opening
> > a FileChannel to a directory and also provides somewhere to specify that
> > many of the operations may fail. Implementation-wise it also means you
> > should be able to open directories on Windows.
> >
> > Sorry the original change broke what you are doing but I'm sure you
> > understand that the unspecified behavior to allow directories be
> > opened on some platforms and have subsequent attempts to do common
> > operations (like read) fail wasn't ideal either. There are no plans to
> > back-port this change.
> >
> > -Alan


