8202116: (fc) FileChannel.map should ensure mapped region is backed by disk space
Colin McCabe
cmccabe at apache.org
Tue Apr 24 18:00:59 UTC 2018
Hi all,
Thanks for working on this. I agree that the earlier change which made RandomAccessFile#setLength do preallocation was not ideal.
To give a little bit of context, I am a developer working on Apache Kafka. We make heavy use of the JVM, as well as sparse files.
It seems like the idea behind this change is to move the preallocation to FileChannel#map, rather than having it in RandomAccessFile#setLength. That is definitely an improvement.
However, I would argue that the preallocation should not be done in the first place. My perspective is that there are always going to be errors that are possible when accessing a memory mapped file. For example, if there is an I/O error on the underlying file, the memory mapped region will not be accessible. In this case, POSIX specifies that the process should receive a SIGBUS when attempting to access the memory mapped region.
Fundamentally, running out of space is only one disk error among many possible disk errors. It's not clear why it should be treated differently than the rest. The only real difference I can see is that it's easy to reproduce ENOSPC in testing, and hard to reproduce EIO. Disk errors simply have to be handled when they occur.
The cost of preallocation can be quite high. In the case of Kafka, this would probably make us essentially give up on mmap because it would no longer meet our needs.
Finally, I would argue that the FileChannel#map operation itself should not mutate the file. In general, this operation is supposed to create a view of the file, not change the file.
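For illustration, here is a minimal sketch of the sparse-file mapping pattern under discussion. The class name SparseMapSketch and the helper mapSparse are hypothetical and exist only for this example. It assumes a filesystem that supports sparse files, and it only observes the file length; it says nothing about when disk blocks are allocated.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class SparseMapSketch {
    // Hypothetical helper: extends a fresh temp file to `length` bytes with
    // setLength (typically leaving a hole on filesystems that support sparse
    // files), maps the whole file, touches the last byte, and returns the
    // file length observed after mapping.
    static long mapSparse(long length) throws IOException {
        Path path = Files.createTempFile("sparse-map", ".bin");
        path.toFile().deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {
            raf.setLength(length);
            // The mapping covers exactly the existing length, so map() has no
            // reason to grow the file.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, length);
            // A store into the mapping is where ENOSPC or EIO would surface,
            // as a signal rather than as a Java exception.
            buf.put((int) (length - 1), (byte) 42);
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(mapSparse(1L << 20));
    }
}
```

The length reported after mapping equals the length passed to setLength. Whether the implementation also reserves disk blocks at map time is the behavior being debated in this thread, and checking it would need platform-specific tools that this sketch does not attempt.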
In general, it seems like what we really need is a way of translating I/O errors on an mmapped file to exceptions. Most Java developers are not going to write signal handlers for SIGBUS. But in order to have robust software when using mmap, they may want to handle I/O errors. It would be really good if Java would make this possible. Preallocation doesn't fix this problem, and it adds other problems which make mmap itself unappealing.
regards,
Colin
On Tue, Apr 24, 2018, at 08:49, Thomas Stüfe wrote:
> On Tue, Apr 24, 2018 at 5:47 PM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> > On 24/04/2018 11:56, Thomas Stüfe wrote:
> >>
> >> :
> >> Okay. I was concerned about situations where we may run with a version
> >> of glibc - or another libc - which uses the numerical value 4 for some
> >> valid but different thing. E.g. as sketched in this patch proposal:
> >>
> >> https://lwn.net/Articles/439719/
> >>
> >> I did not do any further research though, so if you think there is no
> >> risk, I am removing my objection.
> >>
> > That article seems to pre-date the eventual addition (in Linux 3.1) where
> > they chose a value of 4.
> >
> > In any case, I've looked at several options and think what we have is the
> > best of a bad bunch. We can always re-visit.
> >
>
> Sure. I now think the risk is quite low.
>
> Best Regards, Thomas
>
> > -Alan