stop using mmap for zip I/O
John Rose
john.r.rose at oracle.com
Tue Mar 3 04:32:33 UTC 2015
On Mar 2, 2015, at 3:12 PM, Martin Buchholz <martinrb at google.com> wrote:
>
> Our experience at Google is also that many people run into crashes when
> overwriting zip files. Our standard answer is "don't do that, then!"
> Instead of the "evil" overwriting cp program, make a temp file, then move
> it into place. More generally, follow a policy of "immutability" for file
> contents.
I wonder if it's not too late to wedge this policy into Java's basic I/O behaviors.
Most Java code uses FileOutputStream to write a file. We could change its
behavior to delete an existing output file and create a new one, instead of
truncating it in place; at least on POSIX systems, a process that still has
the old file mapped would then keep seeing the old contents. This could be
fine-tuned by various knobs (properties, callbacks, etc.). Then if the
offending code uses Java to write a file, it would no longer tickle this
class of bugs.
Maybe it's just wishful thinking, but I think it's worth asking what the
exact compatibility cost of making such a change would be.
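
To make the replace-instead-of-truncate idea concrete, here is a minimal
sketch of what user code can already do today with java.nio.file. The class
and method names (ReplacingWriter, writeByReplacing) are hypothetical, not a
JDK API, and this is not a proposal for how FileOutputStream itself would be
changed. It assumes the temp file can be created in the same directory as the
target, so that the final move is a rename on the same file system.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the JDK.
final class ReplacingWriter {
    // Writes the new contents to a temp file next to the target, then moves
    // the temp file over the target. The old file is never truncated in
    // place; it is replaced by a different file with the same name.
    static void writeByReplacing(Path target, byte[] contents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write(contents);
            }
            // REPLACE_EXISTING lets the move succeed when the target exists.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if something failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

A caller would use ReplacingWriter.writeByReplacing(path, bytes) where it
would otherwise open a FileOutputStream on path. Whether the replacement is
atomic depends on the platform; StandardCopyOption.ATOMIC_MOVE can be
requested instead, but its behavior when the target already exists is
implementation specific.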
Alternatively, and less invasively, the JVM hook which maps a file
(don't recall what it is; something in NIO) could make a JVM-private
copy of the mmap-ed file. Sounds silly, but maybe there are practical
ways to limit the costs, yet still get the benefit on zip files.
Maybe the JVM can copy the file to a private directory, once per distinct
modification timestamp of the file. Or maybe not.
The worst case is when (a) the mmap-ed files are large and very seldom change,
and (b) non-JVM tools (like /bin/cp) perform the truncate-and-write.
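
The private-copy idea can be sketched at the library level as follows. This
is only an illustration under stated assumptions, not the JVM hook itself:
the names PrivateMapping and mapPrivateCopy are hypothetical, the cache
directory is supplied by the caller, and the copy is keyed only by file name
and last-modified time (so two files with the same name in different
directories would collide, and a rewrite that keeps the same timestamp would
go unnoticed).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the JDK.
final class PrivateMapping {
    // Copies `source` into `cacheDir` once per last-modified timestamp and
    // maps the private copy read-only, so later truncation of `source` by
    // another tool does not affect the mapping.
    static MappedByteBuffer mapPrivateCopy(Path source, Path cacheDir) throws IOException {
        long stamp = Files.getLastModifiedTime(source).toMillis();
        Path copy = cacheDir.resolve(source.getFileName() + "." + stamp);
        if (!Files.exists(copy)) {
            // Copy to a temp name first so a half-written copy is never mapped.
            Path tmp = Files.createTempFile(cacheDir, "copy", ".tmp");
            Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, copy, StandardCopyOption.REPLACE_EXISTING);
        }
        try (FileChannel ch = FileChannel.open(copy, StandardOpenOption.READ)) {
            // A mapping stays valid after its channel is closed. A single
            // mapping cannot be larger than Integer.MAX_VALUE bytes.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }
}
```

The cost is one full copy per file version, which is exactly what hurts in
the worst case described above: large files that almost never change.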
— John
More information about the core-libs-dev mailing list