[RFC] 4950302: (fs spec) Random write produces different results on Linux vs Windows from same .class
David Lloyd
david.lloyd at redhat.com
Wed Nov 20 00:52:10 UTC 2019
On Tue, Nov 19, 2019 at 6:07 PM Brian Burkhalter
<brian.burkhalter at oracle.com> wrote:
>
> This issue [1] is caused by a Linux bug documented in the pwrite(2) man page [2]:
>
> POSIX requires that opening a file with the O_APPEND flag should have
> no effect on the location at which pwrite() writes data. However, on
> Linux, if a file is opened with O_APPEND, pwrite() appends data to the
> end of the file, regardless of the value of offset.
>
> One possible fix is [3] where if O_APPEND is set, it is unset to make the pwrite() call and then reset. This of course could be problematic if another thread were writing to the same file descriptor simultaneously: not all uses of IOUtil.write() use exclusion locks. But maybe this is better than the current situation? It was verified to fix the problem shown by the reproducer included in the issue description.
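For reference, the toggle approach described above could be sketched roughly as follows. This is only a minimal illustration of the idea, not the actual patch in [3]; the helper name pwrite_ignoring_append is hypothetical, and the window between the two fcntl() calls is exactly where a concurrent writer on the same descriptor could misbehave.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: positional write that temporarily clears O_APPEND
 * so that Linux honors the offset. Costs two extra fcntl() calls and is
 * NOT safe against other threads writing to the same descriptor.
 */
static ssize_t pwrite_ignoring_append(int fd, const void *buf, size_t len,
                                      off_t offset)
{
    assert(buf != NULL);

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    /* No O_APPEND: plain pwrite() already behaves as POSIX requires. */
    if (!(flags & O_APPEND))
        return pwrite(fd, buf, len, offset);

    /* Clear O_APPEND, write at the requested offset, then restore it. */
    if (fcntl(fd, F_SETFL, flags & ~O_APPEND) < 0)
        return -1;
    ssize_t n = pwrite(fd, buf, len, offset);
    int saved = errno;
    fcntl(fd, F_SETFL, flags);
    errno = saved; /* report the pwrite() error, not the restore's */
    return n;
}
```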
Wow, that's super unfortunate.
What if, on Linux (newer kernels only, unfortunately), rather than
using O_APPEND at open, you use pwritev2() and pass RWF_APPEND when
writing if the channel was opened for append? This would eliminate
the extra system calls. If you pass -1 as the offset to that call
with that flag set, then the file's offset is updated.
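A rough sketch of what I mean, assuming a glibc new enough to expose the pwritev2() wrapper (2.26 or later, I believe). The helper names append_write and positional_write are made up for illustration, and the fallback #define of RWF_APPEND is there only because older headers may not have it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_APPEND
#define RWF_APPEND 0x00000010 /* value from linux/fs.h; assumed if the header lacks it */
#endif

/*
 * Hypothetical helper: append to a descriptor that was opened WITHOUT
 * O_APPEND. Offset -1 together with RWF_APPEND appends at end of file
 * and updates the file offset, all in one system call.
 */
static ssize_t append_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    return pwritev2(fd, &iov, 1, -1, RWF_APPEND);
}

/*
 * Hypothetical helper: positional write. Because the descriptor does not
 * carry O_APPEND, Linux honors the offset here.
 */
static ssize_t positional_write(int fd, const void *buf, size_t len,
                                off_t offset)
{
    assert(buf != NULL);
    return pwrite(fd, buf, len, offset);
}
```

The point is that the append-ness moves from the descriptor to the individual write, so positional writes on the same descriptor need no flag juggling.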
You'd have to add a test (to class init?) to try the call; if it
returns ENOSYS (likely for kernels before 4.6) or EINVAL (kernels
before 4.16), then a flag could be set and the dispatcher could fall
back to the less-optimal behavior (similar to how sendfile support
is detected). Alternatively, the uname(2) function could be used to
detect that the kernel version is at least 4.16, but I guess that
might be the only place where it would be done that way.
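Both detection options might look something like the sketch below. The function names probe_rwf_append and kernel_at_least are hypothetical, the scratch file under /tmp is my assumption, and I also treat EOPNOTSUPP as "unsupported" since the pwritev2(2) man page lists it for unknown flags. The probe writes one real byte because a zero-length write may return 0 before the flags are ever checked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/utsname.h>
#include <unistd.h>

#ifndef RWF_APPEND
#define RWF_APPEND 0x00000010 /* value from linux/fs.h; assumed if the header lacks it */
#endif

/*
 * Hypothetical one-time probe (e.g. run at class init).
 * Returns 1 if pwritev2() accepts RWF_APPEND, 0 if the kernel rejects it,
 * and -1 if the probe itself could not run.
 */
static int probe_rwf_append(void)
{
    char path[] = "/tmp/rwf_probe_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the scratch file disappears once fd is closed */

    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    ssize_t n = pwritev2(fd, &iov, 1, -1, RWF_APPEND);
    int err = errno;
    close(fd);

    if (n == 1)
        return 1;
    if (n < 0 && (err == ENOSYS || err == EINVAL || err == EOPNOTSUPP))
        return 0;
    return -1;
}

/*
 * Alternative: compare the running kernel version from uname(2).
 * Returns 1 or 0, or -1 if the release string cannot be parsed.
 */
static int kernel_at_least(int want_major, int want_minor)
{
    assert(want_major >= 0 && want_minor >= 0);
    struct utsname u;
    int major = 0, minor = 0;
    if (uname(&u) != 0 || sscanf(u.release, "%d.%d", &major, &minor) != 2)
        return -1;
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

The result of probe_rwf_append() (or kernel_at_least(4, 16)) would be cached in a flag, and the dispatcher would pick the O_APPEND path whenever it is not 1.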
At least then there would be a chance for optimal behavior.
--
- DML