more about - System.arraycopy(...) equivalents for ByteBuffer/FileChannel
Jeff Hain
jeffhain at rocketmail.com
Thu Nov 1 18:29:53 PDT 2012
Hello.
Resurrecting this subject, as I'm now mostly done with
my implementation (*) and have more (and less approximate)
remarks.
(*)
code.google.com/p/jodk/source/browse/src/net/jodk/io/ByteCopyUtils.java
(it contains some no-longer-used code (mapped ByteBuffers),
which I left in for experimentation purposes)
1) I said in a previous mail:
>With the current API, you can do whatever you want,
>with only native copies (and just the required amount),
>both with ByteBuffers and FileChannels contents.
That's modulo the fact that MappedByteBuffers can't be
explicitly and portably unmapped (Bug ID: 4724038), so
you can't always use them.
Maybe having a FileChannel.unmap(MappedByteBuffer)
method wouldn't hurt more than its absence does, given
either the performance hit of not using MBBs, or the
non-portability and hackiness of the workarounds people
have to devise.
===>
As a result, even when using an MBB is appropriate, i.e.
when src is a FileChannel, dst is a heap ByteBuffer or
another FileChannel, and the number of bytes to copy is above
some threshold, portable code must still have an alternative,
typically involving a temporary direct ByteBuffer: first
FileChannel.read(tmpBB), then either a native copy from
tmpBB to dstBB, or FileChannel.write(tmpBB).
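
A minimal sketch of that fallback for the FC to heap BB case. The
class and method names (FcToBbCopy, copy) are hypothetical and are
not taken from ByteCopyUtils; the caller is assumed to supply a
non-empty temporary direct ByteBuffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only.
final class FcToBbCopy {
    /**
     * Copies up to dst.remaining() bytes from src, starting at srcPos,
     * into dst, going through the caller-supplied temporary buffer.
     * Uses positional reads, so the channel's own position is untouched.
     * Returns the number of bytes actually copied.
     */
    static int copy(FileChannel src, long srcPos, ByteBuffer dst, ByteBuffer tmp)
            throws IOException {
        int copied = 0;
        while (dst.hasRemaining()) {
            tmp.clear();
            tmp.limit(Math.min(tmp.capacity(), dst.remaining()));
            int n = src.read(tmp, srcPos + copied);
            if (n <= 0) {
                break; // end of file reached, or tmp has no room
            }
            tmp.flip();
            dst.put(tmp); // bulk copy from tmp into dst
            copied += n;
        }
        return copied;
    }
}
```

Reusing one temporary buffer across calls keeps allocation out of
the copy loop; its capacity bounds the size of each read.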
That's also modulo the fact that, for BB to FC copies,
if the FileChannel is not readable, you can't use an MBB
(no WRITE_ONLY mapping mode). A FileChannel.isReadable()
method would allow using MBBs for readable (and writable)
channels, without risk of an exception being thrown.
===>
As a result, when src is not direct, an intermediary
temporary direct ByteBuffer is required.
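
Lacking FileChannel.isReadable(), one possible probe is sketched
below. It relies on the documented behavior that
FileChannel.read(ByteBuffer,long) throws NonReadableChannelException
when the channel was not opened for reading. The helper name
(ChannelProbe.isReadable) is hypothetical, and the assumption that a
read into an empty buffer is cheap has not been measured.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonReadableChannelException;

// Hypothetical helper, for illustration only.
final class ChannelProbe {
    /** Returns false if a positional read is rejected as non-readable. */
    static boolean isReadable(FileChannel fc) throws IOException {
        try {
            // Empty destination: no bytes can actually be transferred.
            fc.read(ByteBuffer.allocate(0), 0L);
            return true;
        } catch (NonReadableChannelException e) {
            return false;
        }
    }
}
```

This is still an exception-based workaround, so it only hides the
lack of a proper query method rather than replacing it.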
That's also modulo the fact that when you copy between
a MappedByteBuffer and a FileChannel, src and dst might
share memory, but there is no API to figure out if and how,
thus you don't know whether to copy forward or backward
(to avoid overwriting bytes still to be copied with bytes
already copied).
===>
As a result, for BB to FC and FC to BB copies, in case
of overlapping, the result is undefined.
That's also modulo the fact that, depending on OS or
architecture, FileChannel.write methods are not
overlap-safe: they loop over the bytes either forward or
backward, overwriting bytes still to be copied with bytes
already copied.
On a T2300/WinXP, it only occurred for copies below
a certain threshold (around 256+ bytes), and was always
a forward copy.
On a 980X/Win7, it was messier, and I couldn't figure
out a simple logic.
===>
As a result, for FC to FC copies, if you want to
support overlapping cases, you can't use MBBs, and
an intermediary temporary direct ByteBuffer is
required.
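
A sketch of such an overlap-tolerant copy, under the assumption that
src and dst are the SAME FileChannel instance (so positions are
comparable) and that nobody else writes to the file meanwhile. The
names (FcOverlapCopy, copyWithin) are hypothetical. Each chunk is
read completely into the temporary buffer before being written, and
chunks are processed from the end when srcPos < dstPos.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only.
final class FcOverlapCopy {
    static void copyWithin(FileChannel fc, long srcPos, long dstPos,
                           long count, ByteBuffer tmp) throws IOException {
        if (tmp.capacity() == 0) {
            throw new IllegalArgumentException("empty temporary buffer");
        }
        // Last chunk first when the destination is ahead of the source.
        boolean backward = srcPos < dstPos;
        long done = 0;
        while (done < count) {
            int chunk = (int) Math.min(tmp.capacity(), count - done);
            long offset = backward ? count - done - chunk : done;
            tmp.clear();
            tmp.limit(chunk);
            while (tmp.hasRemaining()) { // fill the whole chunk
                int n = fc.read(tmp, srcPos + offset + tmp.position());
                if (n < 0) {
                    throw new EOFException("source range past end of file");
                }
            }
            tmp.flip();
            while (tmp.hasRemaining()) { // drain the whole chunk
                fc.write(tmp, dstPos + offset + tmp.position());
            }
            done += chunk;
        }
    }
}
```

Two distinct channels opened on the same file cannot be detected
this way, which is the missing-API problem described above.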
2) Performance remarks.
I found FileChannel.write(ByteBuffer,long) to be very slow
on WinXP/7, by a factor of 15, when the ByteBuffer is "large"
(like 500 KB), even if it is direct (mapped or not), in which
case it resolves to sun.nio.ch.IOUtil.writeFromNativeBuffer
(which I supposed to always be fast).
On Linux there was no slowdown.
===>
This problem also hurts FileChannel.transferTo/transferFrom,
which I stopped using.
Writing the direct BB in chunks of 32 KiB makes up for that.
If the ByteBuffer is not direct, FileChannel.write methods
end up using a temporary direct ByteBuffer that is JUST AS BIG
(see sun.nio.ch.IOUtil), and might also be slow due to that
direct ByteBuffer being large (previous problem).
===>
A work-around is to copy by chunks, using your own temporary
direct ByteBuffer to avoid creation of multiple ones
(especially if you have some local temporary instances
available already).
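
A sketch of that work-around for a BB to FC copy. The names
(ChunkedWrite, write, CHUNK_SIZE) are hypothetical; the 32 KiB
figure is simply the chunk size mentioned above, not a tuned value,
and the effective chunk size is the capacity of the temporary
buffer the caller passes in.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only.
final class ChunkedWrite {
    static final int CHUNK_SIZE = 32 * 1024;

    /**
     * Writes all remaining bytes of src to dst starting at dstPos,
     * one chunk at a time through the caller-supplied direct buffer.
     * Returns the number of bytes written.
     */
    static int write(ByteBuffer src, FileChannel dst, long dstPos, ByteBuffer tmp)
            throws IOException {
        if (tmp.capacity() == 0) {
            throw new IllegalArgumentException("empty temporary buffer");
        }
        int written = 0;
        while (src.hasRemaining()) {
            int chunk = Math.min(tmp.capacity(), src.remaining());
            // View of src limited to one chunk, because put(ByteBuffer)
            // transfers everything that remains in its argument.
            ByteBuffer view = src.duplicate();
            view.limit(src.position() + chunk);
            tmp.clear();
            tmp.put(view);
            tmp.flip();
            src.position(src.position() + chunk);
            while (tmp.hasRemaining()) {
                int n = dst.write(tmp, dstPos + written);
                written += n;
            }
        }
        return written;
    }
}
```

A typical call would pass ByteBuffer.allocateDirect(
ChunkedWrite.CHUNK_SIZE) once and reuse it for every copy.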
Intensive benchmarks involving MBBs (namely FC to heap BB
copies) hung from time to time, for up to nearly a second,
and then resumed at the usual speed (slightly faster than the
temporary direct ByteBuffer approaches).
===>
As a result, I completely disabled MBB usage for my copies.
3) Non-performance remarks:
When copying between direct ByteBuffers, the efficient way
is to use ByteBuffer.put(ByteBuffer), but its spec says that
it's roughly equivalent to
"while (src.hasRemaining()) dst.put(src.get());",
i.e. that it is not suited if memory is shared (which no API
allows to figure out) and srcPos < dstPos (as raw memory
positions), since then it could overwrite bytes still to be
copied with bytes already copied.
Fortunately, in that case the implementation doesn't follow
the spec, and instead uses Unsafe.copyMemory(long,long,long),
which seems to handle overlapping.
It is also the case for the heap-to-heap case, where the
implementation uses System.arraycopy(...) (but I don't use
this put method for heap-to-heap cases).
===>
I hope that Unsafe.copyMemory(long,long,long) actually
handles overlapping, as my tests show, and that the put
method spec will be relaxed (its excessive precision just
looks like an unfortunate way to explain what it does),
and possibly aligned with its implementation for the
direct-to-direct and heap-to-heap cases, as I'm relying on
it to handle overlapping for BB to BB copies.
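
A small sketch of the overlapping case in question: two views of
the same buffer, with srcPos < dstPos, copied with put(ByteBuffer).
The names (OverlapPut, moveWithin) are hypothetical. Whether the
result is a memmove-style copy or the smeared result of the
byte-by-byte loop depends on the implementation behavior discussed
above, so the sketch deliberately states no expected output.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class OverlapPut {
    /**
     * Copies count bytes inside buf from srcPos to dstPos, using
     * put(ByteBuffer) on two duplicates that share buf's content.
     * buf's own position and limit are left unchanged.
     */
    static void moveWithin(ByteBuffer buf, int srcPos, int dstPos, int count) {
        ByteBuffer src = buf.duplicate();
        src.clear();
        src.limit(srcPos + count);
        src.position(srcPos);

        ByteBuffer dst = buf.duplicate();
        dst.clear();
        dst.limit(dstPos + count);
        dst.position(dstPos);

        dst.put(src); // overlapping bulk put between the two views
    }
}
```

For example, filling a 16-byte direct buffer with 0..15 and calling
moveWithin(buf, 0, 4, 8) makes the difference visible in bytes
4..11.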
FileChannelImpl.truncate(long):
The bug we already talked about (the early return if
size > size(), which masks the writability check, and the
fact that size >= size() could be tested instead)
prevents adjusting the position whether truncation occurs or
not, as the spec says (even if size > size(), position should
be set to size if it is greater, but it isn't - or is it a
spec bug?).
FileChannelImpl.map(...):
If mode is null, and assertions enabled, "assert (imode >= 0)"
fails (I have somewhere in my head the idea that in JDK assertions
should only check private code - well in a sense here it does :).
If mode is null, and channel non-writable, and assertions disabled,
NonWritableChannelException is thrown, even though the spec says
it can only be thrown if mode is READ_WRITE or PRIVATE.
The spec says that NonReadableChannelException is thrown if the
channel is not readable AND mode is READ_ONLY. The part about the
mode could be removed, to align the spec with the implementation,
for which a non-readable channel is enough for this exception
to be thrown.
FileChannelImpl.transferFrom(...):
This method can grow destination channel, but if the specified
position (in dst) is > dst.size(), it just returns 0. It looks
like a bug, as the spec says nothing about this surprising behavior.
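
A small probe for that behavior, as a sketch. The names
(TransferFromProbe, transferPastEnd) and the gap parameter are
hypothetical, and the destination file is assumed to exist already.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
final class TransferFromProbe {
    /**
     * Asks dst to pull the whole of srcFile, starting "gap" bytes
     * past dst's current end, and returns what transferFrom reports.
     */
    static long transferPastEnd(Path srcFile, Path dstFile, long gap)
            throws IOException {
        try (FileChannel src = FileChannel.open(srcFile, StandardOpenOption.READ);
             FileChannel dst = FileChannel.open(dstFile, StandardOpenOption.WRITE)) {
            return dst.transferFrom(src, dst.size() + gap, src.size());
        }
    }
}
```

Calling it with gap > 0 and then checking the size of dstFile shows
whether anything was transferred.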
FileChannel.write(ByteBuffer,long):
The Javadoc says that position is not updated, but if the channel
is in append mode it might be (since then we have position = size,
and this method can grow the file).
-Jeff