RFR: 8305744: (ch) InputStream returned by Channels.newInputStream should have fast path for ReadbleByteChannel/WritableByteChannel [v3]

Fri Jun 9 23:03:19 UTC 2023

On Sun, 9 Apr 2023 13:43:36 GMT, Markus KARG <duke at openjdk.org> wrote:

>> This sub-issue defines the work to be done to implement JDK-8265891 solely for the particular case of ReadableByteChannel-to-WritableByteChannel transfers, where neither Channel is a FileChannel, but including special treatment of SelectableByteChannel.
>> 
>> *Without* this optimization, the JRE applies a loop where a temporary 16k byte array on the Java heap is used to transport the data between both channels. This implies that data is first transported from the source channel into the Java heap and then from the Java heap to the target channel. For most Channels, as these are NIO citizen, this implies two transfers between Java heap and OS resources, which is rather time consuming. As the data is in no way modified by transferTo(), the temporary allocation of valuable heap memory is just as unnecessary as both transfers to and from the heap. It makes sense to prevent the use of Java heap memory.
>> 
>> This optimization omits the byte array on the Java heap, effectively omitting the transfers to and from that array in turn. Instead, the transfer happens directly from Channel to Channel reusing a ByteBuffer taken from the JRE's internal buffer management utility which also has a size of 16k. Both measures make the transfer more efficient, as just the very essential resources are used, but not more.
>> 
>> Using JMH on an in-memory data transfer using custom Channels (to prevent interception by other optimizations already existing in transferTo()) it was possible to measure a 2,5% performance gain. While this does not sound much, just as the spared 16k of Java heap doesn't, we need to consider that this also means sparing GC pressure, power consumption and CO2 footprint, and we need to consider that it could be higher when using different (possibly third-party) kinds of custom Channels (as "real I/O" between heap and OS-resources is way slower than the RAM-to-RAM benchmark applied by me). As the change does not induce new, extraordinary risks compared to the existing source code, I think it is worth gaining that 2,5+ percent. Possibly there might exist some kind of scalable Java-implemented cloud service that is bound to exactly this loop and that transports very heavy traffic, those particular 2,5+ percent could be a huge amount of kWh or CO2 in absolute numbers. Even if not now, t
 here might be in future.
>
> Markus KARG has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Putting the special casing for an output stream backed by a ChannelOutputStream in one place.

Beyond the merits of the PR itself, beware the [sunk cost falacy](https://developerexperience.io/articles/sunk-cost).
And consider the future maintenance cost of the additional code, its complexity, and interactions.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13387#issuecomment-1585113257