RFR: JDK-8337974 - ChannelInputStream::skip can use splice (linux) [v2]

Markus KARG duke at openjdk.org
Sun Oct 27 15:22:15 UTC 2024


On Sun, 27 Oct 2024 14:25:07 GMT, Markus KARG <duke at openjdk.org> wrote:

>> # Targets
>> 
>> The major target of this issue is to reduce execution time of `ChannelInputStream::skip(long)`. In particular, make `skip(n)` run noticable faster than `read(new byte[n])` on pipes and sockets in the optimal case, but don't run noticable slower in the worst case.
>> 
>> A side target of this issue is to provide unit testing for `ChannelInputStream::skip(long)`. In particular, provide unit testing for files, pipes and sockets.
>> 
>> An appreciated benefit of this issue is reduced resource consumption (in the sense of CPU cycles, Java Heap, power consumption, CO2 footprint, etc.) of `ChannelInputStream::skip(long)`. Albeit, as it is not a target, this was not acitvely monitored.
>> 
>> 
>> # Non-Targets
>> 
>> It is not a target to improve any other methods of the mentioned or any other class. Such changes should go in separate issues.
>> 
>> It is not a target to add any new *public* APIs. The public API shall be kept unchanged. All changes implied by the current improvement shall be purely *internal* to OpenJDK.
>> 
>> It is not a target to improve other source types besides pipes and sockets.
>> 
>> 
>> # Description
>> 
>> What users expect from `InputStream::skip`, besides comfort, is "at least some" measurable benefit over `InputStream::read`. Otherwise skipping instead of reading makes no sense to users.
>> 
>> For files, OpenJDK already applies an effective `seek`-based optimization. For pipes and sockets, benefits were neglectible so far, as `skip` more or less was simply an alias for `read`.
>> 
>> Hence, this issue proposes optimized implementations for `ChannelInputStream::skip` in the pipes and sockets cases.
>> 
>> 
>> # Implementation
>> 
>> The abstract idea behind this issue is to prevent transmission of skipped data into the JVM's on-heap memory in the pipes and socket cases. As a Java application obviously is not interested in skipped data, copying it into the Java heap is a waste of both, time and heap, and induces (albeit neglectible) GC stress without any benefit.
>> 
>> Hence, this pull request changes the implementation of `ChannelInputStream::skip` in the following aspects:
>> 1. On *all* operating systems, for pipe and socket channels, `skip` is implemented in C. While the data *still is* transferred form the source into the OS kernel and from the OS kernel into the JVM's off-heap memory, it is *not* transferred into the JVM's on-heap memory.
>> 2. For *Linux* pipes only, `splice` is used with `/dev/null` as the target. Data is neither transferr...
>
> Markus KARG has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Socket.getInputStream()'s JavaDocs explain behavior of skip for timeout, non-blocking, interrupt

>...this method doesn't specify how show skip behaves when a timeout is set, doesn't specify IllegalBlockingModeException when the channel is non-blocking, and doesn't specify how it behaves when the Thread is interrupted.
> In addition to skip, the other methods that may be implemented with more than one read are the transferTo, readAllBytes, readNBytes, ... all methods added since JDK 1.0. It may be that Socket's class level description or the setSoTimeout method will need wording to say that it is implementation specific as to how the configured timeout works with these methods. It may apply to the entire operation, or each read operation when implemented with multiple reads, tricky to word.
>
> The issue of skip or any of the other methods throwing IllegalBlockingModeException can probably be confined to the Socket.getInputStream's API docs where it already documents this exception as possible when reading.

Alan, sorry for being silent for weeks. I was bound in other projects.

To keep things simple, I now have added wording just for `skip`, ignoring the fact that several of other methods are missing specs, too. Looking at the class-level docs, I did not find anything to change. `skip`'s behavior is now found in the very places where `read`'s is found, which makes sense, as its implementation (besides the C-loops and using `splice` instead of `read`) is mostly a copy of `read`'s.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20489#issuecomment-2440064191


More information about the nio-dev mailing list