RFR: 8329829: HttpClient: Add a BodyPublishers.ofFileChannel method [v3]
Volkan Yazici
vyazici at openjdk.org
Fri Jul 18 09:44:37 UTC 2025
On Thu, 17 Jul 2025 13:18:10 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:
>> Volkan Yazici has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Improve docs on `IndexOutOfBoundsException` thrown
>>
>> Co-authored-by: Daniel Fuchs <67001856+dfuch at users.noreply.github.com>
>
> src/java.net.http/share/classes/java/net/http/HttpRequest.java line 746:
>
>> 744: */
>> 745: public static BodyPublisher ofFileChannel(FileChannel channel, long offset, long length) {
>> 746: Objects.requireNonNull(channel, "channel");
>
> The method's javadoc should have a `@throws NullPointerException if channel is null`. Unless that's covered by a wider statement about null parameters some place else? I couldn't find it in this class.
Implemented in 8f66a32ec8b.
> src/java.net.http/share/classes/java/net/http/HttpRequest.java line 747:
>
>> 745: public static BodyPublisher ofFileChannel(FileChannel channel, long offset, long length) {
>> 746: Objects.requireNonNull(channel, "channel");
>> 747: return new RequestPublishers.FileChannelPublisher(channel, offset, length);
>
> Hello Volkan, I see that the constructor of `FileChannelPublisher` attempts to determine the file size of the channel and if an `IOException` is thrown then it gets propagated as `UncheckedIOException`. I think an alternate approach and perhaps a better one would be to specify in this new method's documentation that the `FileChannel`'s size will be first determined by this method to verify that the given `offset` and the `length` are within bounds of that size. That then allows us to add a throws clause to this method which we merely propagate from the call of `FileChannel.size()`.
Changed as suggested in 8f66a32ec8b.
> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java line 433:
>
>> 431: this.channel = Objects.requireNonNull(channel, "channel");
>> 432: long fileSize = fileSize(channel);
>> 433: Objects.checkFromIndexSize(offset, length, fileSize);
>
> My preference would be to move these file size determination and the offset/length bounds check to the call site of this constructor. That way it's much more visible at the place where this behaviour is specified (by the public API).
My motivation for keeping these checks at the `FileChannelPublisher::new` was
1. `FilePublisher` in the same file follows this convention too
2. Even though `jdk.internal.net.http` is not exported, the class is `public`. I wanted to ensure `FileChannelPublisher::new` doesn't accept invalid input
@jaikiran, given these,
1. Do you still prefer me moving the checks to the `ofFileChannel`?
2. If yes, do you also want me to duplicate the checks in `FileChannelPublisher::new` too?
> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java line 476:
>
>> 474:
>> 475: @Override
>> 476: public synchronized boolean hasNext() {
>
> In context of the synchronizations in this class, does a single instance of `FileChannelIterator` gets accessed concurrently? I haven't fully grasped the publisher/subscriber model through which this `FileChannelIterator` instance would be handed out and how that then ends up as a `BodyPublisher` that's accessed by the application.
[Reactive Streams Specification for JVM](https://github.com/reactive-streams/reactive-streams-jvm) states the following:
> #### Publisher
> ...
> `onSubscribe`, `onNext`, `onError` and `onComplete` signaled to a `Subscriber` MUST be signaled _serially_
> ...
> #### Subscriber
> ...
> A `Subscriber` MUST ensure that all calls on its `Subscription`'s request and cancel methods are performed _serially_
Thus we indeed don't need synchronization in our publishers and subscribers. In fact, I explicitly investigated this concern for #23096, where we eventually dropped the synchronization for `LimitingSubscriber`, and this approach was blessed by @viktorklang-ora back then. The reason I opted for synchronization in `FileChannelIterator` was:
1. `StreamIterator` (located in the same class, and used for `ofFile()`) was synchronized too and I wanted to stick to the present convention
2. `PullPublisher` (which `StreamIterator`, `FileChannelIterator`, etc. instances are passed to _per subscription_) operates using a `SequentialScheduler`, and the _scheduler_ in the name made me more skeptical about concurrency implications
But giving this a second thought, I agree that we should not implement redundant synchronization. I've removed synchronization for `FileChannelIterator` in b114fe9598e.
> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java line 490:
>
>> 488: int readLength = channel.read(buffer, position);
>> 489: // Short-circuit if `read()` has failed, e.g., due to file content being changed in the meantime
>> 490: if (readLength < 0) {
>
> The `FileChannel.read(ByteBuffer, position)` API states:
>
>> If the given position is greater than or equal to the file's current size then no bytes are read.
>
> I think we might have to consider the situation where the underlying file's size has reduced and this position is now past that size and the `read()` call returning 0. Would that then end up in a situation where we have a "never ending" BodyPublisher, which publishes nothing but hasn't terminated either?
The `FileChannel.read(ByteBuffer, position)` API further states the following:
> [Method returns] The number of bytes read, possibly zero, or `-1` if the given position is greater than or equal to the file's current size
That is, the case you're describing is indicated by the `-1` return value, which we check in the next line. Plus, we already have a test for precisely the scenario that you're describing: `testFileModificationDuringPublisherRead`.
> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java line 492:
>
>> 490: if (readLength < 0) {
>> 491: // We *must* throw to signal that the request needs to be cancelled.
>> 492: // Otherwise, the server will continue waiting data.
>
> The comment itself is a good idea but i think we should leave out the reference to a "server" from this comment. Perhaps something like: "Throw to signal that the request needs to be cancelled"?
Changed as suggested in 183da9d0235.
> src/java.net.http/share/classes/jdk/internal/net/http/common/Utils.java line 372:
>
>> 370:
>> 371: /**
>> 372: * {@return a new {@link ByteBuffer} instance of configured capacity for the HTTP Client}
>
> I think instead of saying "configured capacity" we should say "instance of {@link BUFSIZE} ...". This is an internal class so it should be OK to link to this constant even if the constant itself isn't documented. Linking to `BUFSIZE` makes it clear and easier to find what the actual capacity is.
Changed as requested in eacd5da23bb.
> src/java.net.http/share/classes/jdk/internal/net/http/common/Utils.java line 384:
>
>> 382: *
>> 383: * @param maxCapacity a buffer capacity, in bytes
>> 384: * @throws IllegalArgumentException if {@code capacity < 0}
>
> Typo - should have been `if {@code maxCapacity < 0}`
Fixed in da69840639e.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596609
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596735
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596298
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596074
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215595182
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215595719
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215594742
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215594608
More information about the net-dev
mailing list