HttpResponse.BodyHandlers.ofFile may lose data on response with small chunk fragments
Dawid Weiss
dawid.weiss at gmail.com
Sat Dec 17 18:45:54 UTC 2022
Hello,
I've discovered a bug in the implementation of PathSubscriber in how
byte buffers are flushed/ written to the file channel. Specifically,
this method in PathSubscriber [2]:
@Override
public void onNext(List<ByteBuffer> items) {
try {
out.write(items.toArray(Utils.EMPTY_BB_ARRAY));
...
the implementation assumes out.write(ByteBuffer[]) always writes the
content of all buffers - this isn't the case [1] and the method can
write any number of bytes from input buffers.
The manifestation of this bug is that HttpClient can fail to write all
the data in a heavily chunked http response (!).
The longer story is that I discovered the above while attempting to
debug data inconsistency errors in a HttpClient fetching chunked
responses. I'm on Windows and the problem is fairly easy to reproduce.
Wireshark (packet interceptor) shows full data being transmitted,
HttpClient fails to write all of it to disk with:
httpClient.send(request, HttpResponse.BodyHandlers.ofFile(outputFile));
The bug is fairly evident - I'm not sure how it slipped undetected for
so long. The non-exhaustive FileChannel.write(ByteBuffer[]) behavior
can be demonstrated by this code:
var file = Files.createTempFile("tmp", "tmp");
try (var out = FileChannel.open(file, StandardOpenOption.WRITE,
StandardOpenOption.CREATE)) {
// Create a lot of smallish buffers.
ByteBuffer[] buffers = new ByteBuffer[100];
for (int i = 0; i < buffers.length; i++) {
buffers[i] = ByteBuffer.wrap(new byte[100]);
}
System.out.println(
"Data in buffers: " +
Arrays.stream(buffers).mapToLong(Buffer::remaining).sum());
long len = out.write(buffers);
System.out.println("Written: " + len);
System.out.println(
"Data in buffers: " +
Arrays.stream(buffers).mapToLong(Buffer::remaining).sum());
} finally {
Files.delete(file);
}
This prints the following for me on Windows:
Data in buffers: 10000
Written: 1600
Data in buffers: 8400
Clearly, PathSubscriber should have a loop to exhaust all buffers.
Seems like a trivial fix, so I don't include the patch (and I can't
file a proper Jira issue because I have no access to Jira).
Dawid
[1] https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/nio/channels/FileChannel.html#write(java.nio.ByteBuffer%5B%5D)
[2] https://github.com/openjdk/jdk/blob/master/src/java.net.http/share/classes/jdk/internal/net/http/ResponseSubscribers.java#L285
More information about the net-dev
mailing list