JEP 321: HTTP Client (Standard)

James Roper james at lightbend.com
Thu Dec 7 00:19:12 UTC 2017


On 7 December 2017 at 06:58, David Lloyd <david.lloyd at redhat.com> wrote:

> On Wed, Dec 6, 2017 at 6:31 AM, Chris Hegarty <chris.hegarty at oracle.com>
> wrote:


[snip]


> > The primary motivation for the use of byte buffers, as described above,
> > is to provide maximum flexibility to an implementation to avoid copying
> > and buffering of data.
>
> Is my reading of the API correct in that flow control is happening in
> terms of buffers, not of bytes?  Could there ever be any odd effects
> from very small or very large buffers passing through the plumbing?
>

Your reading is correct. In my experience, it varies wildly by use case. In
the technology I work with (Akka), we do exactly this: we have ByteStrings
(essentially immutable byte buffers), and flow control is done on the number
of ByteStrings, not the number of bytes in those strings. Generally, when
reading, the size of each ByteString is limited to a configurable amount,
for example 8kb, and Akka's flow control will, by default, keep up to 8
ByteStrings in flight in its asynchronous processing pipeline. So we have a
maximum buffer size of 64kb per connection. For most HTTP use cases this is
fine; something reading an HTTP message body might collect those buffers up
to a maximum size of 100kb by default and then parse the whole buffer (eg,
as JSON), so it stays within the amount of memory that the user expects to
use per request. If the data read into the buffers is very small, this would
be due to the client trickle feeding the server. Care must be taken on the
server to ensure that if 8kb buffers are allocated for reads but only a
small amount of data is read, those large buffers are released and the small
data is copied into a small buffer.
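
To make that concrete, here is a rough sketch of what per-buffer (rather
than per-byte) flow control looks like, using the general purpose
java.util.concurrent.Flow API rather than the proposed HTTP client API; the
BoundedBufferSubscriber class, the maxInFlight parameter and the process()
method are hypothetical, just to illustrate that demand is expressed in
buffers, so buffered memory is bounded by maxInFlight times the buffer size:

    import java.nio.ByteBuffer;
    import java.util.concurrent.Flow;

    // Sketch of item-based (per-buffer) flow control: demand is counted in
    // buffers, not bytes, so memory is roughly maxInFlight * bufferSize.
    class BoundedBufferSubscriber implements Flow.Subscriber<ByteBuffer> {
        private final int maxInFlight;  // e.g. 8 buffers of ~8kb => ~64kb
        private Flow.Subscription subscription;

        BoundedBufferSubscriber(int maxInFlight) {
            this.maxInFlight = maxInFlight;
        }

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            // Initial demand: up to maxInFlight buffers, whatever their size.
            subscription.request(maxInFlight);
        }

        @Override
        public void onNext(ByteBuffer buffer) {
            process(buffer);          // hand the buffer to the pipeline
            subscription.request(1);  // replace the consumed buffer with demand
        }

        @Override
        public void onError(Throwable t) { t.printStackTrace(); }

        @Override
        public void onComplete() { }

        private void process(ByteBuffer buffer) {
            // application-specific handling of the bytes
        }
    }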

I think where it can possibly cause a problem is if, for some reason,
something sending data is only generating small byte buffer chunks, but
there's a long (and expensive) pipeline for the chunks to go through before
they get written out. This is not a use case that we see very often, but I
have seen it. The solution there is either to increase the number of
elements in flight in the stream (most Reactive Streams implementations
allow this to be done trivially), or to put an aggregating buffer in the
middle before the expensive processing (again, streaming implementations
such as RxJava, Reactor or Akka Streams provide straightforward stages to
do this).
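
For the aggregating buffer approach, a rough sketch in plain Java might look
like the following; the ChunkAggregator class and the threshold value are
hypothetical and only illustrate the idea, since the libraries mentioned
above ship equivalent built-in stages:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch of an aggregating stage: small incoming chunks are accumulated
    // and only passed downstream (to the expensive part of the pipeline)
    // once roughly `threshold` bytes have been collected.
    class ChunkAggregator {
        private final int threshold;  // e.g. 8 * 1024
        private final List<ByteBuffer> pending = new ArrayList<>();
        private int pendingBytes = 0;

        ChunkAggregator(int threshold) {
            this.threshold = threshold;
        }

        // Returns an aggregated buffer once enough data has accumulated,
        // otherwise null.
        ByteBuffer offer(ByteBuffer chunk) {
            pending.add(chunk);
            pendingBytes += chunk.remaining();
            if (pendingBytes < threshold) {
                return null;
            }
            ByteBuffer combined = ByteBuffer.allocate(pendingBytes);
            for (ByteBuffer b : pending) {
                combined.put(b);
            }
            combined.flip();
            pending.clear();
            pendingBytes = 0;
            return combined;
        }
    }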

One issue that I'm not sure about is the consequences of using direct
buffers with regard to garbage collection. If direct buffers are never
copied onto the heap and are never reused (let's say you're just
implementing a proxy passing buffers through from one connection to
another), then the heap usage of the application may be very small, and
this could mean that garbage collection runs very infrequently. As I
understand it, this can result in direct buffers staying around for a long
time, possibly causing the system to run out of memory, since the native
memory backing a direct buffer is only released once the buffer object
itself is collected. Does anyone have any experience with that, and how to
deal with it? We don't generally have this problem in Akka because we
always copy our buffers onto the heap into an immutable structure, so even
if we do use direct buffers and don't reuse them, our heap usage grows at
least as fast as our direct buffer usage, which means total memory usage
won't exceed twice the size of the heap, since eventually garbage
collection will clean both up.
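
Conceptually, the copy onto the heap amounts to something like the sketch
below; the HeapCopy class is hypothetical, and in Akka the copied bytes end
up wrapped in an immutable ByteString rather than returned as a bare array:

    import java.nio.ByteBuffer;

    final class HeapCopy {
        // Copies the readable bytes of a (possibly direct) buffer into a
        // heap byte[], so heap usage grows in step with direct buffer usage
        // and a normal GC cycle eventually reclaims both.
        static byte[] copyToHeap(ByteBuffer buffer) {
            byte[] copy = new byte[buffer.remaining()];
            buffer.duplicate().get(copy);  // duplicate leaves the original
                                           // buffer's position untouched
            return copy;
        }
    }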


>
> --
> - DML
>



-- 
*James Roper*
*Senior Octonaut*

Lightbend <https://www.lightbend.com/> – Build reactive apps!
Twitter: @jroper <https://twitter.com/jroper>