HTTP client API

Fri Oct 28 11:28:38 UTC 2016

Hi Tobias

Thanks for the feedback (and thanks to Anthony for answering some of the 
questions)
My thoughts are below.

On 24/10/2016, 20:33, Tobias Thierer wrote:
>
> Hi Michael and others -
>
>
> Thanks for publishing the latest HTTP client API docs 
> <http://cr.openjdk.java.net/%7Emichaelm/httpclient/api/>(already 
> slightly outdated again), as well as for publishing the current draft 
> code in the sandbox repository!
>
>
> Below is some concrete feedback, questions and brainstorming on how to
>
> (a) increase the usefulness or
>
> (b) decrease the semantic weight
>
> of the API. Note that most of this is driven only by inspection of the 
> API and some brief exploration of the implementation code, not (yet) 
> by a substantial effort to write proof of concept client applications. 
> I’d love if I could help make this API as useful to applications as 
> possible, so I’d appreciate your feedback on how I can best do that 
> and what the principles were that guided your design choices.
>
>
> 1.) The HttpRequest.BodyProcessor 
> <http://cr.openjdk.java.net/%7Emichaelm/httpclient/api/java/net/http/HttpRequest.BodyProcessor.html>and 
> HttpResponse.BodyProcessor 
> <http://cr.openjdk.java.net/%7Emichaelm/httpclient/api/java/net/http/HttpResponse.BodyProcessor.html>abstractions 
> seem particularly hard to grasp / have a high semantics weight.
>
>  *
>
>     What purpose does the abstraction of a BodyProcessoraim to fulfill
>     beyond what the (simpler) abstraction of a Bodycould be?
>
>      o
>
>         Instead of describing the abstraction as a “processor” of
>         ByteBuffers / Java objects, wouldn’t it be simpler to say to
>         say that request / response bodiesare ByteBuffer / Java object
>         sources/ sinks? What is the advantage of the
>         Publisher<ByteBuffer> / Subscriber<ByteBuffer> API over plain
>         old InputStream / OutputStream based APIs?
>

The intention behind those interfaces is primarily to provide a 
conversion between ByteBuffers and higher level objects that users
are interested in. So, HttpRequest.BodyProcessor generates ByteBuffers 
for output and HttpResponse.ByteBuffer consumes them
on input. For response bodies, there is this additional implicit 
requirement of needing to do something with the incoming data
and that is why the option of discarding it exists (the details of that 
API are certainly open to discussion)

On Publisher/Subscriber (Flow) versus InputStream/OutputStream, we 
wanted a non blocking API and had originally defined
something that was quite similar to the asynchronous model used by Flow, 
but the Flow types have a nice mechanism for
managing flow control asynchronously. So, it made sense to use the 
standard API.

>      o
>
>         The term “processor” and the description of “converting
>         incoming buffers of data to some user-defined object type T”
>         is especially confusing (increases the semantic weight of the
>         abstraction) given that there is an implementation that
>         discards all data
>         <http://cr.openjdk.java.net/%7Emichaelm/httpclient/api/java/net/http/HttpResponse.BodyProcessor.html#discard-U->(and
>         its generic type is called Urather than T). A BodyProcessor
>         that has no input but generates the digits of Pi is also
>         conceivable. Perhaps call these BodySource / BodySink,
>         ByteBufferPublisher / ByteBufferSubscriber, or just Body?
>
>      o
>
>         The fact that you felt the need to introduce an abstraction
>         HttpResponse.BodyHandler whose name is similar to but whose
>         semantics are different from HttpResponse.BodyProcessor is
>         another indication that these concepts could be clarified and
>         named better.
>

We're certainly open to suggestions for improved names, but I think it 
is useful to have a way to explicitly discard a response body, 
especially since HTTP/2 has an efficient way to do that.

I am looking at the moment at simplifications of the server push API, 
and it will be helpful if the request and response body processor
types have shorter names, to help with that. So, I'll take a look at the 
possibility of using something using BodySource/BodySink.

>      o
>
>         To explore how well the abstractions “fit”, I played with some
>         draft code implementing the API on top of another one; one
>         thing I found particularly challenging was the control flow
>         progression:
>         HttpClient.send(request, bodyHandler)
>         -> bodyProcessor = bodyHandler.apply(); // called by the library
>         -> bodyProcessor.onSubscribe() / onNext()
>         because it is push based and forces an application to
>         relinquish control to the library rather than pulling data out
>         of the library.
>         Perhaps the Response BodyHandler abstraction could be
>         eliminated altogether? For example, wouldn’t it be sufficient
>         to abort downloading the body once an application thread has a
>         chance to look at the Response object? Perhaps once a buffer
>         is full, the download of the further response body could be
>         delayed until a client asks for it?
>

On the question of push versus pull, there seems to be differing 
philosophies around this, but
actually the publish/subscribe model provided by the Flow types has 
characteristics of both push and pull.
It is primarily push() as in Subscriber.onNext() delivers the next 
buffer to the subscriber. But, the subscriber
can effectively control it through the Subscription object.

There are other benefits to having the body available synchronously with 
the rest of the response object
and it is still possible to implement arbitrary streaming of response 
body data without having to buffer it.
A while ago I prototyped a response processor which delivers the body 
through an InputStream.

>      o
>
>         What’s the purpose of HttpRequest.bodyProcessor()’s return
>         type being an Optional<BodyProcessor> (rather than
>         BodyProcessor)? Why can’t this default to an empty body?
>
I think Anthony answered this
>
>      o
>
>         Naming inconsistency: HttpRequest.BodyProcessor.fromFile() vs.
>         HttpResponse.BodyProcessor.asFile(). How about calling all of
>         these of(), or alternatively renaming asFile() -> toFile() or
>         toPath()?
>

Probably should consider this question after any renaming of the 
response processor types themselves.

>      o
>
>         asByteArrayConsumer(Consumer<Optional<byte[]>> consumer): Why
>         is this an Optional? What logic decides whether an empty
>         response body will be represented as a present byte[0] or an
>         absent value?
>
>
I think Anthony answered this

> 2.) HttpHeaders: I love that there is a type for this abstraction! But:
>
>  *
>
>     Why is the type an interface rather than a concrete, final class?
>     Since this is a pure value type, there doesn’t seem to be much
>     point in allowing alternative implementations?
>
You could conceive of different implementations, with different 
tradeoffs between parsing early or lazily for instance.

>  *
>
>     The documentation should probably specify what the methods do when
>     nameis not valid (according to RFC 7230 section 3.2?), or is null.
>
>  *
>
>     Do the methods other than map() really pull their weight (provide
>     enough value relative to the semantic API weight that they introduce)?
>
>      o
>
>         firstValueAsLong() looks particularly suspect: why would
>         anyone care particularly about long values? Especiallysince
>         the current implementation seems to throw
>         NumberFormatException rather than returning an empty Optional?
>
>
Numeric header values would be fairly common I think. Long was just 
chosen as the widest possible integer type.
NumberFormatException is only thrown if the value is present but can't 
be parsed as a number.
>
> 3.) Redirect
>
>  *
>
>     Stupid question: Should Redirect.SAME_PROTOCOL be called
>     SAME_SCHEME (“scheme” is what the “http” part is called in a URL)?
>     I’m not sure which one is better.
>
I'll consider that suggestion.
>
>  *
>
>     I haven’t made up my mind about whether the existing choices are
>     the right ones / sufficient. Perhaps if this class used the
>     typesafe enum pattern from Effective Java 1st edition rather than
>     being an actual enum, the API would maintain the option in a
>     future version to allow client-configured Redirect policies,
>     allowing Redirect for URLs as long as they are within the same
>     host/domain?
>
>
As Anthony mentioned, there was a more complicated API originally, but 
it was felt to be too complicated, which
was why we settled on enums (which can be extended in future).

> 4.) HttpClient.Version:
>
>  *
>
>     Why does a HttpClient need to commit to using one HTTP version or
>     the other? What if an application wants to use HTTP/2 for those
>     servers that support it, but fall back to HTTP/1.1 for those that
>     don’t?
>
>
Answered.
>
> 5.) CookieManager
>
>  *
>
>     Is there a common interface we could add without making the API
>     much more complex to allow us both RFC 2965 (outdated, implemented
>     by CookieManager) and RFC 6265 (new, real world, actually used)
>     cookies? Needs prototyping. I think it’s likely we’ll be able to
>     do something similar to OkHttp’s CookieJar
>     <https://square.github.io/okhttp/3.x/okhttp/okhttp3/CookieJar.html>which
>     can be adapted to RFC 2965 - not 100%, but close enough that most
>     implementations of CookieManager could be reused by the new HTTP
>     API, while still taking advantage of RFC 6265 cookies.
>
>
One suggestion has been to use the low-level CookieHandler API. But, 
another option would be just to
evolve CookieManager to the latest RFCs, which sounds a lot like what 
you are suggesting.

> 6.) HttpClient.Executor
>
>   * The documentation isn’t very clear about what tasks run on this
>     executor and how a client can control HTTP traffic through a
>     custom Executor instance. What power does the current executor()
>     API provide to clients? Perhaps it would be simpler to omit this
>     API altogether until the correct API becomes clearer?
>
>
Basically, the main lever that an Executor offers to clients is the 
choice of one of the existing ExecutorService implementations
provided by java.util.concurrent.Executors which are various types of 
thread pools (or a customized one). So, you can choose
fixed size pools, or variable sized pools for example. The spec is 
deliberately vague on the allocation of tasks to threads,
to allow for different implementations, with differing degrees of 
blocking/non blocking behavior.

I hope to put out a new version of the API very soon, addressing some of 
the points I mention above.

Thanks,
Michael.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/net-dev/attachments/20161028/22b14cbd/attachment-0001.html>