JEP 321 surface-level comments

Tue May 22 19:05:39 UTC 2018

On Fri, May 18, 2018 at 11:23 AM, Chris Hegarty <chris.hegarty at oracle.com>
wrote:

> Hi Chris, Tobias,
>
> Apologies for the delayed reply. I have spent a significant amount of
> time on this feedback, both prototyping and determining any potential
> impact on the API. There are a number of proposals in this email that
> address some of the comments, as well as explanations for those that do
> not require changes. It would be really helpful if you could acknowledge
> each, after which I will file JIRA issues and get them resolved.
>

Thanks for the response!

> On 08/05/18 20:34, Chris Povirk wrote:
>
>> ...
>>
>>   * I second the idea of making HttpHeaders a final class, as proposed
>>     as part of JDK-8184274
>>     <https://bugs.openjdk.java.net/browse/JDK-8184274>. While "final
>>     class" was part of that suggestion, it doesn't seem to have been
>>     specifically shot down. Tobias suggested that the goal of an
>>     extensible class was to let users plug in a lazily parsed map for
>>     HTTP 2, but he didn't seem to think this was likely to work, given
>>     the HTTP 2 spec. The pros I see for a final class, besides the usual
>>     advantages of simplicity, are: It could ensure that instances use a
>>     Map that retains key order and doesn't contain nulls. It could also
>>     implement case insensitivity, which the current implementation
>>     claims to do but doesn't. (Maybe it's relying on internal callers to
>>     do normalize strings before calling it? But that doesn't help the
>>     external caller who asks for accept-encoding instead of
>>     Accept-Encoding.)
>>
>
>
> For HTTP/2, to keep the client and server HPACK indexing tables
> consistent it is required that all header block fragments be decoded.
> HTTP/1.1, obviously, does not have such a restriction. A lazy parsing
> implementation of HTTP/1.1 headers is not likely to be common, but
> possible in the current design.
>
> HttpHeaders instances are required, by specification, to be immutable,
> but I accept that a final class, and implementation, is a stronger
> contract. Ordering of any elements of the HttpHeaders is not something
> that we want to guarantee, and would likely lead to hard to diagnose
> issues if some aspect of user code depended on it, since the HTTP
> protocol does not provide any such ordering guarantees, nor the
> iteration order of the returned map.

Yes, I see that you're right about no guaranteed ordering. I'm not sure
what I was thinking.

All that said, on balance I think it may be better to make HttpHeaders
>
final. Given the requirement for other client implementations to be
> able to create an HttpHeaders, then it will need a first-class public
> API for its construction, even though direct construction of an
> HttpHeaders is not something that we necessarily want to promote in the
> API. Rather HttpHeaders are returned from an instance of HttpRequest or
> HttpResponse.
>
>
> Proposal: Make HttpHeaders final, as follows:
>
>  1) Change from abstract to final class ( obviously )
>  2) Remove its protected constructor
>  3) Remove the "default implementation" notes ( since there will be
>     just one implementation )
>  4) Provide a public factory method:
>
>    /**
>     * Returns an HTTP headers from the given map. The given map's key
>     * represents the header name, and its value the list of header
>     * values for that header name.
>     *
>     * @param headerMap the map containing the header names and values
>     * @param filter a filter that can be used to inspect each entry in
>     *               the given map to determine if it should, or should
>     *               not, be added to the to the HTTP headers
>     * @return an HTTP headers instance containing the given headers
>     * @throws NullPointerException if any of: {@code headerMap}, a key
>     *         or value in the given map, or an entry in the map's value
>     *         list, or {@code filter}, is {@code null}
>     */
>    public static HttpHeaders of(Map<String, List<String>> headerMap,
>                                 BiPredicate<String, List<String>> filter)
>
> After much prototyping, a filter argument in the API will significant
> reduce the need to re-creating Maps just for the purpose of feeding
> into an HttpHeaders ( and a defensive copy of the Map must be performed
> by the factory to ensure structural immutability ). The above also
> addresses the nullability concern.
>

I like it. Two follow-up thoughts:

1. Would you also consider an overload that accepts only a map? The filter
parameter may be surprising to new users.

2. Two alternatives to BiPredicate<String, List<String>> are
Predicate<String> (operating only on header names) and BiPredicate<String,
String> (operating on each name-value pair independently). I'm wondering if
one of those would be more convenient for your use cases. But I am
speculating; it just seems likely that callers would want to perform
operations like "remove all headers with this name" or "remove specific
header-value pairs," rather than "remove all headers with this name if one
of them has this value." (And I see now that the existing ImmutableHeaders
uses Predicate<String>. I don't know if that's typical of the use cases you
envision.)

(And a bonus thought: I don't know which implementation you're using, but
the ImmutableHeaders implementation I'm looking at may misbehave if passed
a Map with two header names that differ only in case. (The put() for the
second one will clobber the values for the first.) It looks like you ensure
that that can't happen in any of the existing callers, but it's something
to watch out for when you open up the class to public callers. You may well
have already addressed this.)

---
>
> From the HttpHeaders class-level description:
>
>   "The methods of this class ( that accept a String header name ), and
>    the Map returned by the map method, operate without regard to case
>    when retrieving the header value."
>
> The following demonstrates the case insensitivity.
>
> $ ~/binaries/jdk-11.jdk/Contents/Home/bin/jshell
> |  Welcome to JShell -- Version 11-ea
> |  For an introduction type: /help intro
>
> jshell> import java.net.http.*;
>
> jshell> URI uri = URI.create("http://foo.com");
> uri ==> http://foo.com
>
> jshell> HttpRequest req = HttpRequest.newBuilder(uri).header("Accept-Encoding",
> "gzip, deflate").build();
> req ==> http://foo.com GET
>
> jshell> HttpHeaders headers = req.headers();
> headers ==> jdk.internal.net.http.ImmutableHeaders at d9233844 { ...
> encoding=[gzip, deflate]} }
>
> jshell> System.out.println(headers.firstValue("accept-encoding").get());
> gzip, deflate
>
>
> If you do not find that this is the case, then that's a bug. Please
> create a short example, or otherwise describe how to reproduce the
> issue you encounter.
>

I agree that the built-in HttpHeaders implementations (ImmutableHeaders and
HttpHeadersImpl) implement case-insensitivity. My concern was that users'
custom implementations might not. (Given that they needed to implement an
abstract map() method, I would expect more of them to reach for a HashMap
than a case-insensitive TreeMap.)

But your proposed changes eliminate the possibility of a case-sensitive
HttpHeaders implementation, so it eliminates my concern.

  * The HttpHeaders doc could be clearer about what it does with
>>     comma-separated header values. I assume that the caller just gets
>>     the whole string as a single value, since HttpHeaders would have to
>>     know whether a given header is comma-separated or not, but some
>>     users might expect it to be split, especially given the List<String>
>>     return type and the interchangeability of comma-separated values and
>>     multiple headers in some cases.
>>
>
>
> HttpHeaders does not do any interpretation of header values.
>
> Proposal: Add a minor specification clarification to avoid any future
> potential confusion on this point, such as:
>
>   "Header values are uninterpreted strings."
>

I would personally lean toward being more explicit. I'm imagining, for
example, that someone would read "first value of the given named (and
possibly multi-valued) header" (from firstValue(String)) and conclude that
that means the class will split a single header value on its commas.

Maybe the behavior of firstValue(String) in particular would be clearer if
the doc were phrased "the full value of the first header with the given
name?" Similarly, allValues(String) could say something like "the values of
all of the headers with the given name." These phrasings are alternatives
to the current phrasing, "the given named header," which means "all headers
that have the given name" but might sounds like it means "the single header
with the given name (possibly built by merging several headers)" (if that
makes any sense; sorry if not).

  * Could HttpClient drop its getters? They feel like implementation
>>     details of the default HttpClient implementation, not properties of
>>     an HTTP client in the abstract. (e.g., even if my alternative
>>     implementation were to support cookies, it might not use the JDK's
>>     cookie API.) It feels kind of like if Future had executor() and
>>     runnable() getters: *Many* Futures will have reasonable values for
>>     them, but not all, so I'd hope most users would avoid using them,
>>     anyway, to be safe. Or, if the getters stick around, I could see
>>     putting them on some kind of DefaultHttpClient class (to which the
>>     current HttpClient.Builder might move).
>>
>
>
> The getters are necessary and are part of the abstract client. If one
> has an HttpClient, whose construction is unknown, then it may be
> necessary to determine the client's configuration. For example, if a
> cookie handler has been configured, then the code performing exchanges
> through the client need not explicitly process cookie related headers,
> that can be left to the cookie handler. Equally, the calling code should
> be able to determine if a cookie handler is not configured, so that it
> can take appropriate action. Similar with redirects and authentication.
>
> The Executor is a little different. To provide the most flexibility and
> allow for better use of resources, to build user-level chains of
> asynchronous dependent actions, the executor is required. Again, this is
> when the construction-site and use-site of the client do not have
> explicit knowledge of each other.
>
> Providing finer grain abstractions than the existing HttpClient seems
> not worthwhile.
>

My hope had been that the executor was needed only by classes that already
know they're operating on an HttpClientImpl. But this may well not be the
case.

I can imagine that the other getters could be useful to let code fail fast
if it gets an HttpClient configured differently than expected, so I can see
value there even if we don't expect people to plug in full cookie-handling
or redirect-handling code dynamically when required.

In any case, I can understand that the existing HttpClient may be the right
pragmatic solution.

  * I'm a little confused by HttpRequest. At first, I thought it was a
>>     value type like HttpHeaders, so I was going to have the same
>>     suggestion of making it a final type (and also its builder). But the
>>     implementation appears to be mutable and to contain additional
>>     fields. It feels like the type is pulling double duty as both a
>>     value type in the public API (which could be nice to make final) and
>>     as a stateful internal type (which might contain an instance of the
>>     public type), and it's not clear if different implementations could
>>     successfully interchange HttpRequest instances. (Tobias said that
>>     he'd raised this point earlier, and it sounds like it may
>>     realistically have to stay how it is, but since this was my reaction
>>     and it turned out to match what he'd said, I wanted to at least
>>     mention it again.)
>>
>
>
> From an API perspective HttpRequest is immutable, and all
> implementations returned from the client exhibit this behaviour. If not,
> then it's a bug. I think, since you clearly looked at the
> implementation, that you noticed that the HTTP request implementation
> has some mutable state that it carries, that is required for certain
> parts of the default implementation. That is not unusual or any cause
> for alarm, and does not have any impact on the exposed state.
>

I was wondering if our differences here might hinge on the definition of
state, which is interesting
<http://james-iry.blogspot.com/2009/04/state-of-sock-tubes.html> but might
lead too far afield.

But it might be simpler than that. More concretely, what I was looking at
is that isWebSocket is mutated after construction. (I also saw some other
non-final fields, but those appear to never be modified after
construction.) This suggested to me that I couldn't create an HttpRequest
and then send it multiple times (possibly concurrently), since one
attempt's update to isWebSocket might interfere with the other. (It at
least appears that isWebSocket is only ever changed from false to true,
never the reverse.) But I haven't traced through the WebSocket APIs to see
if it's actually possible for a given HttpRequest instance to be used both
with and without web sockets. If it's not possible, then I don't see any
way to trigger a problem based on the state -- which might be what you were
getting at. And I admit that it's unlikely that anyone would use a single
instance both with an without web sockets, anyway. I certainly was reacting
more to the mere idea of a non-final field than to any concrete problem I
can produce. (And I may have assumed that Tobias's earlier comments
referred to the same field, not to some earlier bits of state that you've
since corrected. Sorry about that.)

There can be different HttpRequest implementations. For example, the
> default implementation disallows requests with the CONNECT method -
> which is used for tunneling - and there are restrictions around its use
> when performing security manager permission checks. Another
> implementation may not have such a restriction. There are a few other
> small constraints and restrictions specific to the default client that
> another implementation may choose to do differently.
>
> It is for these reasons that neither HttpRequest or its Builder are
> final. More generally, there is a clear tension between extensibility
> and finality. We favour the former to be more friendly to other
> implementations, and to support scenarios like offline testing ( and to
> a lesser extent mocking ).
>
>
>   * In the HttpRequest.Builder docs, it would be nice to call out that
>>     GET is the default.
>>
>
>
> Agreed. The fact that the default request method is GET could be made
> clearer.
>
> Proposal: to HttpRequest.Builder add:
>
>   * <p> The builder can be used to configure per-request state, like:
>   * the request URI, the request method (default is GET unless
>   * explicitly set ), specific request headers, etc.

Looks good.

Thanks again for your response and consideration.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/net-dev/attachments/20180522/ca11f776/attachment-0001.html>