[Update]: JEP 249 (OCSP Stapling for TLS)
Jamil Nimeh
jamil.j.nimeh at oracle.com
Thu Jul 2 14:26:18 UTC 2015
On 07/02/2015 05:05 AM, Xuelei Fan wrote:
> sun/security/ssl/ServerHandshaker.java
> ======================================
> OCSP stapling only used for certificate-based server authentication at
> present. I was wondering, may be better to make a check before wrap
> the ServerHello OCSP extension and CertificateStatus message that
> Certificate message would be actually used.
>
> -------------------
> 860 for (CertStatusReqItemV2 csriv2 :
> 861 statReqExtV2.getRequestItems()) {
> 862 // Favor OCSP_MULTI over OCSP
> 863 statReqType = csriv2.getType();
> 864 statReqData = csriv2.getRequest();
> 865 if (statReqType == StatusRequestType.OCSP_MULTI) {
> 866 break;
> 867 }
> 868 }
>
> Client's preference (favorite choice first) should be respected. For
> type OCSP, line 863-864 prefer the last item rather than the client
> preference. And line 863-864 does not consider whether the
> OCSPStatusRequest is suitable for this server or not, just select the
> last one. As may result in ocsp stapling failure if the selected item
> is not for the right responder.
We talked about this back when the original design was done. Back then
I had gone with client-ordering as the selection criteria and the
feedback I got was to favor ocsp_multi for those clients that can do it,
since it gives a more complete set of revocation information. While the
client is ordering it's preference, I don't see in the spec any area
that says the server has to strictly adhere to the client preference.
If we want to favor ocsp_multi as a server we should be able to do that
legally. As for the discussion where we talked about this, I don't
recall if it was a conversation or in email. I'll have to check. If
folks feel strongly about strictly honoring client ordering I can make
the change.
The current code snippet doesn't strictly prefer the last item. If the
first item is OCSP_MULTI, then it will stop right there, but your point
is still true that we're not going on client ordering (as explained above).
>
> Line 865-867 does not consider whether the OCSPStatusRequest is suitable
> for this server or not, either.
>
> In general, client knows nothing about server certificates. So client
> would offer responders it trusted. The responders is not necessary
> suitable for the target server. The server side should select a
> suitable one. It is not necessary the first one or the last one. See
> also section 2.2, RFC 6961:
>
> "Servers that support a client's selection of responders using the
> ResponderID field could implement this selection by matching the
> responder ID values from the client's list with the ResponderIDs of
> known OCSP responders, either by using a binary compare of the values
> or a hash calculation and compare method."
>
> If the update of your implementation is pretty significant or it is
> unclean how we should move forward at present, I'm OK if you want to
> remove the support of client specified responders.
As implied from the above paragraph servers aren't required to support
client ResponderId selection. I don't have support for this feature
yet, neither from the client or the server. The OCSPStatusRequest has
the hooks in it to populate ResponderIds because it was fairly easy to
do. I would like to leave that code in, since providing that capability
to the client is something I want to do in the future, as well as handle
the server selection based on Responder Id.
>
> sun/security/ssl/StatusResponseManager.java
> ===========================================
> line 207-213:
>
> // It is assumed that the caller has ordered the certs in the chain
> // from end-entity to trust anchor. In order to create the CertId
> // necessary for OCSP requests the subject and issuer certs must
> // be present.
> - if (chain.length < 2) {
> + if (chain.length == 0) {
> return Collections.emptyMap();
> }
>
> This assumption is not actually correct in general. The trust anchor is
> not necessarily be included in the Certificate message. See page 48,
> section 7.4.2, RFC 5246.
I believe this chain comes in from the KeyManager via
ServerHandshaker.setupPrivateKeyAndChain(), so it should be a complete
cert chain in the correct order (or as complete as the server can do, I
suppose). The Certificate message is not a factor here.
>
> ----------------------------
> 436 List<ResponderId> responderIds;
>
> You don't actually use the client specified responders, do you? It's OK
> to me if we don't support none-empty OCSPStatusRequest. If we do not
> want to do that, I would suggest to remove the related implementation
> code for that. As would simplify the job, I think.
No, not today. In the future we will. I don't believe Apache does
ResponderId selection either. Not sure about other implementations. I
know Apache will handle extension forwarding, which we also do (and can
opt not to do with a property). I don't want to remove the code
though. Those are the hooks for the feature in the future. Same for
the OCSPStatusRequest...it has a constructor that allows population of
extensions and ResponderIds. I'd like to leave this in, even if we're
not populating those fields from the client today.
>
> -----------------------------
> 252 // Set a bunch of threads to go do the fetching
> 253 List<Future<StatusInfo>> resultList =
> 254 threadMgr.invokeAll(requestList, delay, unit);
>
> The current implementation of OCSP response getter, looks like for every
> cert in the chain, a fetch thread would be created to get the
> corresponding OCSP response. In the fetch thread, if the response is
> cached, use it; otherwise, connect to OCSP server and fetch it.
>
> Performance would be a very big concern. A connection may kick start
> many threads. Fetch threads may be a bottleneck of the server. In
> general, a server may be expected to serve 10K+ connections, but may
> only be able to hit around 10K- threads. It's likely to face DoS
> challenges if every ClientHello trigger 1+ fetch threads.
If a Server has to hit that many connections simultaneously, it's
probably already doing multi-threading to accomplish that goal, wouldn't
you think? If it cannot hit 10K threads to begin with, it seems to me
like some system-level tuning needs to be done . Second, a new thread
isn't created for each client connection. This is a thread pool, so a
core number of threads are created when the SRM is created, new ones are
made as needed, and old ones are reused whenever their task is finished
and goes into an idle state. So we not always making a new thread.
Further, time to bring the response back to the caller from the cache is
very fast, between 2-5 msec. So there would most likely be some thread
reuse on a very busy server. That very busy server is not going to be a
single-core/single-thread machine; it will be capable of servicing
multiple threads at the same time, so that should help aggregate throughput.
It of course takes significantly longer to bring the response back from
the network. I didn't get exact numbers, but in an environment where
the responders are on the same machine as the server I could get them
back in about 150msec. So in extremely rough terms it's network IO time
+ 150msec for each fetch.
>
> Another performance concern is that when there is no cache, the client
> would have to wait for the server to fetch the response from OCSP
> servers. It's easy to run into client timeout issue.
As clients do with other implementations. If the server doesn't have
the response, it has to get it and the client has to wait for it. I see
no problem here. The connection delay isn't technically the time taken
to retrieve all responses, it is the time taken to retrieve all
responses or the server fetch timeout, whichever is smaller. A server
can opt to wait 500 msec, 1000msec, or whatever the admin thinks is
appropriate given the distance of the responders and the client load.
If the timer is exceeded, the server will include any responses it has
and will not include the ones it is still retrieving. If it cannot get
any, then it doesn't even assert status_requst[_v2].
What's more, even if the server has to get 3 responses, but only has one
by the timeout threshold the remaining two fetches will continue in the
background and complete, even if the connection has gone on. This way a
future connection coming in has a greater chance of getting a cache hit.
>
> I may suggest to use a single thread to get responses from OCSP servers.
> And any TLS connection would only get responses from cache. The
> scenarios may looks like:
> 1. Per every SSLContext instance, create a scheduled single thread to
> fetch the response from OCSP servers, and cache them for each server
> certificate chains. The server certificates can be got from key manager.
> 2. For every client request, look for the response from the cache.
> 3. if responder is not cached and if we want to support none-empty
> OCSPStatusRequest, fetch the response from OCSP servers in the same
> thread, and cache the result. We may also want to add the task to the
> scheduled thread in #1. Not create new thread for every connection is
> important for SSLEngine in some circumstances.
>
> #3 is also not performance friendly. I think none-empty
> OCSPStatusRequest is not common in practice.
I'll have to think about your proposal a bit. It's a significant change
from what I'm doing, but it is an interesting alternative.
As stated above, we don't create new threads for each connection.
Threads are reused when they are idle/available.
You are correct, non-empty OCSPStatusRequests are rare. I've never seen
a client do one. I've only looked at browsers, but none of them
populate the request.
>
>
> I will stop here for the first round code review. Looking forward for
> the next round webrev.
>
> Thanks,
> Xuelei
>
>
> On 6/27/2015 11:06 PM, Jamil Nimeh wrote:
>> Hello all, I've posted an updated webrev based on comments I've received
>> so far:
>>
>> http://cr.openjdk.java.net/~jnimeh/reviews/8046321/webrev.1
>>
>> Thanks,
>> --Jamil
>>
>> On 06/18/2015 05:27 PM, Jamil Nimeh wrote:
>>> Hello all,
>>>
>>> I have a first cut at the OCSP stapling webrev posted for your review:
>>>
>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8046321
>>> Webrev: http://cr.openjdk.java.net/~jnimeh/reviews/8046321/webrev.0/
>>>
>>> A couple items to note:
>>>
>>> * I'm in the process of updating the JEP with some more details. I
>>> should be done with these changes by tonight (PDT).
>>> * Missing are some of the TLS end-to-end tests. These tests have
>>> been coded and run outside the jtreg framework, but for some
>>> reason things hang in jtreg. I've included some of the supporting
>>> classes that these tests will use (CertificateBuilder.java and
>>> SimpleOCSPResponder.java) so folks could review those if they're
>>> interested. I will update the webrev and notify the list as soon
>>> as I've got the tests working in jtreg.
>>>
>>> Thanks to everyone who has helped along the way.
>>>
>>> --Jamil
>>>
>>>
More information about the security-dev
mailing list