KDF API review, round 2

Mon Nov 27 06:03:47 UTC 2017

>
>>
>> One additional topic for discussion: Late in the week we talked about 
>> the current state of the API internally and one item to revisit is 
>> where the DerivationParameterSpec objects are passed. It was brought 
>> up by a couple people that it would be better to provide the DPS 
>> objects pertaining to keys at the time they are called for through 
>> deriveKey() and deriveKeys() (and possibly deriveData).
>>
>> Originally we had them all grouped in a List in the init method. One 
>> reason for needing it up there was to know the total length of 
>> material to generate.  If we can provide the total length through the 
>> AlgorithmParameterSpec passed in via init() then things like:
>>
>> Key deriveKey(DerivationParameterSpec param);
>> List<Key> deriveKeys(List<DerivationParameterSpec> params);
>>
>> become possible.  To my eyes at least it does make it more clear what 
>> DPS you're processing since they're provided at derive time, rather 
>> than the caller having to keep track in their heads where in the DPS 
>> list they might be with each successive deriveKey or deriveKeys 
>> calls.  And I think we could do away with deriveKeys(int), too.
>
> See above - the key stream is logically produced in its entirety 
> before any assignment of that stream is made to any cryptographic 
> objects because the mixins (except for the round differentiator) are 
> the same for each key stream production round.   Simply passing in the 
> total length may not give you the right result if the KDF requires a 
> per component length (and it should to defeat (5) or it should only 
> produce a single key).
From looking at NIST SP 800-108, I don't see any place where the KDF
needs a per-component length.  It looks like it takes L (total length)
as an input and that is applied to each round of the PRF.  HKDF takes L
up-front as an input too, though it doesn't use it as an input to the
HMAC function itself.  For TLS 1.3 that component length becomes part of
the context info (HkdfLabel) through the HKDF-Expand-Label function, and
it's only doing one key for a given label, which is also part of that
context-specific info, necessitating an init() call.  It seems like the
length can go into the APS provided via init() (for those KDFs that need
it at least) and you shouldn't need a DPS list up-front.
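
To make the HKDF point concrete, here is a rough sketch of HKDF-Expand
as described in RFC 5869, written against the standard javax.crypto.Mac
API.  The class and method names are made up for illustration and are
not part of the KDF API under review.  Note where the length is used:

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: HKDF-Expand (RFC 5869, section 2.3) over HmacSHA256.
// HkdfExpandSketch and expand() are hypothetical names.
class HkdfExpandSketch {

    static byte[] expand(byte[] prk, byte[] info, int length) {
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(prk, "HmacSHA256"));
            int hashLen = hmac.getMacLength();
            if (length < 0 || length > 255 * hashLen) {
                throw new IllegalArgumentException("length out of range: " + length);
            }
            // L is used only here: it fixes the number of rounds and the
            // final truncation.  It is never fed into the HMAC itself.
            int rounds = (length + hashLen - 1) / hashLen;
            byte[] okm = new byte[length];
            byte[] t = new byte[0];          // T(0) is the empty string
            int offset = 0;
            for (int i = 1; i <= rounds; i++) {
                hmac.update(t);              // T(i-1)
                hmac.update(info);           // context info
                hmac.update((byte) i);       // one-byte round counter
                t = hmac.doFinal();          // T(i); doFinal() resets the Mac
                int n = Math.min(hashLen, length - offset);
                System.arraycopy(t, 0, okm, offset, n);
                offset += n;
            }
            return okm;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

Because L only controls the round count and truncation, asking for 32
bytes yields a prefix of what asking for 42 bytes yields with the same
PRK and info.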

As far as your (5) scenario goes, I can see how you can twiddle the
lengths to get the keystream output with zero-length keys and large IV
buffers.  But that scenario really glosses over what should be a big
hurdle and a major access control issue that stands outside the KDF API:
the attacker shouldn't have access to the input keying material in the
first place.  Protect the input keying material properly and the attack
cannot be carried out.

I would rather see the DPS provided in the deriveKey() call.  It couples
what you want out with the call that makes the object, and it makes a
lot more sense to keep those two together than to try to remember where
in the submitted list of DPS objects you are.
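
As a rough illustration of that shape (not the actual API), here is a
toy class that takes the DPS at derive time.  Every name in it is
hypothetical, and it assumes the key stream of the total length fixed
at init time has already been generated, so it only shows the
bookkeeping the caller no longer has to do:

```java
import java.util.ArrayList;
import java.util.List;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: hands out consecutive slices of an
// already-generated key stream, one slice per DPS given at derive time.
final class DeriveAtCallSketch {

    // Hypothetical stand-in for DerivationParameterSpec.
    static final class Dps {
        final String algorithm;
        final int lengthBytes;

        Dps(String algorithm, int lengthBytes) {
            this.algorithm = algorithm;
            this.lengthBytes = lengthBytes;
        }
    }

    private final byte[] keyStream;
    private int offset = 0;

    // Stands in for init(): the total length is keyStream.length.
    DeriveAtCallSketch(byte[] keyStream) {
        this.keyStream = keyStream.clone();
    }

    // The caller says what it wants in the same call that produces it.
    SecretKey deriveKey(Dps param) {
        if (param.lengthBytes <= 0 || param.lengthBytes > keyStream.length - offset) {
            throw new IllegalStateException("not enough key stream left");
        }
        SecretKey key = new SecretKeySpec(keyStream, offset, param.lengthBytes, param.algorithm);
        offset += param.lengthBytes;
        return key;
    }

    // Multi-key form: one key per DPS, in the order given.
    List<SecretKey> deriveKeys(List<Dps> params) {
        List<SecretKey> keys = new ArrayList<>();
        for (Dps p : params) {
            keys.add(deriveKey(p));
        }
        return keys;
    }
}
```

The position in the key stream is tracked by the object itself, so the
caller never has to count its way through a list submitted earlier.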
>
> 95% of the time this will be a call to produce a single key.  4% of 
> the time it will be a call to produce multiple keys. Only 1% of the 
> time will it need to intermix key, data and object productions. 
> Anybody who is doing that is going to write a wrapper around this 
> class to make sure they get the key and data production order correct 
> for each call.  So I'm not all that bothered by keeping the complexity 
> as a price for keeping flexibility.
>
> You could have a Key deriveKey(Key k, DerivationParameterSpec param) 
> for some things like TLS1.3 (where you can only make a single call to 
> derive key between inits) , but then you'd also need at least a byte[] 
> deriveData (Key k, DerivationParameterSpec param) and an Object 
> deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary.  If you're just doing HKDF-Expand
(for the HKDF-Expand-Label TLS 1.3 key derivation) then you can provide
the input key, label, max length and any other context info that goes
into that HkdfLabel structure; all of that would go into init().  Then
provide the key alg and desired length via the DPS at deriveKey() time.
Any subsequent keys in the TLS 1.3 key schedule would need a new init()
call anyway, since the labels change and possibly the output length too.
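
For reference, here is a small sketch of how the length and label end
up inside the HkdfLabel bytes.  It assumes the layout and the "tls13 "
label prefix from RFC 8446, section 7.1; the class and method names are
hypothetical.  It shows why a new label or length means new init()
parameters:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: encodes the HkdfLabel structure, assuming the
// RFC 8446 section 7.1 layout:
//   uint16 length; opaque label<7..255>; opaque context<0..255>;
class HkdfLabelSketch {

    static byte[] hkdfLabel(int length, String label, byte[] context) {
        byte[] fullLabel = ("tls13 " + label).getBytes(StandardCharsets.US_ASCII);
        if (length < 0 || length > 0xFFFF
                || fullLabel.length < 7 || fullLabel.length > 255
                || context.length > 255) {
            throw new IllegalArgumentException("HkdfLabel field out of range");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(length >>> 8);                     // uint16 length, high byte
        out.write(length & 0xFF);                    // uint16 length, low byte
        out.write(fullLabel.length);                 // one-byte label length
        out.write(fullLabel, 0, fullLabel.length);   // "tls13 " + label
        out.write(context.length);                   // one-byte context length
        out.write(context, 0, context.length);       // context (may be empty)
        return out.toByteArray();
    }
}
```

The resulting bytes are what would be passed as the info input to
HKDF-Expand, so both the label and the output length are fixed before
any key is derived.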

Over the next day or so I'm going to have to make some final decisions
on this API, as there are internal projects that are waiting on it to
proceed.  I'm already past the cut-off date I set, but I recognize
these discussions are important to have and I appreciate the input you
and others have provided.

--Jamil