JCA design for RFC 7748

Thu Aug 17 15:19:39 UTC 2017

On 8/16/2017 3:17 PM, Michael StJohns wrote:

> On 8/16/2017 11:18 AM, Adam Petcher wrote:
>>
>> My intention with this ByteArrayValue is to only use it for 
>> information that has a clear semantics when represented as a byte 
>> array, and a byte array is a convenient and appropriate 
>> representation for the algorithms involved (so there isn't a lot of 
>> unnecessary conversion). This is the case for public/private keys in 
>> RFC 7748/8032:
>>
>> 1) RFC 8032: "An EdDSA private key is a b-bit string k." "The EdDSA 
>> public key is ENC(A)." (ENC is a function from integers to 
>> little-endian bit strings.)

Oops, minor correction. Here A is a point, so ENC is a function from 
points to little-endian bit strings.

>> 2) RFC 7748: "Alice generates 32 random bytes in a[0] to a[31] and 
>> transmits K_A =X25519(a, 9) to Bob..." The X25519 and X448 functions, 
>> as described in the RFC, take bit strings as input and produce bit 
>> strings as output.
>
> Thanks for making my point for me.  The internal representation of the 
> public point is an integer.  It's only when encoding or decoding that 
> it gets externally represented as an array of bytes.  (And yes, I 
> understand that the RFC defines an algorithm using little endian byte 
> array representations of the integers - but that's the 
> implementation's call, not the API).
>
> With respect to the output of the KeyAgreement algorithm - your (2) 
> above, the transmission representation (e.g. the encoded public key) 
> is little endian byte array representation of an integer.  The 
> internal representation is - wait for it - integer.
>
> I have no problems at all with any given implementation using little 
> endian math internally.  For the purposes of using JCA, stick with 
> BigInteger to represent your integers.  Use your provider encoding 
> methods to translate between what the math is internally and what the 
> bits are externally if necessary. Implement the conversion methods for 
> the factory and for dealing with the existing EC classes.   Maybe get 
> BigInteger to be extended to handle (natively) littleEndian 
> representation (as well as fixed length outputs necessary for things 
> like ECDH).
>

All good points, and I think BigInteger may be a reasonable 
representation to use for public/private key values. I'm just not sure 
that it is better than byte arrays. I'll share some relevant information 
that affects this decision.

First off, one of the goals of RFC 7748 and 8032 is to address some of 
the implementation challenges related to ECC. These algorithms are 
designed to eliminate the need for checks at various stages, and to 
generally make implementation bugs less likely. These improvements are 
motivated by all the ECC implementation bugs that have emerged in the 
last ~20 years. I mention this because I think it is important that we 
choose an API and implementation that allows us to benefit from these 
improvements in the standards. That means we shouldn't necessarily 
follow all the existing ECC patterns in the API and implementation.

Specifically, these standards have properties related to byte arrays 
like: "The Curve25519 function was carefully designed to allow all 
32-byte strings as Diffie-Hellman public keys."[1] If we use 
representations other than byte strings in the API, then we should 
ensure that our representations have the same properties (e.g. every 
BigInteger is a valid public key).

It's best to talk about each type on its own. Of course, one of the 
benefits of using bit strings is that we may have the option of using 
the same class/interface in the API to hold all of these.

RFC 7748 public keys: I think we can reasonably use BigInteger to hold 
public key values. One minor issue is that we need to specify how 
implementations should handle non-canonical values (numbers that are 
less than 0 or greater than p-1). This does not seem like a huge issue, 
though, and the existing ECC API has the same issue. Another minor issue 
is that modeling this as a BigInteger may encourage implementations to 
use BigInteger in the RFC 7748 Montgomery ladder. This would be 
unfortunate because it would leak sensitive information through timing 
channels.
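
To make the non-canonical question concrete, here is a minimal sketch 
of what a provider's conversion code might look like at the API 
boundary if X25519 public keys were exposed as BigInteger. The class 
and method names (X25519PublicKeyCodec, decodeU, encodeU) are 
hypothetical and not part of any proposed API. It assumes the RFC 7748 
rule for X25519 that the top bit of the last byte is masked and that 
non-canonical values are processed as if reduced modulo p. BigInteger 
is used here only for conversion, not for the ladder itself.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class X25519PublicKeyCodec {
    // p = 2^255 - 19, the field prime for Curve25519.
    static final BigInteger P =
        BigInteger.ONE.shiftLeft(255).subtract(BigInteger.valueOf(19));

    // Little-endian 32-byte string -> canonical integer in [0, p-1].
    static BigInteger decodeU(byte[] encoded) {
        if (encoded.length != 32) {
            throw new IllegalArgumentException("expected 32 bytes");
        }
        byte[] bigEndian = new byte[32];
        for (int i = 0; i < 32; i++) {
            bigEndian[i] = encoded[31 - i];
        }
        bigEndian[0] &= 0x7F;          // mask the unused top bit
        return new BigInteger(1, bigEndian).mod(P);
    }

    // Any integer (including negative or >= p) -> little-endian 32 bytes.
    static byte[] encodeU(BigInteger u) {
        byte[] bigEndian = u.mod(P).toByteArray(); // at most 32 bytes
        byte[] littleEndian = new byte[32];
        for (int i = 0; i < bigEndian.length && i < 32; i++) {
            littleEndian[i] = bigEndian[bigEndian.length - 1 - i];
        }
        return littleEndian;
    }
}
```

With this choice every 32-byte string decodes to some public key value, 
and every BigInteger encodes to some 32-byte string, but the mapping is 
many-to-one in both directions, which is exactly the behavior the API 
documentation would need to spell out.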

RFC 7748 private keys: This one is a bit more difficult. RFC 7748 
defines a "clamping" operation that ensures that the integers 
corresponding to bit strings have certain properties (e.g. they are a 
multiple of the cofactor). So if we use BigInteger for private keys in 
the API, we need to specify whether the value is clamped or unclamped. 
If an unclamped value is treated as clamped, then this can result in 
security and correctness issues. Also, the RFC treats private keys as 
bit strings; they are not used in any integer operations. So modeling 
them with byte arrays seems just as valid as modeling them with BigInteger.
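
For reference, the X25519 clamping step from RFC 7748 can be sketched 
on a byte array as follows. The class and method names are 
hypothetical. The masks (248, 127, 64) are the ones the RFC gives for 
X25519; X448 uses different ones. The isClamped helper illustrates the 
question an API would have to answer if it accepted private keys as 
integers.

```java
// Hypothetical helper, for illustration only.
final class X25519Clamp {
    // Returns a clamped copy of a 32-byte X25519 private key.
    static byte[] clamp(byte[] k) {
        if (k.length != 32) {
            throw new IllegalArgumentException("expected 32 bytes");
        }
        byte[] c = k.clone();   // leave the caller's array untouched
        c[0] &= 0xF8;           // clear low 3 bits: multiple of cofactor 8
        c[31] &= 0x7F;          // clear bit 255
        c[31] |= 0x40;          // set bit 254
        return c;
    }

    // True if the byte string already has the clamped form.
    static boolean isClamped(byte[] k) {
        return k.length == 32
            && (k[0] & 0x07) == 0
            && (k[31] & 0xC0) == 0x40;
    }
}
```

Clamping is idempotent, so an implementation that always clamps on use 
does not need to know whether the stored value was clamped already. 
That is straightforward to specify for byte arrays, and would need 
equivalent wording for a BigInteger representation.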

RFC 8032 public keys: The analysis here is similar to RFC 7748 public 
keys, except we also need to store the (probably compressed) x 
coordinate. So if we don't use byte arrays, we would need to use 
something like ECPoint.
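
As a rough illustration of what a non-byte-array representation would 
have to carry for Ed25519, here is a sketch that splits the 32-byte 
encoding into the y coordinate and the single bit that stands in for 
the compressed x coordinate. The class Ed25519EncodedPoint is 
hypothetical. The sketch does not recover x, does not check y < p, and 
does not check that the point is on the curve.

```java
import java.math.BigInteger;

// Hypothetical value class, for illustration only.
final class Ed25519EncodedPoint {
    final BigInteger y;   // y coordinate, 0 <= y < 2^255
    final boolean xOdd;   // least significant bit of x

    Ed25519EncodedPoint(BigInteger y, boolean xOdd) {
        if (y.signum() < 0 || y.bitLength() > 255) {
            throw new IllegalArgumentException("y out of range");
        }
        this.y = y;
        this.xOdd = xOdd;
    }

    // 32 little-endian bytes: 255 bits of y, then the x bit on top.
    static Ed25519EncodedPoint decode(byte[] enc) {
        if (enc.length != 32) {
            throw new IllegalArgumentException("expected 32 bytes");
        }
        boolean xOdd = (enc[31] & 0x80) != 0;
        byte[] bigEndian = new byte[32];
        for (int i = 0; i < 32; i++) {
            bigEndian[i] = enc[31 - i];
        }
        bigEndian[0] &= 0x7F;   // strip the x bit before reading y
        return new Ed25519EncodedPoint(new BigInteger(1, bigEndian), xOdd);
    }

    byte[] encode() {
        byte[] bigEndian = y.toByteArray();   // at most 32 bytes
        byte[] enc = new byte[32];
        for (int i = 0; i < bigEndian.length && i < 32; i++) {
            enc[i] = bigEndian[bigEndian.length - 1 - i];
        }
        if (xOdd) {
            enc[31] |= 0x80;
        }
        return enc;
    }
}
```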

RFC 8032 private keys: These are definitely bit strings, and modeling 
them as integers doesn't make much sense. The only thing that is ever 
done with these private keys is that they are used as input to a hash 
function.
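
To illustrate, this is roughly how an Ed25519 private key is consumed 
according to RFC 8032: the 32-byte string is hashed with SHA-512, the 
lower half of the digest is clamped to give the secret scalar, and the 
upper half is kept as a prefix for signing. The class and method names 
are hypothetical, and the sketch assumes the runtime has a provider 
for "SHA-512".

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class Ed25519PrivateKeyExpansion {
    // Returns { clamped scalar bytes (32), prefix bytes (32) }.
    static byte[][] expand(byte[] privateKey) {
        if (privateKey.length != 32) {
            throw new IllegalArgumentException("expected 32 bytes");
        }
        MessageDigest sha512;
        try {
            sha512 = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available", e);
        }
        // The private key is only ever used as hash input.
        byte[] h = sha512.digest(privateKey);

        byte[] scalar = Arrays.copyOfRange(h, 0, 32);
        scalar[0] &= 0xF8;    // clear the low 3 bits
        scalar[31] &= 0x7F;   // clear the top bit
        scalar[31] |= 0x40;   // set the second-highest bit

        byte[] prefix = Arrays.copyOfRange(h, 32, 64);
        return new byte[][] { scalar, prefix };
    }
}
```

Nothing in this path interprets the private key itself as a number, 
only the hash output, which is why a byte array seems like the natural 
type for it.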

[1] https://cr.yp.to/ecdh.html