[External] : Re: PEM API github repo

Thu Jan 25 21:02:16 UTC 2024

On 1/25/24 9:20 AM, Daniel Jeliński wrote:
> Hi Tony,
> Thanks for the links! The API looks very promising.
> 
> Out of curiosity, why aren't you using the Base64 MIME
> encoder/decoder? They are supposed to produce/remove the newline
> characters.

I can look it over again.  I had inconsistencies during testing with 
expected data and returned data.

> The relationship between the byte[] and String data should be
> specified. Base64 explicitly specifies that the String APIs are
> translated to the byte[] APIs with ISO 8859-1 encoding. The PEMDecoder
> is currently using the default charset, which might produce
> interesting results if the charset is set to EBCDIC. The encoder is
> using UTF-8, which is reasonable, but given that the produced output
> will always be ASCII, ISO 8859-1 will perform better.

That's fine

> There's a disparity between the decoder and the encoder APIs; both
> work with strings, but Decoder accepts InputStream and not arrays, and
> Encoder produces byte arrays, but does not work with streams. This
> should be more uniform. I like Decoder's InputStream support (that's
> currently the only way to read multiple CA certificates from a single
> file - the String overload currently only returns the first one), so
> I'd add OutputStream support to Encoder for parity.

I don't see parity as a requirement.  I see encoding and decoding as 
having unique requirements.  Decoding from a file or InputStream makes 
sense.  A decode(byte[]) method didn't seem necessary as I felt it 
unlikely a user would have a single byte[] with PEM data.  They were 
more likely to have a String or an InputStream.  The developer can wrap 
it with ByteArrayInputStream or String(byte[], Charset), which is what a 
public API method would do internally.

Encoding is a single operation on an object.  Pass in a SecurityObject 
and PEM data is returned.  Returning a byte[] is the most flexible 
without creating more methods.  In the case where meta data is 
in-between the PEM or some other data formatting.

Why should the API have an OutputStream method, something like:
   pem.encodeToOutputStream(so, outputstream);
when the API as written today, the developer can use:
   outputstream.writeBytes(pem.encode(obj));

I don't like to add API methods just for symmetry, they need to have a 
compelling reason.  I have not seen that in the OutputStream case.

> Karl's earlier
> suggestion to support Stream<SecurityObject> also makes a lot of
> sense.
I haven't counted out Stream, I just haven't seen a compelling reason. 
My tests use stream() from an array list of PEM strings.  But I haven't 
seen the situation where the API needs stream support that couldn't be 
done in a different way.  This is a preview JEP, and we still have time. 
  If there is a compelling example, point it out to me.

> I'm not a big fan of the non-static factory methods
> withEncryption/withDecryption/withFactory. The problem with non-static
> methods that return an instance of the same type is that you need to
> check the documentation to know if the method returns a new instance
> or if it mutates the current one. Can we use static factory methods
> instead? Either that, or create a builder class.

The API states the classes are immutable which was a big requirement and 
it why it's stated all over the docs.  A builder class was considered 
early in the API development, but it was too much complication for a few 
optional cases where the developer may need to specify details like 
encryption or a factory.  The API has the builder idea, without the 
extra builder construction methods.  I don't see how a static factory 
method fit here.

> I don't like the PEMEncoder.withEncryption API. It's not predictable
> enough; when encoding data, it's not consistent between writing
> unencrypted data (certificate, crl), throwing (PublicKey,
> EncryptedPrivateKeyInfo) and writing encrypted data (unencrypted
> private keys). The alternative of forcing the users to encrypt using
> EncryptedPrivateKeyInfo looks better to me.

That was a design decision to make the API easier to use.  The 
non-security export does not need to be burden with understanding 
EncryptedPrivateKeyInfo.

The API can be consistent if you choose to only pass in 
EncryptedPrivateKeyInfo and not set withEncryption().  If an 
ArrayList<SecurityObject> encodes with a stream(), it is nice to 
configure encryption for private keys and still encode public keys and 
certificates with the same encoder.

> I'd love to see support for the OpenSSL private key formats; it seems
> that RSAPrivateCrtKeyImpl already supports PKCS#1 format, so it may be
> just a matter of exposing that functionality. Other key types like EC
> might need more work. That might be added later after the API is
> finalized.

OpenSSL 3.0 is transitioning away from their format to PKCS#8.  It is an 
obsoleted format.  While I have thought about decoding support of RSA 
PKCS#1 for compatibility, I have no intention to support OpenSSL or 
PKCS#1 encoding with this PEM API.

If someone is interested in old OpenSSL or other encoding formats, that 
is why the Encoder and Decoder interfaces are included.  To give a 
structure for developing other encoding.

> 
> Thanks,
> Daniel
> 
> śr., 24 sty 2024 o 22:24 Anthony Scarpino
> <anthony.scarpino at oracle.com> napisał(a):
>>
>> Hi,
>>
>> The following github link is to the PEM API as it is written in the
>> draft JEP (https://openjdk.org/jeps/8300911).  There has been a few
>> changes since the original posting.
>>
>> https://urldefense.com/v3/__https://github.com/ascarpino/jdk/tree/pem__;!!ACWV5N9M2RV99hQ!NmZu22NrC2hxWJuqLHZ6l1C0KYVK0Qf_rV7tV-1uLqUb_5sFMJXyCCKVPjmEmGCeQ6US2RJquDm9AJqZXO46ju8Q$
>>
>> The Encoder and PEMEncoder to now return byte[] for the encode() method.
>>    A new PEMEncoder method, encodeToString(), was added.  I believe these
>> make it easier for outputting data to a file and InputStreams, while
>> still supporting a method that returns a String.
>>
>> For decode, InputStream has replaced Reader.  There were comments
>> preferring InputStream and I found that Reader's buffering quirks were
>> problematic. Decoding from a byte[] is easy through an ByteArrayInputStream.
>>
>> If you are interested in testing out the API, please download and
>> compile the repo.  Then let me know how your experience went.
>>
>> thanks
>>
>> Tony