GREASE'd ALPN values - an RFC 8701 / RFC 7301 / JEP 244 discussion

Wed Nov 4 02:08:34 UTC 2020

On 10/8/2020 9:20 AM, Alexander Scheel wrote:
> Hi all,
> 
> I saw that ALPN support from JEP 244 was backported to JDK8 and I've
> recently had the time to take a closer look at it. For context, I'm
> one of the maintainers of JSS, a NSS wrapper for Java. I've been
> discussing this with another contributor, Fraser (cc'd).

Hi, thanks for looking it over, and especially thanks for reporting 
this.  I've filed:

     https://bugs.openjdk.java.net/browse/JDK-8254631

to track.

> One of the concerns we have with the implementation (and its exposure
> in the corresponding SSLEngine/SSLSocket/SSLParameters interface) is
> that protocols are passed in as Strings. However, RFC 7301 says in
> section 6:
> 
>>     o  Identification Sequence: The precise set of octet values that
>>        identifies the protocol.  This could be the UTF-8 encoding
>>        [RFC3629] of the protocol name.

This "could be" is probably what the original designer of the ALPN API 
went with for API ease-of-use, and it made sense at the time as 
everything in the IANA TLS Extensions list was in the ASCII range 
(0x00-0x7F).  But the GREASE values that fall in the 0x80-0xFF range 
invalidated that assumption.

> When applied with GREASE'd values from RFC 8701, Strings don't work
> well. In particular, most of the registered values [0] are non-UTF-8,

0x0A-0x7A does work, but 0x8A-0xFA won't as you pointed out.

> which can't be easily round-tripped in Java. This means that while
> precise octet values are specified by IANA, they cannot be properly
> specified in Java.
> 
> In particular:
> 
>      byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
>      String encoded = new String(desired, StandardCharsets.UTF_8);
>      byte[] wire    = encoded.getBytes(StandardCharsets.UTF_8);
>      String round   = new String(wire, StandardCharsets.UTF_8);

Right.  These 2 bytes are each mapped by the decoder to the Unicode 
Replacement Character (\ufffd), so re-encoding as UTF-8 yields 6 bytes:

     0xef, 0xbf, 0xbd,     0xef, 0xbf, 0xbd

     https://www.fileformat.info/info/unicode/char/fffd/index.htm
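
To make the byte counts above concrete, here is a minimal sketch of the
same round trip.  The class and method names are made up for illustration
and are not part of SunJSSE; it only assumes the standard String/Charset
behavior of replacing malformed input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any JDK API.
class Utf8GreaseRoundTrip {

    // Decode the bytes as UTF-8 and re-encode them, as a String-based
    // API that uses UTF-8 internally would end up doing.
    static byte[] roundTripUtf8(byte[] desired) {
        String decoded = new String(desired, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as lowercase hex separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] desired = {(byte) 0xFA, (byte) 0xFA};
        byte[] wire = roundTripUtf8(desired);
        System.out.println("desired: " + hex(desired));
        System.out.println("wire:    " + hex(wire));
        System.out.println("equal:   " + Arrays.equals(desired, wire));
    }
}
```

The two input bytes are not valid UTF-8, so each is replaced on decode
and the original values cannot be recovered from the String.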

> fails, as does choosing US_ASCII for the encoding:
> 
>      byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
>      String encoded = new String(desired, StandardCharsets.US_ASCII);
>      byte[] wire    = encoded.getBytes(StandardCharsets.UTF_8);
>      String round   = new String(wire, StandardCharsets.UTF_8);

Yes, US_ASCII only uses the first 7 bits, so both bytes are also replaced 
on decode.  Encoded back out as US_ASCII, they become 2 question marks 
("?"):

     0x3f    0x3f

> Note that we (at the application level) can't control the final (wire
> / round-tripped) encoding to UTF_8 as this is done within the SunJSSE
> implementation:

Correct.

> and perhaps other files I'm missing.
> 
> This decreases interoperability with other TLS implementations.
> OpenSSL [1], NSS [2], and GnuTLS [3] support setting opaque blobs as
> the ALPN protocol list, meaning the caller is free to supply GREASE'd
> values. Go on the other hand still uses its string [4], but that
> string class supports round-tripping non-UTF8 values correctly [5].
> 
> Additionally, it means that GREASE'd values sent by Java applications
> aren't compliant with the RFC 8701/IANA wire values.
> 
> Is there some workaround I'm missing?

Nothing is coming to mind.

> I believe that setting US_ASCII internally in SunJSSE isn't sufficient
> to ensure the right wire encoding gets used. I'm thinking the only
> real fix is to deprecate the String methods and provide byte[] methods
> for all identifiers.

There is one other option that doesn't introduce a new API but does have 
some compatibility risk, and that is to use the ISO_8859_1/LATIN-1 
charset instead of UTF-8.  This would require folks who use UTF-8 to 
update their code, but I haven't yet found any code in the wild which 
actually uses anything U+0080 and above.  I'm proposing a Security (or 
System?) property which would revert the behavior if it becomes a problem.
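
As a rough illustration of why ISO_8859_1 is attractive here, the sketch
below builds the 16 two-byte GREASE ALPN values (0A 0A through FA FA)
and counts how many survive a bytes -> String -> bytes round trip under
each charset.  The class and method names are hypothetical, and this only
models the String conversion, not the actual SunJSSE code path.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any JDK API.
class GreaseAlpnCharsetCheck {

    // The 16 two-byte GREASE ALPN identifiers: 0A 0A, 1A 1A, ... FA FA.
    static byte[][] greaseValues() {
        byte[][] values = new byte[16][];
        for (int i = 0; i < 16; i++) {
            byte b = (byte) ((i << 4) | 0x0A);
            values[i] = new byte[]{b, b};
        }
        return values;
    }

    // Count the values that come back unchanged after being decoded to a
    // String and encoded again with the same charset.
    static int survivors(Charset cs) {
        int count = 0;
        for (byte[] value : greaseValues()) {
            byte[] back = new String(value, cs).getBytes(cs);
            if (Arrays.equals(value, back)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 survivors:      "
                + survivors(StandardCharsets.UTF_8));
        System.out.println("ISO_8859_1 survivors: "
                + survivors(StandardCharsets.ISO_8859_1));
    }
}
```

Under these assumptions only the values at or below 0x7A make it through
UTF-8, while ISO_8859_1 maps every byte to exactly one char and back.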

See the attached file, which is a proposal+code example which will 
eventually be turned into a formal CSR barring any significant issue.

I talked to our CSR lead; he felt that in this case, interoperability 
probably trumps compatibility for character values that likely aren't 
being used anyway, and for behavior that was underspecified.

Brad
-------------- next part --------------
/*
 * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

import java.nio.charset.StandardCharsets;

/*
 * (This text will likely form the basis of a future CSR.)
 * 
 * https://bugs.openjdk.java.net/browse/JDK-8254631
 * 
 * ALPN (RFC7301) values are sent in TLS extensions using byte arrays, but the
 * Java ALPN APIs selected Strings for ease of use. Internally, these Java
 * Strings are converted to byte arrays using UTF-8 as suggested as a possible
 * encoding in Section 6 of RFC 7301. This encoding convention was never
 * specified by the RFC or Java documentation/APIs.
 *
 * It is currently not possible for SunJSSE to output ALPN characters in the
 * range (U+0080-U+00FF) as single bytes; the UTF-8 encoder/decoder instead
 * converts them to a multi-byte representation.
 *
 * The GREASE mechanism (RFC 8701) was subsequently developed to help prevent
 * extensibility failures in the TLS ecosystem. Unfortunately, half of the
 * defined GREASE values fall into the (U+0080-U+00FF) range, and thus can't be
 * represented by SunJSSE (client or server side).
 *
 * A new API could be defined to use byte arrays, but this would not be
 * helpful for earlier Java releases (8/11/15) without a Maintenance
 * Release (MR).  e.g. 
 * 
 *     https://jcp.org/aboutJava/communityprocess/mrel/jsr337/index3.html
 * 
 * The proposed workaround/fix is to have the Java JSSE implementation encode
 * Strings directly as ISO_8859_1/LATIN-1 which correctly outputs
 * (U+0000-U+00FF), but other Unicode values U+0100-U+10FFFF will need to be
 * converted by applications to multiple consecutive bytes before sending
 * (e.g. UTF-8) rather than depending on SunJSSE to automatically provide
 * the (possibly incorrect) encoding.
 *
 * We don't anticipate this to be a significant interoperability issue, since
 * all known/current values in the IETF/IANA TLS ALPN extension list can be
 * encoded as ISO_8859_1/LATIN-1:
 *
 * https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids
 *
 * This change will actually enhance interoperability with other
 * implementations which use byte arrays.
 *
 * The only compatibility issue is if characters larger than U+007F are used.
 * We don't know of any applications currently using such ALPN values, but
 * there could be.  These values must be converted to the format required by
 * their peer.
 *
 * For compatibility issues, we introduce the following Java Security Property
 * to reverse this change:
 *
 *     #
 *     # The default Character set for converting ALPN values between byte
 *     # arrays and Strings. Older versions of JDK used UTF-8.
 *     #
 *     # jdk.tls.alpnCharacterEncoder=UTF-8
 *     jdk.tls.alpnCharacterEncoder=ISO_8859_1
 *
 * which can be overridden to restore the previous conversion process.
 */
public class ALPNStringToBytesExample {

    /*
     * Any Unicode/Supplemental Unicode Values that need to be passed as UTF-8
     * must be first converted (see below):
     *
     *     'MEETEI MAYEK LETTER HUK'
     *     'MEETEI MAYEK LETTER UN'
     *     'MEETEI MAYEK LETTER I'
     *
     *     'DESERET CAPITAL LETTER LONG I'
     *     'DESERET CAPITAL LETTER LONG E'
     */
    private static final String HUKUNI = "\uabcd\uabce\uabcf";
    private static final String IE
            = new String(new int[]{0x10400, 0x10401}, 0, 2);

    // ALPN String array that will eventually be passed to SSLEngine/SSLSocket.
    private static final String[] ALPN_STRINGS = new String[]{

        // From the IETF/IANA TLS ALPN extension list.

        // ASCII/ISO_8859_1/LATIN-1 Strings
        "http/1.1",    // 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31
        "h2",          // 0x68 0x32
        "imap",        // 0x69 0x6d 0x61 0x70
        "sunrpc",      // 0x73 0x75 0x6e 0x72 0x70 0x63
                       // etc.

        // GREASE (RFC 8701)
        toISO_8859_1((byte) 0x0A, (byte) 0x0A),
        toISO_8859_1((byte) 0x1A, (byte) 0x1A),
        toISO_8859_1((byte) 0x2A, (byte) 0x2A),
        toISO_8859_1((byte) 0x3A, (byte) 0x3A),
        toISO_8859_1((byte) 0x4A, (byte) 0x4A),
        toISO_8859_1((byte) 0x5A, (byte) 0x5A),
        toISO_8859_1((byte) 0x6A, (byte) 0x6A),
        toISO_8859_1((byte) 0x7A, (byte) 0x7A),
        toISO_8859_1((byte) 0x8A, (byte) 0x8A),
        toISO_8859_1((byte) 0x9A, (byte) 0x9A),
        toISO_8859_1((byte) 0xAA, (byte) 0xAA),
        toISO_8859_1((byte) 0xBA, (byte) 0xBA),
        toISO_8859_1((byte) 0xCA, (byte) 0xCA),
        toISO_8859_1((byte) 0xDA, (byte) 0xDA),
        toISO_8859_1((byte) 0xEA, (byte) 0xEA),
        toISO_8859_1((byte) 0xFA, (byte) 0xFA),

        // Additional Regular and Supplemental Unicode Points (above)
        toISO_8859_1(HUKUNI.getBytes(StandardCharsets.UTF_8)),
        toISO_8859_1(IE.getBytes(StandardCharsets.UTF_8))
    };

    public static void main(String[] args) throws Exception {

        /*
         * Create SSLEngine and set ALPN parameters.
         * 
         *     SSLContext sslContext = SSLContext.getDefault();
         *     SSLEngine sslEngine = sslContext.createSSLEngine("peer", 80);
         *     SSLParameters sslParameters = sslEngine.getSSLParameters();
         *     sslParameters.setApplicationProtocols(ALPN_STRINGS);
         *     sslEngine.setSSLParameters(sslParameters);
         *     sslEngine.beginHandshake(); sslEngine.wrap()/unwrap();
         *     // etc.
         */

        /*
         * Local SunJSSE will now encode the String array as ISO_8859_1
         * byte array as expected by RFC 8701.
         */
        byte[][] outgoingBytes = new byte[ALPN_STRINGS.length][];
        for (int i = 0; i < ALPN_STRINGS.length; i++) {
            outgoingBytes[i]
                    = ALPN_STRINGS[i].getBytes(StandardCharsets.ISO_8859_1);
        }

        /*
         * Peer SunJSSE receives byte array and parses back into ISO_8859_1
         * String array.
         */
        String[] incomingStrings = new String[outgoingBytes.length];
        for (int i = 0; i < incomingStrings.length; i++) {
            incomingStrings[i]
                    = new String(outgoingBytes[i], StandardCharsets.ISO_8859_1);
        }

        // Check the ASCII/LATIN chars.
        for (int i = 0; i < incomingStrings.length - 2; i++) {
            checkStrings(i, incomingStrings[i], ALPN_STRINGS[i]);
        }

        // Last 2 Strings need to be decoded back as UTF-8.
        checkStrings(incomingStrings.length - 2,
                toUTF_8String(incomingStrings[incomingStrings.length - 2]),
                HUKUNI);
        checkStrings(incomingStrings.length - 1,
                toUTF_8String(incomingStrings[incomingStrings.length - 1]),
                IE);
    }

    // Shorten method calls above.
    private static String toISO_8859_1(byte... bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Shorten method calls above.
    private static String toUTF_8String(String incomingString) {
        return new String(incomingString.getBytes(
                StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    private static void checkStrings(int i, String incoming, String alpn) {
        System.out.println(i + ": \"" + incoming + "\" = \""
                + alpn + "\"");

        if (!incoming.equals(alpn)) {
            System.out.println("ISO_8859_1 didn't convert cleanly");
        }
    }
}
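
The Security property described in the comment above could be consulted
roughly as follows.  This is only a sketch: the property name is the one
proposed in this message and may change before any CSR is finalized, the
class and method names are hypothetical, and falling back to ISO_8859_1
on an unset or unrecognized value is an assumption, not decided behavior.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.Security;

// Hypothetical helper, not part of SunJSSE.
class AlpnCharsetProperty {

    // Property name as proposed in this message; subject to change.
    static final String PROPERTY = "jdk.tls.alpnCharacterEncoder";

    // Map a configured charset name to a Charset, defaulting to
    // ISO_8859_1 when the value is missing, blank, or unrecognized.
    static Charset resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(configured.trim());
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.ISO_8859_1;
        }
    }

    // Read the Security property and resolve it to a Charset.
    static Charset alpnCharset() {
        return resolve(Security.getProperty(PROPERTY));
    }

    public static void main(String[] args) {
        System.out.println("ALPN charset: " + alpnCharset());
    }
}
```

An application that needs the previous behavior would set the property
to UTF-8; everything else would pick up the ISO_8859_1 default.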