GREASE'd ALPN values - a RFC 8701 / RFC 7301 / JEP 244 discussion
Bradford Wetmore
bradford.wetmore at oracle.com
Wed Nov 4 02:08:34 UTC 2020
On 10/8/2020 9:20 AM, Alexander Scheel wrote:
> Hi all,
>
> I saw that ALPN support from JEP 244 was backported to JDK8 and I've
> recently had the time to take a closer look at it. For context, I'm
> one of the maintainers of JSS, a NSS wrapper for Java. I've been
> discussing this with another contributor, Fraser (cc'd).
Hi, thanks for looking it over, and especially thanks for reporting
this. I've filed:
https://bugs.openjdk.java.net/browse/JDK-8254631
to track.
> One of the concerns we have with the implementation (and its exposure
> in the corresponding SSLEngine/SSLSocket/SSLParameters interface) is
> that protocols are passed in as Strings. However, RFC 7301 says in
> section 6:
>
>> o Identification Sequence: The precise set of octet values that
>> identifies the protocol. This could be the UTF-8 encoding
>> [RFC3629] of the protocol name.
This "could be" is probably what the original designer of the ALPN API
went with for API ease-of-use, and it made sense at the time as
everything in the IANA TLS Extensions list was in the ASCII range
(0x00-0x7F). But the GREASE values (0x80-0xFF) invalidated that assumption.
> When applied with GREASE'd values from RFC 8701, Strings don't work
> well. In particular, most of the registered values [0] are non-UTF-8,
0x0A-0x7A does work, but 0x8A-0xFA won't as you pointed out.
> which can't be easily round-tripped in Java. This means that while
> precise octet values are specified by IANA, they cannot be properly
> specified in Java.
>
> In particular:
>
> byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
> String encoded = new String(desired, StandardCharsets.UTF_8);
> byte[] wire = encoded.getBytes(StandardCharsets.UTF_8);
> String round = new String(wire, StandardCharsets.UTF_8);
Right. These 2 values are mapped by the decoder to 2 Replacement
Characters ("?" - \ufffd), which are then encoded as 6 bytes:
0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd
https://www.fileformat.info/info/unicode/char/fffd/index.htm
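For anyone who wants to see this on their own JDK, here is a minimal sketch of the UTF-8 round trip described above. The class and method names are made up for illustration; only `String` and `StandardCharsets` from the JDK are used.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTripSketch {
    // Formats bytes as space-separated lowercase hex, e.g. "fa fa".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8, then encodes the resulting String back
    // to UTF-8, which is the String-based path discussed in this thread.
    static byte[] roundTrip(byte[] in) {
        String decoded = new String(in, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] desired = {(byte) 0xFA, (byte) 0xFA};
        System.out.println("desired: " + hex(desired));
        System.out.println("wire:    " + hex(roundTrip(desired)));
    }
}
```

The "wire" line should show the 6 bytes listed above rather than the 2 bytes that were requested.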
> fails, as does choosing US_ASCII for the encoding:
>
> byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
> String encoded = new String(desired, StandardCharsets.US_ASCII);
> byte[] wire = encoded.getBytes(StandardCharsets.UTF_8);
> String round = new String(wire, StandardCharsets.UTF_8);
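Yes, US_ASCII only uses the first 7 bits, so both bytes also decode to
replacement characters. Re-encoded as UTF-8 they give the same 6 bytes
as above; re-encoded as US_ASCII they become 2 "?":
0x3f 0x3f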
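A small sketch of the US_ASCII case, in case it helps separate the decode step from the encode step. The class name is invented for illustration. As far as I can tell, `new String(bytes, US_ASCII)` substitutes U+FFFD for each byte above 0x7F, and the 0x3f ("?") bytes only appear if that String is later encoded with US_ASCII; encoding it with UTF-8 gives the multi-byte form instead.

```java
import java.nio.charset.StandardCharsets;

class AsciiDecodeSketch {
    // Decodes raw bytes as US_ASCII; bytes >= 0x80 are not valid ASCII
    // and are replaced by the decoder.
    static String decodeAscii(byte[] in) {
        return new String(in, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String encoded = decodeAscii(new byte[]{(byte) 0xFA, (byte) 0xFA});
        for (char c : encoded.toCharArray()) {
            System.out.printf("decoded char: U+%04X%n", (int) c);
        }
        // Compare the two possible re-encodings of the decoded String.
        System.out.println("US_ASCII length: "
                + encoded.getBytes(StandardCharsets.US_ASCII).length);
        System.out.println("UTF_8 length:    "
                + encoded.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Either way, the original 0xFA 0xFA octets are gone once the bytes have passed through the String.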
> Note that we (at the application level) can't control the final (wire
> / round-tripped) encoding to UTF_8 as this is done within the SunJSSE
> implementation:
Correct.
> and perhaps other files I'm missing.
>
> This decreases interoperability with other TLS implementations.
> OpenSSL [1], NSS [2], and GnuTLS [3] support setting opaque blobs as
> the ALPN protocol list, meaning the caller is free to supply GREASE'd
> values. Go on the other hand still uses its string [4], but that
> string class supports round-tripping non-UTF8 values correctly [5].
>
> Additionally, it means that GREASE'd values sent by Java applications
> aren't compliant with the RFC 8701/IANA wire values.
>
> Is there some workaround I'm missing?
Nothing is coming to mind.
> I believe that setting US_ASCII internally in SunJSSE isn't sufficient
> to ensure the right wire encoding gets used. I'm thinking the only
> real fix is to deprecate the String methods and provide byte[] methods
> for all identifiers.
There is one other option that doesn't introduce a new API but does have
some compatibility risk, and that is to use the ISO_8859_1/LATIN-1
charset instead of UTF-8. This would require folks who use UTF-8 to
update their code, but I haven't yet found any code in the wild which
actually uses anything U+0080 and above. I'm proposing a Security (or
System?) property which would revert the behavior if it becomes a problem.
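To make the ISO_8859_1 idea concrete, here is a rough sketch comparing how many of the 16 GREASE ALPN values from RFC 8701 (0x0A 0x0A through 0xFA 0xFA) survive a bytes-to-String-to-bytes round trip under each charset. The class and method names are hypothetical and this is not the SunJSSE code, just the same conversions done by hand.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTripSketch {
    // True if decoding and then re-encoding with the same charset
    // reproduces the original bytes exactly.
    static boolean survives(byte[] in, Charset cs) {
        return Arrays.equals(in, new String(in, cs).getBytes(cs));
    }

    // Counts how many of the 16 GREASE ALPN values survive the round trip.
    static int survivingGreaseValues(Charset cs) {
        int count = 0;
        for (int hi = 0x0; hi <= 0xF; hi++) {
            byte b = (byte) ((hi << 4) | 0x0A); // 0x0A, 0x1A, ..., 0xFA
            if (survives(new byte[]{b, b}, cs)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("UTF_8:      "
                + survivingGreaseValues(StandardCharsets.UTF_8) + " of 16");
        System.out.println("ISO_8859_1: "
                + survivingGreaseValues(StandardCharsets.ISO_8859_1) + " of 16");
    }
}
```

ISO_8859_1 maps every byte value 0x00-0xFF to the code point with the same number, which is why it can carry all 16 values, while UTF-8 only carries the 8 that fall in the ASCII range.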
See the attached file, which is a proposal+code example which will
eventually be turned into a formal CSR barring any significant issue.
I talked to our CSR lead, and he felt that in this case interoperability
probably trumps compatibility, given character values that likely aren't
being used anyway and behavior that was underspecified.
Brad
-------------- next part --------------
/*
* Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/
import java.nio.charset.StandardCharsets;
/*
* (This text will likely form the basis of a future CSR.)
*
* https://bugs.openjdk.java.net/browse/JDK-8254631
*
* ALPN (RFC7301) values are sent in TLS extensions using byte arrays, but the
* Java ALPN APIs selected Strings for ease of use. Internally, these Java
* Strings are converted to byte arrays using UTF-8 as suggested as a possible
* encoding in Section 6 of RFC 7301. This encoding convention was never
* specified by the RFC or Java documentation/APIs.
*
* It is currently not possible for SunJSSE to output ALPN characters in the
* range (U+0080-U+00FF) as single bytes; they are instead converted to a
* multi-byte representation by the UTF-8 encoder/decoder.
*
* The GREASE mechanism (RFC 8701) was subsequently developed to help prevent
* extensibility failures in the TLS ecosystem. Unfortunately, 1/2 of the
* defined GREASE values fall into the (U+0080-U+00FF) range, and thus can't be
* represented by SunJSSE (client or server side).
*
* A new API could be defined to use byte arrays, but this would not be
* helpful for earlier Java releases (8/11/15) without a Maintenance
* Release (MR), e.g.:
*
* https://jcp.org/aboutJava/communityprocess/mrel/jsr337/index3.html
*
* The proposed workaround/fix is to have the Java JSSE implementation encode
* Strings directly as ISO_8859_1/LATIN-1 which correctly outputs
* (U+0000-U+00FF), but other UNICODE values U+0100-U+10FFFF will need to be
* converted by applications to multiple consecutive bytes before sending
* (e.g. UTF-8) rather than depending on SunJSSE to automatically provide
* the (possibly incorrect) encoding.
*
* We don't anticipate this to be a significant interoperability issue, since
* all known/current values in the IETF/IANA TLS ALPN extension list can be
* encoded as ISO_8859_1/LATIN-1:
*
* https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids
*
* This change will actually enhance interoperability with other
* implementations which use byte arrays.
*
* The only compatibility issue is if characters larger than U+007F are used.
* We don't know of any applications currently using such ALPN values, but
* there could be. These values must be converted to the format required by
* their peer.
*
* For compatibility issues, we introduce the following Java Security Property
* to reverse this change:
*
* #
* # The default Character set for converting ALPN values between byte
* # arrays and Strings. Older versions of JDK used UTF-8.
* #
* # jdk.tls.alpnCharacterEncoder=UTF-8
* jdk.tls.alpnCharacterEncoder=ISO_8859_1
*
* which can be overridden to restore the previous conversion process.
*/
public class ALPNStringToBytesExample {
/*
* Any Unicode/Supplemental Unicode Values that need to be passed as UTF-8
* must be first converted (see below):
*
* 'MEETEI MAYEK LETTER HUK'
* 'MEETEI MAYEK LETTER UN'
* 'MEETEI MAYEK LETTER I'
*
* 'DESERET CAPITAL LETTER LONG I'
* 'DESERET CAPITAL LETTER LONG E'
*/
private static final String HUKUNI = "\uabcd\uabce\uabcf";
private static final String IE
= new String(new int[]{0x10400, 0x10401}, 0, 2);
// ALPN String array that will eventually be passed to SSLEngine/SSLSocket.
private static final String[] ALPN_STRINGS = new String[]{
// From the IETF/IANA TLS ALPN extension list.
// ASCII/ISO_8859_1/LATIN-1 Strings
"http/1.1", // 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31
"h2", // 0x68 0x32
"imap", // 0x69 0x6d 0x61 0x70
"sunrpc", // 0x73 0x75 0x6e 0x72 0x70 0x63
// etc.
// GREASE (RFC 8701)
toISO_8859_1((byte) 0x0A, (byte) 0x0A),
toISO_8859_1((byte) 0x1A, (byte) 0x1A),
toISO_8859_1((byte) 0x2A, (byte) 0x2A),
toISO_8859_1((byte) 0x3A, (byte) 0x3A),
toISO_8859_1((byte) 0x4A, (byte) 0x4A),
toISO_8859_1((byte) 0x5A, (byte) 0x5A),
toISO_8859_1((byte) 0x6A, (byte) 0x6A),
toISO_8859_1((byte) 0x7A, (byte) 0x7A),
toISO_8859_1((byte) 0x8A, (byte) 0x8A),
toISO_8859_1((byte) 0x9A, (byte) 0x9A),
toISO_8859_1((byte) 0xAA, (byte) 0xAA),
toISO_8859_1((byte) 0xBA, (byte) 0xBA),
toISO_8859_1((byte) 0xCA, (byte) 0xCA),
toISO_8859_1((byte) 0xDA, (byte) 0xDA),
toISO_8859_1((byte) 0xEA, (byte) 0xEA),
toISO_8859_1((byte) 0xFA, (byte) 0xFA),
// Additional Regular and Supplemental Unicode Points (above)
toISO_8859_1(HUKUNI.getBytes(StandardCharsets.UTF_8)),
toISO_8859_1(IE.getBytes(StandardCharsets.UTF_8))
};
public static void main(String[] args) throws Exception {
/*
* Create SSLEngine and set ALPN parameters.
*
* SSLContext sslContext = SSLContext.getDefault();
* SSLEngine sslEngine = sslContext.createSSLEngine("peer", 80);
* SSLParameters sslParameters = sslEngine.getSSLParameters();
* sslParameters.setApplicationProtocols(ALPN_STRINGS);
* sslEngine.setSSLParameters(sslParameters);
* sslEngine.beginHandshake(); sslEngine.wrap()/unwrap();
* // etc.
*/
/*
* Local SunJSSE will now encode the String array as ISO_8859_1
* byte arrays, as expected by RFC 8701.
*/
byte[][] outgoingBytes = new byte[ALPN_STRINGS.length][0];
for (int i = 0; i < ALPN_STRINGS.length; i++) {
outgoingBytes[i]
= ALPN_STRINGS[i].getBytes(StandardCharsets.ISO_8859_1);
}
/*
* Peer SunJSSE receives byte array and parses back into ISO_8859_1
* String array.
*/
String[] incomingStrings = new String[outgoingBytes.length];
for (int i = 0; i < incomingStrings.length; i++) {
incomingStrings[i]
= new String(outgoingBytes[i], StandardCharsets.ISO_8859_1);
}
// Check the ASCII/LATIN chars.
for (int i = 0; i < incomingStrings.length - 2; i++) {
checkStrings(i, incomingStrings[i], ALPN_STRINGS[i]);
}
// Last 2 Strings need to be decoded back as UTF-8.
checkStrings(incomingStrings.length - 2,
toUTF_8String(incomingStrings[incomingStrings.length - 2]),
HUKUNI);
checkStrings(incomingStrings.length - 1,
toUTF_8String(incomingStrings[incomingStrings.length - 1]),
IE);
}
// Shorten method calls above.
private static String toISO_8859_1(byte... bytes) {
return new String(bytes, StandardCharsets.ISO_8859_1);
}
// Shorten method calls above.
private static String toUTF_8String(String incomingString) {
return new String(incomingString.getBytes(
StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
}
private static void checkStrings(int i, String incoming, String alpn) {
System.out.println(i + ": \"" + incoming + "\" = \""
+ alpn + "\"");
if (!incoming.equals(alpn)) {
System.out.println("ISO_8859_1 didn't convert cleanly");
}
}
}
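The proposal text in the attachment mentions a Security property for switching back to UTF-8. As a rough sketch of how such a property might be consulted, assuming the draft name `jdk.tls.alpnCharacterEncoder` from the proposal (it may well change before any CSR is final), something like the following could work. The class, the `resolve` helper and the fallback behavior are all hypothetical and are not taken from SunJSSE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.Security;

class AlpnCharsetLookupSketch {
    // Draft property name from the proposal text; an assumption, not a
    // finalized JDK property.
    static final String PROP = "jdk.tls.alpnCharacterEncoder";

    // Resolves a configured charset name, falling back to ISO_8859_1 when
    // the value is unset, empty, or not a charset this JDK supports.
    static Charset resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(configured.trim());
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.ISO_8859_1;
        }
    }

    public static void main(String[] args) {
        Charset cs = resolve(Security.getProperty(PROP));
        // U+00FA U+00FA is the String form of the GREASE value 0xFA 0xFA.
        byte[] wire = "\u00fa\u00fa".getBytes(cs);
        System.out.println(cs + " -> " + wire.length + " bytes");
    }
}
```

With the property unset this encodes the GREASE value as 2 bytes; setting it to UTF-8 in the java.security file would bring back the older multi-byte output.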