RFR: 8311216: DataURI can lose information in some charset environments
Michael Strauß
mstrauss at openjdk.org
Sat Jul 1 22:30:06 UTC 2023
DataURI uses the following implementation to decode the percent-encoded payload of a "data" URI:
    ...
    String data = uri.substring(dataSeparator + 1);
    Charset charset = Charset.defaultCharset();
    ...
    URLDecoder.decode(data.replace("+", "%2B"), charset).getBytes(charset)
This approach only works if the charset passed to `URLDecoder.decode` and `String.getBytes` can convert between `byte[]` and `String` representations without losing information. That is not guaranteed: with US-ASCII, for example, any decoded byte outside the 7-bit range is lost.
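The loss can be reproduced outside of JavaFX with a small stand-alone program. This is only an illustrative sketch: the class `RoundTripLoss`, the helper `decodeViaString`, and the sample payload are made up for the demonstration, and the charset is passed explicitly instead of relying on `Charset.defaultCharset()`.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the PR.
final class RoundTripLoss {

    // Mirrors the String round trip quoted above, with an explicit charset.
    static byte[] decodeViaString(String data, Charset charset) {
        return URLDecoder.decode(data.replace("+", "%2B"), charset).getBytes(charset);
    }

    public static void main(String[] args) {
        String payload = "%FF%00a";

        // US-ASCII cannot represent 0xFF: it becomes U+FFFD on decoding
        // and '?' (63) when converted back to bytes.
        System.out.println(Arrays.toString(
                decodeViaString(payload, StandardCharsets.US_ASCII)));   // [63, 0, 97]

        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        System.out.println(Arrays.toString(
                decodeViaString(payload, StandardCharsets.ISO_8859_1))); // [-1, 0, 97]
    }
}
```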
This PR solves the problem by dropping `URLDecoder` and decoding the percent-encoded escape sequences directly, as specified by RFC 3986, page 11.
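For readers who don't want to open the diff, a charset-independent decoder could look roughly like the following. This is a minimal sketch of the general technique and not the code from the PR: the class name `PercentDecoder`, its methods, and the choice to reject non-ASCII characters and malformed escapes with an `IllegalArgumentException` are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not the implementation in the PR.
final class PercentDecoder {

    // Decodes "%XX" escapes straight into bytes, never going through a Charset.
    static byte[] decode(String data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length());

        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);

            if (c == '%') {
                // An escape needs two hex digits after the '%'.
                if (i + 2 >= data.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int hi = hexValue(data.charAt(i + 1));
                int lo = hexValue(data.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else if (c <= 0x7F) {
                // Unescaped ASCII (including '+') is copied through unchanged.
                out.write(c);
            } else {
                // Assumption of this sketch: non-ASCII must be percent-encoded.
                throw new IllegalArgumentException("Non-ASCII character at index " + i);
            }
        }
        return out.toByteArray();
    }

    // Returns 0..15 for an ASCII hex digit, or -1 for anything else.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Because the escapes are written directly as bytes, the result does not depend on `file.encoding`, and the `replace("+", "%2B")` workaround is no longer needed since `+` is never treated as a space.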
**Note to reviewers**: the failing test can only be observed when the JVM uses a default charset that can't represent the payload, which can be enforced by specifying the `-Dfile.encoding=US-ASCII` VM option.
-------------
Commit messages:
- Don't use URLDecoder in DataURI
- Failing test
Changes: https://git.openjdk.org/jfx/pull/1165/files
Webrev: https://webrevs.openjdk.org/?repo=jfx&pr=1165&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8311216
Stats: 105 lines in 2 files changed: 93 ins; 3 del; 9 mod
Patch: https://git.openjdk.org/jfx/pull/1165.diff
Fetch: git fetch https://git.openjdk.org/jfx.git pull/1165/head:pull/1165
PR: https://git.openjdk.org/jfx/pull/1165