RFR: 8311216: DataURI can lose information in some charset environments [v2]

Michael Strauß mstrauss at openjdk.org
Sat Jul 8 23:38:05 UTC 2023


> DataURI uses the following implementation to decode the percent-encoded payload of a "data" URI:
> 
> 
> ...
> String data = uri.substring(dataSeparator + 1);
> Charset charset = Charset.defaultCharset();
> ...
> URLDecoder.decode(data.replace("+", "%2B"), charset).getBytes(charset)
> 
> 
> This approach works only if the charset passed to `URLDecoder.decode` and `String.getBytes` can convert between `String` and `byte[]` representations without losing information. That is not guaranteed: in a US-ASCII environment, for example, payload bytes above 0x7F cannot survive the round trip.
> 
> This PR solves the problem by not using `URLDecoder`, but instead simply decoding percent-encoded escape sequences as specified by RFC 3986, page 11.
> 
> **Note to reviewers**: the failing test can only be observed when the JVM uses a default charset that can't represent the payload, which can be enforced by specifying the `-Dfile.encoding=US-ASCII` VM option.
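
For readers following along, here is a minimal sketch of the two approaches described in the quoted text. It is not the code from the PR: the class `PercentDecodeSketch` and both method names are hypothetical, and treating unescaped non-ASCII characters as UTF-8 is an assumption of this example only. The first method mirrors the `URLDecoder` round trip with the charset passed in explicitly; the second writes each `%XX` escape straight into a byte buffer, so no charset conversion touches the payload.

```java
import java.io.ByteArrayOutputStream;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PercentDecodeSketch {

    // Mirrors the quoted approach, with the charset made an explicit parameter
    // so the effect can be reproduced without -Dfile.encoding.
    static byte[] decodeViaUrlDecoder(String data, Charset charset) {
        return URLDecoder.decode(data.replace("+", "%2B"), charset).getBytes(charset);
    }

    // Hypothetical helper: decodes "%XX" escapes directly into bytes.
    static byte[] decodePercent(String data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length());
        int i = 0;
        while (i < data.length()) {
            if (data.charAt(i) == '%' && i + 2 < data.length()) {
                int hi = Character.digit(data.charAt(i + 1), 16);
                int lo = Character.digit(data.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo); // one escape sequence = one raw byte
                    i += 3;
                    continue;
                }
            }
            // Not a valid escape: copy the character through.
            // Assumption of this sketch: non-ASCII characters are written as UTF-8.
            int cp = data.codePointAt(i);
            byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            out.write(raw, 0, raw.length);
            i += Character.charCount(cp);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String payload = "%FF%D8a+b";
        byte[] viaDecoder = decodeViaUrlDecoder(payload, StandardCharsets.US_ASCII);
        byte[] direct = decodePercent(payload);
        System.out.printf("URLDecoder (US-ASCII): first byte 0x%02X%n", viaDecoder[0] & 0xFF);
        System.out.printf("direct decoding:       first byte 0x%02X%n", direct[0] & 0xFF);
    }
}
```

With US-ASCII, the first method cannot return 0xFF for `%FF`, because that byte has no US-ASCII character to pass through on the way to a `String` and back. The direct decoder never builds an intermediate `String` from the escaped bytes, so its result does not depend on the default charset.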

Michael Strauß has updated the pull request incrementally with one additional commit since the last revision:

  added more tests

-------------

Changes:
  - all: https://git.openjdk.org/jfx/pull/1165/files
  - new: https://git.openjdk.org/jfx/pull/1165/files/9065941c..e13dfe43

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jfx&pr=1165&range=01
 - incr: https://webrevs.openjdk.org/?repo=jfx&pr=1165&range=00-01

  Stats: 13 lines in 1 file changed: 13 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jfx/pull/1165.diff
  Fetch: git fetch https://git.openjdk.org/jfx.git pull/1165/head:pull/1165

PR: https://git.openjdk.org/jfx/pull/1165

