RFR: JDK-8320570 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters [v5]

Jim Laskey jlaskey at openjdk.org
Wed Dec 6 20:14:06 UTC 2023


> A regression was found in Java 9+ when creating a String instance from UTF-8 bytes, a side effect of string compaction (JEP 254, https://openjdk.org/jeps/254), which changed the decoding logic. Specifically, when constructing a string from bytes:
> 
> ``` 
> String str = new String(largeBytes, StandardCharsets.UTF_8); 
> ``` 
> 
> if the size of largeBytes is greater than 2^30 bytes (more than 1 GB) but smaller than INT_MAX (2 GB), it fails on Java 9+ (including 11, 17, and 21, though the stack trace differs slightly between versions, see below), regardless of JVM heap size. On Java 8, it succeeded when the JVM heap size was set high enough.
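
The 2^30 threshold in the description matches what one would expect if the decoder sizes a two-bytes-per-char buffer using int arithmetic. The following is only a sketch of that arithmetic, not the JDK's actual code: the class and method names are hypothetical, and the assumption that the failing path doubles the input length in an int is inferred from the bug title rather than taken from the patch.

```java
// Illustrative only: hypothetical names, not the JDK implementation.
class Utf16SizeOverflow {

    // Assumed shape of the problematic computation: bytes needed for a
    // UTF-16 buffer holding `length` chars, computed in 32-bit arithmetic.
    static int utf16ByteCountInt(int length) {
        return length << 1; // wraps to a negative value once length >= 2^30
    }

    // Same computation widened to long, which makes the overflow visible
    // so a caller could reject the request before allocating.
    static long utf16ByteCountLong(int length) {
        return (long) length << 1;
    }

    public static void main(String[] args) {
        int justOver = (1 << 30) + 1;  // a length just above 2^30
        int justUnder = (1 << 30) - 1; // a length just below 2^30

        System.out.println(utf16ByteCountInt(justUnder)); // 2147483646
        System.out.println(utf16ByteCountInt(justOver));  // -2147483646
        System.out.println(utf16ByteCountLong(justOver)); // 2147483650
    }
}
```

Passing a negative int such as the second value to `new byte[...]` throws NegativeArraySizeException no matter how much heap is available, which is consistent with the report that heap size does not matter.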

Jim Laskey has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:

 - Merge remote-tracking branch 'upstream/master' into 8320570
 - Alternate 64 bit test
 - Exclude 32 bit
 - Requested changes
 - Bump up memory
 - Correct NegativeSize.java
 - Initial commit

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16974/files
  - new: https://git.openjdk.org/jdk/pull/16974/files/9926adda..8ae170dd

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16974&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16974&range=03-04

  Stats: 39190 lines in 426 files changed: 14144 ins; 23593 del; 1453 mod
  Patch: https://git.openjdk.org/jdk/pull/16974.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16974/head:pull/16974

PR: https://git.openjdk.org/jdk/pull/16974

