RFR: JDK-8320570 NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters [v2]

Roger Riggs rriggs at openjdk.org
Tue Dec 5 20:48:37 UTC 2023


On Tue, 5 Dec 2023 20:13:06 GMT, Jim Laskey <jlaskey at openjdk.org> wrote:

>> A regression was found in Java 9+ when creating a String instance from UTF-8 bytes, a side effect of string compaction (https://openjdk.org/jeps/254), which changed the decoding logic. Specifically, when constructing a string from bytes:
>> 
>> ``` 
>> String str = new String(largeBytes, StandardCharsets.UTF_8); 
>> ``` 
>> 
>> if the size of largeBytes is greater than 2^30 (>1 GB) but smaller than INT_MAX (2 GB), it fails on Java 9+ (including 11, 17, and 21, though the stack trace is slightly different, see below), regardless of JVM heap size. On Java 8, it succeeded when the JVM heap size was set large enough.
>
> Jim Laskey has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Bump up memory
>  - Correct NegativeSize.java
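
For context on the 2^30 threshold in the quoted description: a string stored as UTF16 needs two bytes per char, so the byte count no longer fits in an int once the char count reaches 2^30. The sketch below only illustrates that arithmetic. The class and method names are hypothetical, and it is an assumption that the decoder computed the size this way; it is not the JDK source.

```java
public class Utf16SizeOverflow {
    // Hypothetical helper: bytes needed for `chars` UTF16 chars, in int arithmetic.
    // The shift silently wraps around once chars >= 2^30.
    static int utf16BytesInt(int chars) {
        return chars << 1;
    }

    // The same computation widened to long, which cannot wrap for int inputs.
    static long utf16BytesLong(int chars) {
        return (long) chars << 1;
    }

    public static void main(String[] args) {
        int chars = 1_200_000_000; // between 2^30 and Integer.MAX_VALUE

        // A negative value here is what would turn `new byte[size]`
        // into a NegativeArraySizeException.
        System.out.println("int size:  " + utf16BytesInt(chars));
        System.out.println("long size: " + utf16BytesLong(chars));
    }
}
```

An explicit range check before the allocation can turn the wrapped size into a descriptive error instead of a NegativeArraySizeException.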

test/jdk/java/lang/String/CompactString/NegativeSize.java line 30:

> 28:  * @test
> 29:  * @bug 8077559
> 30:  * @summary Tests Compact String for negative size.

It might be useful to require the larger memory, so that the test is not run when insufficient memory is available.

 * @requires os.maxMemory >= 4G

test/jdk/java/lang/String/CompactString/NegativeSize.java line 63:

> 61:             System.out.println(inStr.substring(1_200_000_000));
> 62:         } catch (OutOfMemoryError ex) {
> 63:             System.out.println("Succeeded with OutOfMemoryError");

It might be good to check that it is the expected OOME, whose message starts with `UTF16 String size is `,
not just any "Java heap memory" OOME.
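
A minimal sketch of such a check, assuming only the message prefix quoted above. The class and method names are hypothetical, and the synthetic errors in `main` stand in for the real ones so the sketch runs without a large heap:

```java
public class ExpectedOomeCheck {
    // Prefix taken from the review comment; the rest of the message is not assumed.
    static final String EXPECTED_PREFIX = "UTF16 String size is ";

    // True only for an OOME carrying the expected prefix. A plain heap-exhaustion
    // OOME has a different message, and the message may also be null.
    static boolean isExpectedOome(OutOfMemoryError ex) {
        String msg = ex.getMessage();
        return msg != null && msg.startsWith(EXPECTED_PREFIX);
    }

    public static void main(String[] args) {
        // Synthetic stand-ins; the number after the prefix is illustrative only.
        OutOfMemoryError sizeCheck = new OutOfMemoryError(EXPECTED_PREFIX + "1200000000");
        OutOfMemoryError heap = new OutOfMemoryError("Java heap space");

        System.out.println(isExpectedOome(sizeCheck)); // true
        System.out.println(isExpectedOome(heap));      // false
    }
}
```

In the test's catch block, the same predicate could decide between printing the success message and rethrowing the error, so that an unrelated OOME fails the test instead of passing it.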

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16974#discussion_r1416257145
PR Review Comment: https://git.openjdk.org/jdk/pull/16974#discussion_r1416256370


More information about the core-libs-dev mailing list