RFR: 8366421: ModifiedUtf.utfLen may overflow for giant string [v2]

Guanqiang Han ghan at openjdk.org
Fri Sep 19 01:17:51 UTC 2025


On Thu, 18 Sep 2025 15:27:43 GMT, Chen Liang <liach at openjdk.org> wrote:

>> Guanqiang Han has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - add regression test
>>  - Merge remote-tracking branch 'upstream/master' into 8366421
>>  - Change return type of utfLen to long to prevent overflow
>
> test/jdk/jdk/internal/util/TestUtfLen.java line 50:
> 
>> 48:         for (int i = 0; i < iterations; i++) {
>> 49:             total += ModifiedUtf.utfLen(chunk, 0);
>> 50:         }
> 
> Suggestion:
> 
>         long total = ModifiedUtf.utfLen(chunk.repeat(iterations), 0);

String.repeat() cannot generate a string whose total length exceeds Integer.MAX_VALUE due to internal limits, and more generally a String cannot hold more than Integer.MAX_VALUE chars. That’s why I used a small chunk and accumulated the UTF-8 length in a loop.
https://github.com/openjdk/jdk/blob/e3a4c28409ac62feee9efe069e3a3482e7e2cdd2/src/java.base/share/classes/java/lang/String.java#L4875
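
To illustrate the accumulation approach described above, here is a minimal, self-contained sketch. It does not call the internal jdk.internal.util.ModifiedUtf class (which is not exported to ordinary code); instead, `utfLen` below is a hypothetical stand-in that counts modified UTF-8 bytes and returns a long. The class name, chunk size, and iteration count are assumptions chosen only so the total exceeds Integer.MAX_VALUE. To keep the sketch fast, the per-chunk length is computed once and then added repeatedly, whereas the test in the PR calls utfLen on every iteration.

```java
class UtfLenSketch {
    // Hypothetical stand-in for ModifiedUtf.utfLen: counts the bytes the
    // modified UTF-8 encoding would need, using a long so it cannot wrap.
    static long utfLen(String s) {
        long len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                len += 1;          // ASCII except NUL
            } else if (c <= 0x07FF) {
                len += 2;          // NUL and two-byte range
            } else {
                len += 3;          // everything else, including surrogates
            }
        }
        return len;
    }

    // Sums the encoded length of `chunk` repeated `iterations` times
    // without ever materializing the repeated string.
    static long accumulate(String chunk, int iterations) {
        long perChunk = utfLen(chunk);
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += perChunk;
        }
        return total;
    }

    public static void main(String[] args) {
        String chunk = "\u4e00".repeat(1 << 20);   // 1 Mi chars, 3 bytes each
        long total = accumulate(chunk, 700);
        System.out.println("long total = " + total);
        System.out.println("as int     = " + (int) total); // wraps negative
        System.out.println("exceeds Integer.MAX_VALUE: "
                + (total > Integer.MAX_VALUE));
    }
}
```

With these assumed sizes the long total is 3 * 1048576 * 700 = 2202009600, which does not fit in an int, so an int-typed accumulator or return value would wrap to a negative number.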

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27285#discussion_r2361471514

