RFR: 8366421: ModifiedUtf.utfLen may overflow for giant string

Roger Riggs rriggs at openjdk.org
Wed Sep 17 13:34:53 UTC 2025


On Mon, 15 Sep 2025 07:32:13 GMT, Guanqiang Han <ghan at openjdk.org> wrote:

> Please review this patch.
> 
> **Description:**
> 
> Currently, ModifiedUtf.utfLen returns a signed int. For very large strings, this may overflow and produce negative values, leading to incorrect behavior in code that relies on the UTF length. This patch changes the return type to long, which fully resolves the issue and allows safe handling of giant strings.
> 
> **Test:**
> 
> GHA

Can you add a test of the maximum-length UTF-8 encoded string?
That would be a string of Integer.MAX_VALUE/2 characters, each in the two-byte range (> 0xff and <= 0x7ff).
The test will likely have to write it to a file and read it back; ByteArrayIn/OutStream wouldn't be big enough.
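
For reference, the boundary arithmetic behind this request can be sketched as follows. This is a minimal illustration, not the JDK's ModifiedUtf code: the class name `UtfLenSketch` and the helpers `bytesFor` and `utfLen` are hypothetical, and the per-character sizes follow the modified UTF-8 rules documented for `DataInput`. It avoids allocating a giant string by multiplying a character count by the per-character size.

```java
public class UtfLenSketch {
    /** Modified UTF-8 byte count for one char (hypothetical helper, not the JDK method). */
    static int bytesFor(char c) {
        if (c >= 0x0001 && c <= 0x007F) return 1;
        if (c <= 0x07FF) return 2;   // also covers U+0000, which modified UTF-8 encodes in 2 bytes
        return 3;
    }

    /** Sums the per-char sizes in a long, so the total cannot wrap for any String. */
    static long utfLen(String s) {
        long len = 0;
        for (int i = 0; i < s.length(); i++) {
            len += bytesFor(s.charAt(i));
        }
        return len;
    }

    public static void main(String[] args) {
        int perChar = bytesFor('\u0100');        // 2 bytes
        int maxFit = Integer.MAX_VALUE / 2;      // largest count whose length still fits in an int
        int oneMore = maxFit + 1;                // 2^30 characters

        System.out.println(maxFit * perChar);            // 2147483646
        System.out.println(oneMore * perChar);           // -2147483648 (int overflow)
        System.out.println((long) oneMore * perChar);    // 2147483648
    }
}
```

With Integer.MAX_VALUE/2 two-byte characters the encoded length is still representable as an int; one more character pushes an int accumulator negative, while a long accumulator stays correct.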

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27285#issuecomment-3303034163

