RFR: 8325002: Exceptions::fthrow needs to ensure it truncates to a valid utf8 string

David Holmes dholmes at openjdk.org
Sat Jul 27 12:18:34 UTC 2024


On Sat, 27 Jul 2024 08:11:53 GMT, Daniel Jeliński <djelinski at openjdk.org> wrote:

>> Exceptions::fthrow uses a 1024-byte buffer to format the incoming exception message string, but this may not be large enough, leading to truncation. When that happens, we should ensure we truncate to a valid UTF-8 sequence.
>> 
>> The process is explained in the code. Thanks to @RogerRiggs and @djelinski for their suggestions on how to tackle this.
>> 
>> Testing:
>>  - new gtest exercises the truncation code with the different possibilities for bad truncation
>>  - tiers 1-3 sanity testing
>> 
>> Thanks.
>
> src/hotspot/share/utilities/utf8.cpp line 398:
> 
>> 396: // byte sequence.
>> 397: static bool is_starting_byte(unsigned char b) {
>> 398:   return b >= 0xC0 && b <= 0xEF;;
> 
> Do you plan to use this method only for modified UTF-8 or for standard UTF-8 as well? Standard UTF-8 also uses F0-F7 as starting bytes.
> 
> Also, remove the double semicolon.

AFAIK the VM only deals with modified UTF-8.
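
For context, modified UTF-8 encodes supplementary characters as a surrogate pair of two 3-byte sequences, so the 4-byte lead bytes F0-F7 never appear and a starting byte is always in the range C0-EF. Below is a minimal sketch of truncating to a valid boundary under that assumption. It is not the code in the PR: valid_prefix_length, is_continuation_byte and sequence_length are hypothetical names used only for illustration, the input is assumed to have been valid before it was cut, and the sketch does not try to keep the two halves of a surrogate pair together.

```cpp
#include <cstddef>

// Illustrative sketch only; helper names other than is_starting_byte are
// hypothetical and not taken from HotSpot.

// Lead byte of a 2- or 3-byte modified UTF-8 sequence.
static bool is_starting_byte(unsigned char b) {
  return b >= 0xC0 && b <= 0xEF;
}

// Continuation bytes have the bit pattern 10xxxxxx.
static bool is_continuation_byte(unsigned char b) {
  return (b & 0xC0) == 0x80;
}

// Total length of the sequence introduced by a starting byte.
static size_t sequence_length(unsigned char lead) {
  return lead >= 0xE0 ? 3 : 2;
}

// Given a buffer that was cut off after len bytes, return the largest
// length <= len that does not end in a partial multi-byte sequence.
// Assumes the bytes formed valid modified UTF-8 before the cut.
static size_t valid_prefix_length(const unsigned char* buf, size_t len) {
  size_t i = len;
  size_t trailing = 0;
  // Step back over at most two trailing continuation bytes.
  while (i > 0 && trailing < 2 && is_continuation_byte(buf[i - 1])) {
    i--;
    trailing++;
  }
  // Buffer is empty or ends in a single-byte character: nothing to trim.
  if (i == 0 || !is_starting_byte(buf[i - 1])) {
    return len;
  }
  // Keep the final sequence only if all of its bytes are present.
  size_t have = trailing + 1;
  return have == sequence_length(buf[i - 1]) ? len : i - 1;
}
```

With this sketch, a buffer ending in E2 82 (a 3-byte sequence missing its last byte) would be cut back to just before the E2, while a buffer ending in the complete sequence E2 82 AC would be left as is.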

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20345#discussion_r1693948995

