RFR: 8338257: UTF8 lengths should be size_t not int [v5]

David Holmes dholmes at openjdk.org
Tue Aug 27 07:12:06 UTC 2024


On Tue, 27 Aug 2024 01:09:09 GMT, Dean Long <dlong at openjdk.org> wrote:

>> David Holmes has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   more missing casts
>
> src/hotspot/share/classfile/javaClasses.cpp line 307:
> 
>> 305:   {
>> 306:     ResourceMark rm;
>> 307:     size_t utf8_len = static_cast<size_t>(length);
> 
> I think there should be an assert that length is not negative, probably at the very beginning of this function.

Why? As I explained to Coleen this is the length obtained from a Java array. All of the existing code relies on length being >= 0 and doesn't assert that anywhere.

> src/hotspot/share/classfile/javaClasses.cpp line 350:
> 
>> 348:   // This check is too strict when the input string is not a valid UTF8.
>> 349:   // For example, it may be created with arbitrary content via jni_NewStringUTF.
>> 350:   if (UTF8::is_legal_utf8((const unsigned char*)utf8_str, strlen(utf8_str), false)) {
> 
> Most of the time we use `is_legal_utf8`, we have a char* and have to cast it to unsigned char*.  How about adding an inlined overload for is_legal_utf8(const char*, size_t) for conenience?

Sorry out of scope for this change.

> src/hotspot/share/classfile/javaClasses.cpp line 555:
> 
>> 553:   bool      is_latin1 = java_lang_String::is_latin1(java_string);
>> 554: 
>> 555:   if (length == 0) return nullptr;
> 
> Should this be checking for length <= 0?  It looks like length can indeed be negative if UTF8::unicode_length() tries to return the length of a utf8 string with length > 0x7fffffff.

??? length is assigned on line 552 and again comes from the length of a Java array.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1732264231
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1732264978
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1732267249


More information about the serviceability-dev mailing list