RFR: 8338257: UTF8 lengths should be size_t not int [v2]
    David Holmes 
    dholmes at openjdk.org
       
    Mon Aug 19 23:19:04 UTC 2024
    
    
  
On Thu, 15 Aug 2024 20:43:51 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:
> Did you try to compile this with -Wconversion (not -Wsign-conversion) without -Werror?
I hadn't but will do so ...
> src/hotspot/share/classfile/javaClasses.cpp line 586:
> 
>> 584:     ResourceMark rm;
>> 585:     jbyte* position = (length == 0) ? nullptr : value->byte_at_addr(0);
>> 586:     size_t utf8_len = length;
> 
> These are sign conversions.  This is too big to change here and not sure what the fan out would be but java_lang_String::length() should return unsigned and also arrayOop.hpp length.  They're never negative.  We should probably assert that (below).
Sorry I don't follow what you are asking for here. A Java String length is an int even though it can't be negative in length - just as Java array indices are int even though they can't be negative either.
Anyway this conversion should have a cast, though none of the usual complaining compilers complained about it.
> src/hotspot/share/classfile/javaClasses.cpp line 639:
> 
>> 637:   if (length == 0) {
>> 638:     return 0;
>> 639:   }
> 
> Maybe assert length > 0 here?
Why "> 0" ?
> src/hotspot/share/classfile/javaClasses.cpp line 702:
> 
>> 700:   } else {
>> 701:     jbyte* position = (length == 0) ? nullptr : value->byte_at_addr(0);
>> 702:     return UNICODE::as_utf8(position, static_cast<int>(length), buf, buflen);
> 
> Don't you want checked_cast here not static casts.
These lengths are the lengths of Java array so an int >= 0.
> src/hotspot/share/prims/jni.cpp line 2226:
> 
>> 2224:  HOTSPOT_JNI_GETSTRINGUTFLENGTH_ENTRY(env, string);
>> 2225:   oop java_string = JNIHandles::resolve_non_null(string);
>> 2226:   jsize ret = java_lang_String::utf8_length_as_int(java_string);
> 
> So the spec says that this should be jsize (signed int), which is why this is, right?
Yes. Hence the other change to add a new JNI API.
> src/hotspot/share/utilities/utf8.cpp line 512:
> 
>> 510: 
>> 511: template<typename T>
>> 512: int UNICODE::utf8_length_as_int(const T* base, int length) {
> 
> Why was this parameter left as int and not size_t ?
It is the length of a Java array - so int >= 0
> src/hotspot/share/utilities/utf8.hpp line 47:
> 
>> 45: 
>> 46: In the code below if we have latin-1 content then we treat the String's data
>> 47: array as a jbyte[], else a jchar[]. The lengths of these arrays are specified
> 
> jchar is 16-bits right?  So the max length if not latin-1 is INT_MAX/2 ?
The maximum number of jchar characters is INT_MAX/2
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20560#issuecomment-2297676721
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722493414
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722494136
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722494680
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722495327
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722495875
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722496709
    
    
More information about the hotspot-dev
mailing list