RFR: 8338257: UTF8 lengths should be size_t not int [v2]
David Holmes
dholmes at openjdk.org
Mon Aug 19 23:19:04 UTC 2024
On Thu, 15 Aug 2024 20:43:51 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:
> Did you try to compile this with -Wconversion (not -Wsign-conversion) without -Werror?
I hadn't but will do so ...
> src/hotspot/share/classfile/javaClasses.cpp line 586:
>
>> 584: ResourceMark rm;
>> 585: jbyte* position = (length == 0) ? nullptr : value->byte_at_addr(0);
>> 586: size_t utf8_len = length;
>
> These are sign conversions. This is too big to change here and not sure what the fan out would be but java_lang_String::length() should return unsigned and also arrayOop.hpp length. They're never negative. We should probably assert that (below).
Sorry I don't follow what you are asking for here. A Java String length is an int even though it can't be negative in length - just as Java array indices are int even though they can't be negative either.
Anyway this conversion should have a cast, though none of the usual complaining compilers complained about it.
> src/hotspot/share/classfile/javaClasses.cpp line 639:
>
>> 637: if (length == 0) {
>> 638: return 0;
>> 639: }
>
> Maybe assert length > 0 here?
Why "> 0" ?
> src/hotspot/share/classfile/javaClasses.cpp line 702:
>
>> 700: } else {
>> 701: jbyte* position = (length == 0) ? nullptr : value->byte_at_addr(0);
>> 702: return UNICODE::as_utf8(position, static_cast<int>(length), buf, buflen);
>
> Don't you want checked_cast here not static casts.
These lengths are the lengths of Java array so an int >= 0.
> src/hotspot/share/prims/jni.cpp line 2226:
>
>> 2224: HOTSPOT_JNI_GETSTRINGUTFLENGTH_ENTRY(env, string);
>> 2225: oop java_string = JNIHandles::resolve_non_null(string);
>> 2226: jsize ret = java_lang_String::utf8_length_as_int(java_string);
>
> So the spec says that this should be jsize (signed int), which is why this is, right?
Yes. Hence the other change to add a new JNI API.
> src/hotspot/share/utilities/utf8.cpp line 512:
>
>> 510:
>> 511: template<typename T>
>> 512: int UNICODE::utf8_length_as_int(const T* base, int length) {
>
> Why was this parameter left as int and not size_t ?
It is the length of a Java array - so int >= 0
> src/hotspot/share/utilities/utf8.hpp line 47:
>
>> 45:
>> 46: In the code below if we have latin-1 content then we treat the String's data
>> 47: array as a jbyte[], else a jchar[]. The lengths of these arrays are specified
>
> jchar is 16-bits right? So the max length if not latin-1 is INT_MAX/2 ?
The maximum number of jchar characters is INT_MAX/2
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20560#issuecomment-2297676721
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722493414
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722494136
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722494680
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722495327
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722495875
PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1722496709
More information about the serviceability-dev
mailing list