RFR: 8327156: Avoid copying in StringTable::intern(oop, TRAPS)

David Holmes dholmes at openjdk.org
Tue Oct 15 21:04:49 UTC 2024


On Tue, 15 Oct 2024 13:07:15 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> src/hotspot/share/classfile/javaClasses.cpp line 740:
>> 
>>> 738: }
>>> 739: 
>>> 740: bool java_lang_String::equals(oop java_string, const jchar* chars, int num_unicode_points) {
>> 
>> Please undo these changes - these are not "unicode points".
>
> I don't think that you're correct here, David. As far as I understand, the length `len` does not represent the length of the `chars` array but the number of unicode code points in the array. The same is true for the other overloads, such as the UTF8 string. I'd appreciate it if you could explain why what I'm saying is incorrect here.

The `len` parameter is just the length of the array. We use it to iterate through the array, for example:

    for (int i = 0; i < len; i++) {
      if (value->char_at(i) != chars[i]) {
        return false;
      }
    }

But you are right that this is also the number of "code points" as defined by the Unicode standard; my apologies for that. I was mistakenly thinking that code points refer only to actual characters, but it is a general term for any numeric value in the Unicode codespace, including non-characters and, importantly, surrogates. While the name is technically accurate, I don't think `num_unicode_points` adds any value here. Quite the opposite: it obscures that this is simply the length of the array.
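
To illustrate why the name could mislead a reader, here is a minimal standalone sketch, not HotSpot code. Both function names are hypothetical. The first mirrors the loop above, where `len` is the array length and each surrogate is visited as its own element. The second counts the way Java's `String.codePointCount` does, treating a well-formed surrogate pair as one code point, which is what a reader might assume "number of unicode points" means.

```cpp
#include <cstddef>

// Hypothetical stand-in for the comparison loop quoted above: compares two
// UTF-16 buffers element by element. `len` is the array length, so each
// surrogate is visited as a separate element.
bool utf16_equals(const char16_t* a, const char16_t* b, int len) {
  for (int i = 0; i < len; i++) {
    if (a[i] != b[i]) {
      return false;
    }
  }
  return true;
}

// Hypothetical helper: counts code points with a well-formed surrogate pair
// counted once (the convention of Java's String.codePointCount). An unpaired
// surrogate still counts as one.
int count_code_points_pairing_surrogates(const char16_t* chars, int len) {
  int count = 0;
  for (int i = 0; i < len; i++) {
    bool is_high = chars[i] >= 0xD800 && chars[i] <= 0xDBFF;
    bool next_is_low = (i + 1 < len) &&
                       chars[i + 1] >= 0xDC00 && chars[i + 1] <= 0xDFFF;
    if (is_high && next_is_low) {
      i++;  // skip the low surrogate: the pair is one code point
    }
    count++;
  }
  return count;
}
```

For the three-element array `{u'a', 0xD83D, 0xDE00}`, the comparison loop needs `len == 3`, while the pairing count is 2. A parameter named for its role as the array length avoids that ambiguity.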

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21325#discussion_r1802004224

