RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v7]
Alan Bateman
alanb at openjdk.org
Tue Feb 21 14:30:29 UTC 2023
On Tue, 21 Feb 2023 11:14:13 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:
>> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 'the oldest ASCII trick in the book'.
>>
>> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is updated to use `equalsIgnoreCase`.
>>
>> To verify the correctness of `equalsIgnoreCase`, a new test is added to `EqualsIgnoreCase` that exhaustively verifies that, for all 256x256 latin1 code point pairs, the result of `equalsIgnoreCase` is consistent with `Character.toUpperCase` and `Character.toLowerCase`.
>>
>> Performance is tested for matching and mismatching cases of code point pairs picked from the ASCII letter, ASCII number and latin1 letter ranges. Results in the first comment below.
>
> Eirik Bjorsnos has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove whitespace following '('
src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 163:
> 161: return mapChar;
> 162: }
> 163: /**
I assume you should insert a blank line between the two methods.
src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 175:
> 173: }
> 174: // uppercase b1 using 'the oldest ASCII trick in the book'
> 175: int U = b1 & 0xDF;
I'm sure some people reading this comment will wonder which book :-) It might be better to drop that bit and, if possible, find a better name for "U", as variable names normally start with a lowercase letter.
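
For readers following along, here is a minimal sketch of the kind of comparison under discussion. It is not the code from the PR: the class name `Latin1CaseSketch` is hypothetical, the variable is called `upper` only to illustrate the naming suggestion, and the range checks are my own reading of the latin1 layout (A-Z plus 0xC0-0xDE, excluding the multiplication sign 0xD7). Clearing bit 0x20 maps a lowercase latin1 letter onto its uppercase form, so two bytes match when they are identical, or when the masked value of the first is an uppercase letter and equals the masked value of the second.

```java
final class Latin1CaseSketch {
    private Latin1CaseSketch() {}

    // Compares two latin1 bytes for equality, ignoring case.
    // Illustrative sketch only; not the implementation from the PR.
    static boolean equalsIgnoreCase(byte b1, byte b2) {
        if (b1 == b2) {
            return true;
        }
        // Clear bit 0x20: maps a-z onto A-Z and 0xE0-0xFE onto 0xC0-0xDE.
        // The mask also discards the sign extension of the byte.
        int upper = b1 & 0xDF;
        if (upper < 'A') {
            return false; // below the ASCII letters, no case mapping
        }
        // True only for A-Z or 0xC0-0xDE (excluding 0xD7, the multiplication sign).
        boolean isLetter = upper <= 'Z'
                || (upper >= 0xC0 && upper <= 0xDE && upper != 0xD7);
        // b2 matches if it has the same masked (uppercase) value.
        return isLetter && upper == (b2 & 0xDF);
    }
}
```

With this sketch, `equalsIgnoreCase((byte) 'a', (byte) 'A')` is true, while `'['` and `'{'` (0x5B and 0x7B, which differ only in bit 0x20) do not match because 0x5B lies outside the letter ranges.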
-------------
PR: https://git.openjdk.org/jdk/pull/12632