RFR: 8304245: Speed up CharacterData.of by avoiding bit shifting in the latin1 fast-path test [v2]

Wed Mar 15 15:00:25 UTC 2023

On Wed, 15 Mar 2023 14:31:03 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:

>> By avoiding a bit shift operation for the latin1 fast-path test, we can speed up the `java.lang.CharacterData.of` method by ~25% for latin1 code points.
>> 
>> The latin1 test is currently implemented as `ch >>> 8 == 0`.  We can replace this with `ch >= 0 && ch <= 0xFF` for a noticeable performance gain (especially for Latin1 code points).
>> 
>> This method is called frequently by various property-determining methods in `java.lang.Character` like `isLowerCase`, `isDigit` etc, so one should expect improvements for all these methods.
>> 
>> Performance is tested using the `Characters.isDigit` benchmark using the digits '0' (decimal 48, in CharacterDataLatin1) and '\u0660' (decimal 1632, in CharacterData00):
>> 
>> Baseline:
>> 
>> 
>> Benchmark           (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit           48  avgt   15  0.870 ± 0.011  ns/op
>> Characters.isDigit         1632  avgt   15  2.168 ± 0.017  ns/op
>> 
>> PR:
>> 
>> 
>> Benchmark           (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit           48  avgt   15  0.654 ± 0.007  ns/op
>> Characters.isDigit         1632  avgt   15  2.032 ± 0.019  ns/op
>
> Eirik Bjorsnos has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update StringLatin1.canEncode to sync with same test in CharacterData.of

Just for fun, I tried with a benchmark where the code point is Latin1 in every other call:

@Benchmark
public void isDigitVarying(Blackhole blackhole) {
    blackhole.consume(Character.isDigit(48));
    blackhole.consume(Character.isDigit(1632));
}

With this benchmark, there is no measurable difference between the baseline, the PR and using StringLatin1.canEncode; all three scores are within each other's error margins:

Baseline:

Benchmark                  (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitVarying         1632  avgt   15  1.198 ± 0.056  ns/op

PR:

Benchmark                  (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitVarying         1632  avgt   15  1.195 ± 0.058  ns/op

StringLatin1.canEncode:

Benchmark                  (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitVarying         1632  avgt   15  1.193 ± 0.055  ns/op

At this point, I'm starting to wonder whether the performance benefits suggested by this PR are dubious and only surface in very narrow benchmarks. On the other hand, it does not seem harmful either. What do people think?
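
On the "not harmful" side: the two forms of the test accept exactly the same `int` values, so the change can only affect the generated code, not behavior. Below is a minimal sketch of that check. The class and method names are made up for illustration and are not JDK code; only the two expressions are taken from the PR description.

```java
// Hypothetical helper class, not part of the JDK.
class Latin1Check {
    // Shift-based test, as quoted in the PR description (current code).
    static boolean isLatin1Shift(int ch) {
        return ch >>> 8 == 0;
    }

    // Range-based test, as proposed in the PR description.
    static boolean isLatin1Range(int ch) {
        return ch >= 0 && ch <= 0xFF;
    }

    // Counts the values in [from, to] on which the two tests disagree.
    // A long loop variable avoids overflow when 'to' is Integer.MAX_VALUE.
    static int countMismatches(int from, int to) {
        int mismatches = 0;
        for (long i = from; i <= to; i++) {
            int ch = (int) i;
            if (isLatin1Shift(ch) != isLatin1Range(ch)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Covers negative values, all of Latin1 and the BMP beyond it.
        System.out.println("mismatches: " + countMismatches(-70_000, 70_000));
        // Boundary values, including the two code points from the benchmarks.
        int[] samples = {Integer.MIN_VALUE, -1, 0, 48, 0xFF, 0x100, 1632, Integer.MAX_VALUE};
        for (int ch : samples) {
            System.out.println(ch + ": shift=" + isLatin1Shift(ch) + " range=" + isLatin1Range(ch));
        }
    }
}
```

Negative values are rejected by both forms: the unsigned shift keeps the sign bit among the upper bits, so `ch >>> 8` is non-zero, and the range form rejects them explicitly through `ch >= 0`.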

-------------

PR: https://git.openjdk.org/jdk/pull/13040