RFR: 8302877: Speed up latin1 case conversions
Eirik Bjorsnos
duke at openjdk.org
Mon Feb 20 14:47:37 UTC 2023
On Fri, 17 Feb 2023 17:31:09 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:
> This PR suggests we speed up Character.toUpperCase and Character.toLowerCase for latin1 code points by applying the 'oldest ASCII trick in the book'.
>
> This takes advantage of the fact that, for cased letters, latin1 uppercase code points are always 0x20 lower than their lowercase counterparts (with the exception of two code points, U+00B5 and U+00FF, which uppercase out of latin1).
>
> To verify the correctness of the new implementation, the test `Latin1CaseConversion` is added with an exhaustive verification of toUpperCase/toLowerCase for all latin1 code points.
>
> The implementation needs to balance the performance of the various ranges in latin1. An effort has been made to favour operations on ASCII code points, without causing excessive regression for higher code points.
>
> Performance is benchmarked for 7 chosen sample code points, each representing a range or a special case. Results are in the first comment.
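As a rough illustration of the trick described in the quoted text (a minimal sketch, not the code from the PR; the class `Latin1CaseSketch` and its method names are hypothetical), case conversion for a latin1 code point can be written as a range check followed by setting or clearing bit 0x20. The sketch assumes the input is in the range 0x00-0xFF and uses `Character.toUpperCase(int)` / `Character.toLowerCase(int)` as the reference for an exhaustive comparison:

```java
// Hypothetical sketch of the 0x20 case trick; not the implementation from the PR.
final class Latin1CaseSketch {

    /** Lowercases a latin1 code point (0x00-0xFF) by setting bit 0x20. */
    static int toLower(int ch) {
        boolean asciiUpper = ch >= 'A' && ch <= 'Z';
        // 0xC0-0xDE are uppercase letters, except 0xD7 (multiplication sign)
        boolean latin1Upper = ch >= 0xC0 && ch <= 0xDE && ch != 0xD7;
        return (asciiUpper || latin1Upper) ? (ch | 0x20) : ch;
    }

    /** Uppercases a latin1 code point (0x00-0xFF) by clearing bit 0x20. */
    static int toUpper(int ch) {
        // The two code points whose uppercase form lies outside latin1
        if (ch == 0xB5) return 0x39C; // micro sign -> Greek capital letter mu
        if (ch == 0xFF) return 0x178; // y with diaeresis -> capital Y with diaeresis
        boolean asciiLower = ch >= 'a' && ch <= 'z';
        // 0xE0-0xFE are lowercase letters, except 0xF7 (division sign)
        boolean latin1Lower = ch >= 0xE0 && ch <= 0xFE && ch != 0xF7;
        return (asciiLower || latin1Lower) ? (ch & 0xDF) : ch;
    }

    /** Returns the first latin1 code point that disagrees with Character, or -1 if none does. */
    static int firstMismatch() {
        for (int cp = 0; cp <= 0xFF; cp++) {
            if (toUpper(cp) != Character.toUpperCase(cp)
                    || toLower(cp) != Character.toLowerCase(cp)) {
                return cp;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first mismatch: " + firstMismatch());
    }
}
```

This version favours readability over the branch ordering a tuned implementation would use; it makes no attempt to give ASCII input a faster path.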
Benchmark results:
Baseline:
Benchmark                                    (codePoint)  Mode  Cnt  Score   Error  Units
Characters.Latin1CaseConversion.toLowerCase  low          avgt   15  1.267 ± 0.013  ns/op
Characters.Latin1CaseConversion.toLowerCase  A            avgt   15  1.657 ± 0.011  ns/op
Characters.Latin1CaseConversion.toLowerCase  a            avgt   15  1.258 ± 0.005  ns/op
Characters.Latin1CaseConversion.toLowerCase  A-grave      avgt   15  1.656 ± 0.011  ns/op
Characters.Latin1CaseConversion.toLowerCase  a-grave      avgt   15  1.270 ± 0.023  ns/op
Characters.Latin1CaseConversion.toLowerCase  mu           avgt   15  1.261 ± 0.006  ns/op
Characters.Latin1CaseConversion.toLowerCase  yD           avgt   15  1.260 ± 0.005  ns/op
Characters.Latin1CaseConversion.toUpperCase  low          avgt   15  1.284 ± 0.043  ns/op
Characters.Latin1CaseConversion.toUpperCase  A            avgt   15  1.264 ± 0.008  ns/op
Characters.Latin1CaseConversion.toUpperCase  a            avgt   15  1.818 ± 0.016  ns/op
Characters.Latin1CaseConversion.toUpperCase  A-grave      avgt   15  1.261 ± 0.015  ns/op
Characters.Latin1CaseConversion.toUpperCase  a-grave      avgt   15  1.822 ± 0.013  ns/op
Characters.Latin1CaseConversion.toUpperCase  mu           avgt   15  1.823 ± 0.006  ns/op
Characters.Latin1CaseConversion.toUpperCase  yD           avgt   15  1.822 ± 0.008  ns/op
PR:
Benchmark                                    (codePoint)  Mode  Cnt  Score   Error  Units
Characters.Latin1CaseConversion.toLowerCase  low          avgt   15  0.878 ± 0.005  ns/op
Characters.Latin1CaseConversion.toLowerCase  A            avgt   15  1.038 ± 0.009  ns/op
Characters.Latin1CaseConversion.toLowerCase  a            avgt   15  1.036 ± 0.007  ns/op
Characters.Latin1CaseConversion.toLowerCase  A-grave      avgt   15  1.357 ± 0.015  ns/op
Characters.Latin1CaseConversion.toLowerCase  a-grave      avgt   15  1.352 ± 0.003  ns/op
Characters.Latin1CaseConversion.toLowerCase  mu           avgt   15  1.273 ± 0.002  ns/op
Characters.Latin1CaseConversion.toLowerCase  yD           avgt   15  1.352 ± 0.004  ns/op
Characters.Latin1CaseConversion.toUpperCase  low          avgt   15  0.880 ± 0.013  ns/op
Characters.Latin1CaseConversion.toUpperCase  A            avgt   15  0.920 ± 0.071  ns/op
Characters.Latin1CaseConversion.toUpperCase  a            avgt   15  1.055 ± 0.013  ns/op
Characters.Latin1CaseConversion.toUpperCase  A-grave      avgt   15  1.394 ± 0.010  ns/op
Characters.Latin1CaseConversion.toUpperCase  a-grave      avgt   15  1.391 ± 0.009  ns/op
Characters.Latin1CaseConversion.toUpperCase  mu           avgt   15  1.597 ± 0.021  ns/op
Characters.Latin1CaseConversion.toUpperCase  yD           avgt   15  1.354 ± 0.003  ns/op
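
To read the two tables side by side, the relative change per sample can be computed as (PR - baseline) / baseline, so negative values are speedups and positive values are regressions. The following is only a small sketch for doing that arithmetic; the class name `BenchmarkDelta` is hypothetical, the scores are copied from the tables above, and the error margins are ignored:

```java
import java.util.Locale;

// Hypothetical helper for comparing the baseline and PR scores listed above.
final class BenchmarkDelta {
    static final String[] SAMPLES = {"low", "A", "a", "A-grave", "a-grave", "mu", "yD"};

    // Scores in ns/op, copied from the tables in this message
    static final double[] BASE_LOWER = {1.267, 1.657, 1.258, 1.656, 1.270, 1.261, 1.260};
    static final double[] BASE_UPPER = {1.284, 1.264, 1.818, 1.261, 1.822, 1.823, 1.822};
    static final double[] PR_LOWER = {0.878, 1.038, 1.036, 1.357, 1.352, 1.273, 1.352};
    static final double[] PR_UPPER = {0.880, 0.920, 1.055, 1.394, 1.391, 1.597, 1.354};

    /** Relative change of the PR score against the baseline; negative means faster. */
    static double relativeChange(double baseline, double pr) {
        return (pr - baseline) / baseline;
    }

    static void print(String method, double[] base, double[] pr) {
        for (int i = 0; i < SAMPLES.length; i++) {
            System.out.printf(Locale.ROOT, "%s %-8s %+.1f%%%n",
                    method, SAMPLES[i], 100 * relativeChange(base[i], pr[i]));
        }
    }

    public static void main(String[] args) {
        print("toLowerCase", BASE_LOWER, PR_LOWER);
        print("toUpperCase", BASE_UPPER, PR_UPPER);
    }
}
```

With these numbers, the ASCII samples come out negative for both methods, while some non-ASCII samples that were already on the fast side of the baseline (for example toLowerCase of a-grave) come out slightly positive.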
-------------
PR: https://git.openjdk.org/jdk/pull/12623