Optimizing UUID#fromString(String)
Claes Redestad
claes.redestad at oracle.com
Sun Jan 28 23:19:43 UTC 2018
Hi Jon,
On 2018-01-27 17:05, Jon Chambers wrote:
> Regardless, I wanted to call this optimization opportunity to your
> attention, and would be happy to offer a proper patch if this seems like a
> worthwhile change.
this does looks promising - and at least points out there might be
some performance issues lurking here!
Found another compatibility issue, however: The current
UUID.fromString allow use of any unicode digits parseable into
digits. Say, Arabic digits (U+0660 - U+0669):
jshell> UUID.fromString("\u0669-\u0669-\u0669-\u0669-\u0669");
$2 ==> 00000009-0009-0009-0009-000000000009
That you're getting some improvement from avoiding
Character.digit(..) stands out in a way, so I wondered if there's
something going on here... and it does seem that this method could
be optimized by providing a fast-path in the CharacterDataLatin1
plane...
I experimented with a few solutions, and the best I could find was
a 256 element byte[] lookup table for digits in CharacterDataLatin1
(spanning the entire 8-bit space of CharacterDataLatin1 allows us
to avoid a branch):
http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/
This speeds up your micro testing java.util.UUID
(UUIDBenchmark.benchmarkUUIDFromString) by ~1.6x. The
improvement is there on all counters when looking at -prof perfnorm
(less loads, stores, branches, instructions, better CPI etc)[1], so seems
like a pretty straightforward thing to consider independently.
Your FastUUID implementation is still quite a bit faster, though, but
it'd be good to dig around for a bit to see if there are other more
general improvements like this to be had..
Thanks!
/Claes
[1]
Baseline:
Benchmark Mode
Cnt Score Error Units
UUIDBenchmark.benchmarkUUIDFromString avgt 4
0.505 ± 0.066 us/op
UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 1.024 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses
avgt 3.000 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt
211.128 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt
34.894 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses
avgt 0.067 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt
0.103 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt
0.812 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt
0.004 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt
0.054 #/op
UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt
13.269 #/op
UUIDBenchmark.benchmarkUUIDFromString:branches avgt
301.931 #/op
UUIDBenchmark.benchmarkUUIDFromString:cycles avgt
1585.057 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt
0.005 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt
208.788 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt ≈
10⁻⁴ #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt
30.540 #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt
0.002 #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt
0.006 #/op
UUIDBenchmark.benchmarkUUIDFromString:instructions avgt
1547.568 #/op
Results with http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/
-- :
Benchmark Mode
Cnt Score Error Units
UUIDBenchmark.benchmarkUUIDFromString avgt 4
0.310 ± 0.022 us/op
UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 0.832 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses
avgt 2.026 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt
208.227 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt
30.673 #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses
avgt 0.052 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt
0.042 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt
1.027 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt
0.007 #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt
0.044 #/op
UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt
0.040 #/op
UUIDBenchmark.benchmarkUUIDFromString:branches avgt
255.547 #/op
UUIDBenchmark.benchmarkUUIDFromString:cycles avgt
981.466 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt
0.003 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt
213.797 #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt ≈
10⁻⁴ #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt
30.888 #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt
0.002 #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt
0.002 #/op
UUIDBenchmark.benchmarkUUIDFromString:instructions avgt
1180.129 #/op
More information about the core-libs-dev
mailing list