Optimizing UUID#fromString(String)

Claes Redestad claes.redestad at oracle.com
Sun Jan 28 23:19:43 UTC 2018


Hi Jon,

On 2018-01-27 17:05, Jon Chambers wrote:
> Regardless, I wanted to call this optimization opportunity to your
> attention, and would be happy to offer a proper patch if this seems like a
> worthwhile change.

this does looks promising - and at least points out there might be
some performance issues lurking here!

Found another compatibility issue, however: The current
UUID.fromString allow use of any unicode digits parseable into
digits. Say, Arabic digits (U+0660 - U+0669):

jshell> UUID.fromString("\u0669-\u0669-\u0669-\u0669-\u0669");
$2 ==> 00000009-0009-0009-0009-000000000009

That you're getting some improvement from avoiding
Character.digit(..) stands out in a way, so I wondered if there's
something going on here... and it does seem that this method could
be optimized by providing a fast-path in the CharacterDataLatin1
plane...

I experimented with a few solutions, and the best I could find was
a 256 element byte[] lookup table for digits in CharacterDataLatin1
(spanning the entire 8-bit space of CharacterDataLatin1 allows us
to avoid a branch):

http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/

This speeds up your micro testing java.util.UUID
(UUIDBenchmark.benchmarkUUIDFromString) by ~1.6x. The
improvement is there on all counters when looking at -prof perfnorm
(less loads, stores, branches, instructions, better CPI etc)[1], so seems
like a pretty straightforward thing to consider independently.

Your FastUUID implementation is still quite a bit faster, though, but
it'd be good to dig around for a bit to see if there are other more
general improvements like this to be had..

Thanks!

/Claes

[1]
Baseline:
Benchmark                                                    Mode 
Cnt     Score   Error  Units
UUIDBenchmark.benchmarkUUIDFromString                        avgt 4     
0.505 ± 0.066  us/op
UUIDBenchmark.benchmarkUUIDFromString:CPI avgt          1.024           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses 
avgt          3.000           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt        
211.128           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         
34.894           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses 
avgt          0.067           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          
0.103           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt          
0.812           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          
0.004           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt          
0.054           #/op
UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt         
13.269           #/op
UUIDBenchmark.benchmarkUUIDFromString:branches avgt        
301.931           #/op
UUIDBenchmark.benchmarkUUIDFromString:cycles avgt       
1585.057           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          
0.005           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt        
208.788           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 
10⁻⁴           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt         
30.540           #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          
0.002           #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt          
0.006           #/op
UUIDBenchmark.benchmarkUUIDFromString:instructions avgt       
1547.568           #/op

Results with http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/ 
-- :
Benchmark                                                    Mode 
Cnt     Score   Error  Units
UUIDBenchmark.benchmarkUUIDFromString                        avgt 4     
0.310 ± 0.022  us/op
UUIDBenchmark.benchmarkUUIDFromString:CPI avgt          0.832           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses 
avgt          2.026           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt        
208.227           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         
30.673           #/op
UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses 
avgt          0.052           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          
0.042           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt          
1.027           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          
0.007           #/op
UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt          
0.044           #/op
UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt          
0.040           #/op
UUIDBenchmark.benchmarkUUIDFromString:branches avgt        
255.547           #/op
UUIDBenchmark.benchmarkUUIDFromString:cycles avgt        
981.466           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          
0.003           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt        
213.797           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 
10⁻⁴           #/op
UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt         
30.888           #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          
0.002           #/op
UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt          
0.002           #/op
UUIDBenchmark.benchmarkUUIDFromString:instructions avgt       
1180.129           #/op


More information about the core-libs-dev mailing list