Optimizing UUID#fromString(String)

Claes Redestad claes.redestad at oracle.com
Mon Jan 29 11:30:05 UTC 2018


Hi,

I've filed two RFEs:

- https://bugs.openjdk.java.net/browse/JDK-8196331 for providing a
   fast-path for ASCII/Latin1 input to Character.digit. Patch out
   for review
- https://bugs.openjdk.java.net/browse/JDK-8196334 for tracking
   the optimizations specific to UUID proposed here

Jon, if you want to provide a patch I'd be happy to sponsor. I have
a few ideas on how to retrofit existing behavior that we can discuss
offline. Also, if you haven't filed an OCA then this might be a good
time to do so[1]

Thanks!

/Claes

[1] http://openjdk.java.net/contribute/

On 2018-01-29 00:19, Claes Redestad wrote:
> Hi Jon,
>
> On 2018-01-27 17:05, Jon Chambers wrote:
>> Regardless, I wanted to call this optimization opportunity to your
>> attention, and would be happy to offer a proper patch if this seems 
>> like a
>> worthwhile change.
>
> this does looks promising - and at least points out there might be
> some performance issues lurking here!
>
> Found another compatibility issue, however: The current
> UUID.fromString allow use of any unicode digits parseable into
> digits. Say, Arabic digits (U+0660 - U+0669):
>
> jshell> UUID.fromString("\u0669-\u0669-\u0669-\u0669-\u0669");
> $2 ==> 00000009-0009-0009-0009-000000000009
>
> That you're getting some improvement from avoiding
> Character.digit(..) stands out in a way, so I wondered if there's
> something going on here... and it does seem that this method could
> be optimized by providing a fast-path in the CharacterDataLatin1
> plane...
>
> I experimented with a few solutions, and the best I could find was
> a 256 element byte[] lookup table for digits in CharacterDataLatin1
> (spanning the entire 8-bit space of CharacterDataLatin1 allows us
> to avoid a branch):
>
> http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/
>
> This speeds up your micro testing java.util.UUID
> (UUIDBenchmark.benchmarkUUIDFromString) by ~1.6x. The
> improvement is there on all counters when looking at -prof perfnorm
> (less loads, stores, branches, instructions, better CPI etc)[1], so seems
> like a pretty straightforward thing to consider independently.
>
> Your FastUUID implementation is still quite a bit faster, though, but
> it'd be good to dig around for a bit to see if there are other more
> general improvements like this to be had..
>
> Thanks!
>
> /Claes
>
> [1]
> Baseline:
> Benchmark                                                    Mode 
> Cnt     Score   Error  Units
> UUIDBenchmark.benchmarkUUIDFromString                        avgt 
> 4     0.505 ± 0.066  us/op
> UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 1.024           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses 
> avgt          3.000           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt 
> 211.128           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         
> 34.894           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses 
> avgt          0.067           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          
> 0.103           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt 0.812           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          
> 0.004           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt 0.054           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt 
> 13.269           #/op
> UUIDBenchmark.benchmarkUUIDFromString:branches avgt 301.931           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:cycles avgt 1585.057           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          
> 0.005           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt 
> 208.788           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 
> 10⁻⁴           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt 
> 30.540           #/op
> UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          
> 0.002           #/op
> UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt 0.006           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:instructions avgt 
> 1547.568           #/op
>
> Results with 
> http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/ -- :
> Benchmark                                                    Mode 
> Cnt     Score   Error  Units
> UUIDBenchmark.benchmarkUUIDFromString                        avgt 
> 4     0.310 ± 0.022  us/op
> UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 0.832           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses 
> avgt          2.026           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt 
> 208.227           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt         
> 30.673           #/op
> UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses 
> avgt          0.052           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt          
> 0.042           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt 1.027           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt          
> 0.007           #/op
> UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt 0.044           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt 
> 0.040           #/op
> UUIDBenchmark.benchmarkUUIDFromString:branches avgt 255.547           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:cycles avgt 981.466           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt          
> 0.003           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt 
> 213.797           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈ 
> 10⁻⁴           #/op
> UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt 
> 30.888           #/op
> UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt          
> 0.002           #/op
> UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt 0.002           
> #/op
> UUIDBenchmark.benchmarkUUIDFromString:instructions avgt 
> 1180.129           #/op



More information about the core-libs-dev mailing list