Optimizing UUID#fromString(String)

Jon Chambers jon.chambers at gmail.com
Mon Jan 29 13:59:22 UTC 2018


Awesome. Excited to see the Character.digit improvements!

To set expectations from here, I'll need to do two things to move forward:

1. I need to sign the OCA, as you've pointed out. As a courtesy, I'd like
to alert my employer before doing so. This has been a non-issue for similar
agreements for other open-source projects in the past.
2. I'll need to work through the OpenJDK build/test process and (I
suspect?) do some very preliminary reading on Mercurial. Because this is a
personal project, I may not have an opportunity to make much headway there
until this weekend.

That said, I'm grateful for the sponsorship and look forward to working
with you. I'll follow up directly to discuss the details.

Cheers!

-Jon

On Mon, Jan 29, 2018 at 6:30 AM, Claes Redestad <claes.redestad at oracle.com>
wrote:

> Hi,
>
> I've filed two RFEs:
>
> - https://bugs.openjdk.java.net/browse/JDK-8196331 for providing a
>   fast-path for ASCII/Latin1 input to Character.digit. Patch out
>   for review
> - https://bugs.openjdk.java.net/browse/JDK-8196334 for tracking
>   the optimizations specific to UUID proposed here
>
> Jon, if you want to provide a patch I'd be happy to sponsor. I have
> a few ideas on how to retrofit existing behavior that we can discuss
> offline. Also, if you haven't filed an OCA then this might be a good
> time to do so[1]
>
> Thanks!
>
> /Claes
>
> [1] http://openjdk.java.net/contribute/
>
>
> On 2018-01-29 00:19, Claes Redestad wrote:
>
>> Hi Jon,
>>
>> On 2018-01-27 17:05, Jon Chambers wrote:
>>
>>> Regardless, I wanted to call this optimization opportunity to your
>>> attention, and would be happy to offer a proper patch if this seems like
>>> a
>>> worthwhile change.
>>>
>>
>> this does looks promising - and at least points out there might be
>> some performance issues lurking here!
>>
>> Found another compatibility issue, however: The current
>> UUID.fromString allow use of any unicode digits parseable into
>> digits. Say, Arabic digits (U+0660 - U+0669):
>>
>> jshell> UUID.fromString("\u0669-\u0669-\u0669-\u0669-\u0669");
>> $2 ==> 00000009-0009-0009-0009-000000000009
>>
>> That you're getting some improvement from avoiding
>> Character.digit(..) stands out in a way, so I wondered if there's
>> something going on here... and it does seem that this method could
>> be optimized by providing a fast-path in the CharacterDataLatin1
>> plane...
>>
>> I experimented with a few solutions, and the best I could find was
>> a 256 element byte[] lookup table for digits in CharacterDataLatin1
>> (spanning the entire 8-bit space of CharacterDataLatin1 allows us
>> to avoid a branch):
>>
>> http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/
>>
>> This speeds up your micro testing java.util.UUID
>> (UUIDBenchmark.benchmarkUUIDFromString) by ~1.6x. The
>> improvement is there on all counters when looking at -prof perfnorm
>> (less loads, stores, branches, instructions, better CPI etc)[1], so seems
>> like a pretty straightforward thing to consider independently.
>>
>> Your FastUUID implementation is still quite a bit faster, though, but
>> it'd be good to dig around for a bit to see if there are other more
>> general improvements like this to be had..
>>
>> Thanks!
>>
>> /Claes
>>
>> [1]
>> Baseline:
>> Benchmark                                                    Mode
>> Cnt     Score   Error  Units
>> UUIDBenchmark.benchmarkUUIDFromString                        avgt 4
>> 0.505 ± 0.066  us/op
>> UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 1.024           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses
>> avgt          3.000           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt
>> 211.128           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt
>> 34.894           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses
>> avgt          0.067           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt
>> 0.103           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt 0.812           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt
>> 0.004           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt 0.054
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt
>> 13.269           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:branches avgt 301.931
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:cycles avgt 1585.057           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt
>> 0.005           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt 208.788
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈
>> 10⁻⁴           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt 30.540
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt
>> 0.002           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt 0.006
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:instructions avgt
>> 1547.568           #/op
>>
>> Results with http://cr.openjdk.java.net/~redestad/scratch/char.digit.00/
>> -- :
>> Benchmark                                                    Mode
>> Cnt     Score   Error  Units
>> UUIDBenchmark.benchmarkUUIDFromString                        avgt 4
>> 0.310 ± 0.022  us/op
>> UUIDBenchmark.benchmarkUUIDFromString:CPI avgt 0.832           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-load-misses
>> avgt          2.026           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-loads avgt
>> 208.227           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-dcache-stores avgt
>> 30.673           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:L1-icache-load-misses
>> avgt          0.052           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-load-misses avgt
>> 0.042           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-loads avgt 1.027           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-store-misses avgt
>> 0.007           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:LLC-stores avgt 0.044
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:branch-misses avgt 0.040
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:branches avgt 255.547
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:cycles avgt 981.466           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-load-misses avgt
>> 0.003           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-loads avgt 213.797
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-store-misses avgt         ≈
>> 10⁻⁴           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:dTLB-stores avgt 30.888
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:iTLB-load-misses avgt
>> 0.002           #/op
>> UUIDBenchmark.benchmarkUUIDFromString:iTLB-loads avgt 0.002
>> #/op
>> UUIDBenchmark.benchmarkUUIDFromString:instructions avgt
>> 1180.129           #/op
>>
>
>


More information about the core-libs-dev mailing list