RFR [9] 8006627: UUID to/from String performance should be improved by reducing object allocations
Claes Redestad
claes.redestad at oracle.com
Mon Jul 21 21:12:58 UTC 2014
On 2014-07-21 22:05, Peter Levart wrote:
>
> On 07/21/2014 09:21 PM, Claes Redestad wrote:
>> Hi,
>>
>> was asked offline to add a length check at the start (prevents
>> potentially costly scans for huge, invalid input; negligible
>> performance impact for normal cases) and realized dash1 < 0 || dash2
>> < 0 || dash3 < 0 implies dash4 < 0, so the first three checks are
>> unnecessary and can be skipped:
>>
>> http://cr.openjdk.java.net/~redestad/8006627/webrev.5
>
> Hi Claes,
>
> dash1 < 0 || dash2 < 0 || dash3 < 0 does not imply dash4 < 0
>
> Take for example the following input: "0-0-0"
>
> dash1 = 1
> dash2 = 3
> dash3 = -1
> dash4 = 1 (again)
Well, this is embarrassing...
>
> but the following check:
>
> if (dash4 < 0 || name.indexOf('-', dash4 + 1) > 0)
>
> is true in this case. It seems either dash4 < 0 or dash5 > 0 for any
> number of dashes in the string except exactly 4. It's just that this
> is not immediately evident for the casual reader. I recommend adding a
> comment for posterity.
... but that explains why all test cases worked and reinforced the
erroneous assumption. Skipping the checks bring back a few percent in my
microbenchmarks, so I guess adding an explanation as to why this
actually works in context in an inline comment is the least we should do
here:
+ // For any valid input, dash1 through dash4 will be positive and dash5
+ // negative, but it's enough to check dash4 and dash5:
+ // - if dash1 is -1, dash4 will be -1
+ // - if dash1 is positive but dash2 is -1, dash4 will be -1
+ // - if dash1 and dash2 is positive, dash3 will be -1, dash4 will be
+ // positive, but so will dash5
http://cr.openjdk.java.net/~redestad/8006627/webrev.6
Thanks!
/Claes
>
> Regards, Peter
>
>>
>> /Claes
>>
>> On 2014-07-21 20:32, Claes Redestad wrote:
>>> Hi,
>>>
>>> new webrev which ensures we always throw some kind of IAE for
>>> invalid inputs and adds a few tests to cover this behavior:
>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.4
>>>
>>> /Claes
>>>
>>> On 2014-07-21 20:05, Claes Redestad wrote:
>>>> Hi,
>>>>
>>>> IIOB is invalid to throw here, so I'll fix that.
>>>> NumberFormatException is a IllegalArgumentException, so I think
>>>> it's a gray area if the dash4 + 1 < name.length()-check is needed.
>>>>
>>>> I sincerely hope keeping error messages as-is isn't required, though.
>>>>
>>>> /Claes
>>>>
>>>> On 2014-07-21 18:51, Peter Levart wrote:
>>>>> Hi Claes,
>>>>>
>>>>> Invalid inputs to UUID.fromString() behave a little different than
>>>>> before. IllegalArgumentException is not thrown for the following
>>>>> inputs:
>>>>>
>>>>> For example:
>>>>>
>>>>> "0": IllegalArgumentException: Invalid UUID string: 0 (before patch)
>>>>> "0": IndexOutOfBoundsException (after patch)
>>>>>
>>>>> "-0": IllegalArgumentException: Invalid UUID string: -0 (before
>>>>> patch)
>>>>> "-0": NumberFormatException (after patch)
>>>>>
>>>>> "0-0-0-0-": IllegalArgumentException: Invalid UUID string:
>>>>> 0-0-0-0- (before patch)
>>>>> "0-0-0-0-": NumberFormatException (after patch)
>>>>>
>>>>> The following input (and similar) do throw NumberFormatException
>>>>> as before, but messages are a little different. That's OK, I suppose.
>>>>>
>>>>> "0-0-0-x-0": NumberFormatException: For input string: "x" (before
>>>>> patch)
>>>>> "0-0-0-x-0": NumberFormatException: Error at index 1 in: "x"
>>>>> (after patch)
>>>>>
>>>>> "0-0-0--0": NumberFormatException: For input string: "" (before
>>>>> patch)
>>>>> "0-0-0--0": NumberFormatException: (after patch)
>>>>>
>>>>>
>>>>> The 1st 3 examples could be fixed by checking that dash1,2,3,4 are
>>>>> all > 0 and that dash4 + 1 < name.length()
>>>>>
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>
>>>>> On 07/21/2014 01:41 PM, Claes Redestad wrote:
>>>>>> On 07/19/2014 02:59 PM, Ivan Gerasimov wrote:
>>>>>>> This looks just beautiful!
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>> But why do you need the digits() function at all?
>>>>>>> In my opinion, using formatUnsignedLong directly would be no
>>>>>>> less clearer.
>>>>>>
>>>>>> Sure!
>>>>>>
>>>>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.2/
>>>>>>
>>>>>> Small improvement with client compiler; no measurable change with
>>>>>> tiered.
>>>>>>
>>>>>> /Claes
>>>>>>
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
>>>>>>>
>>>>>>> On 19.07.2014 8:59, Claes Redestad wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> after recent changes, this patch has been revisited and
>>>>>>>> improved slightly, primarily simplifying and speeding up the
>>>>>>>> toString method slightly more:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.1/
>>>>>>>>
>>>>>>>> /Claes
>>>>>>>>
>>>>>>>> On 2014-06-15 00:41, Claes Redestad wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> please review this patch to improve UUID performance,
>>>>>>>>> originally proposed by Steven Schlansker, rebased to use the
>>>>>>>>> allocation-free methods added in
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8041972
>>>>>>>>>
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~redestad/8006627/webrev.0/
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8006627
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> /Claes
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list