RFR [9] 8006627: UUID to/from String performance should be improved by reducing object allocations

Claes Redestad claes.redestad at oracle.com
Mon Jul 21 21:12:58 UTC 2014


On 2014-07-21 22:05, Peter Levart wrote:
>
> On 07/21/2014 09:21 PM, Claes Redestad wrote:
>> Hi,
>>
>>  was asked offline to add a length check at the start (prevents 
>> potentially costly scans for huge, invalid input; negligible 
>> performance impact for normal cases) and realized dash1 < 0 || dash2 
>> < 0 || dash3 < 0 implies dash4 < 0, so the first three checks are 
>> unnecessary and can be skipped:
>>
>> http://cr.openjdk.java.net/~redestad/8006627/webrev.5
>
> Hi Claes,
>
> dash1 < 0 || dash2 < 0 || dash3 < 0 does not imply dash4 < 0
>
> Take for example the following input: "0-0-0"
>
> dash1 = 1
> dash2 = 3
> dash3 = -1
> dash4 = 1 (again)
Well, this is embarrassing...
>
> but the following check:
>
> if (dash4 < 0 || name.indexOf('-', dash4 + 1) > 0)
>
> is true in this case. It seems either dash4 < 0 or dash5 > 0 for any 
> number of dashes in the string except exactly 4. It's just that this 
> is not immediately evident for the casual reader. I recommend adding a 
> comment for posterity.

... but that explains why all test cases worked and reinforced the 
erroneous assumption. Skipping the checks bring back a few percent in my 
microbenchmarks, so I guess adding an explanation as to why this 
actually works in context in an inline comment is the least we should do 
here:

+        // For any valid input, dash1 through dash4 will be positive and dash5
+        // negative, but it's enough to check dash4 and dash5:
+        // - if dash1 is -1, dash4 will be -1
+        // - if dash1 is positive but dash2 is -1, dash4 will be -1
+        // - if dash1 and dash2 is positive, dash3 will be -1, dash4 will be
+        //   positive, but so will dash5

http://cr.openjdk.java.net/~redestad/8006627/webrev.6

Thanks!

/Claes

>
> Regards, Peter
>
>>
>>  /Claes
>>
>> On 2014-07-21 20:32, Claes Redestad wrote:
>>> Hi,
>>>
>>>  new webrev which ensures we always throw some kind of IAE for 
>>> invalid inputs and adds a few tests to cover this behavior: 
>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.4
>>>
>>> /Claes
>>>
>>> On 2014-07-21 20:05, Claes Redestad wrote:
>>>> Hi,
>>>>
>>>>  IIOB is invalid to throw here, so I'll fix that. 
>>>> NumberFormatException is a IllegalArgumentException, so I think 
>>>> it's a gray area if the dash4 + 1 < name.length()-check is needed.
>>>>
>>>>  I sincerely hope keeping error messages as-is isn't required, though.
>>>>
>>>> /Claes
>>>>
>>>> On 2014-07-21 18:51, Peter Levart wrote:
>>>>> Hi Claes,
>>>>>
>>>>> Invalid inputs to UUID.fromString() behave a little different than 
>>>>> before. IllegalArgumentException is not thrown for the following 
>>>>> inputs:
>>>>>
>>>>> For example:
>>>>>
>>>>> "0": IllegalArgumentException: Invalid UUID string: 0 (before patch)
>>>>> "0": IndexOutOfBoundsException (after patch)
>>>>>
>>>>> "-0": IllegalArgumentException: Invalid UUID string: -0 (before 
>>>>> patch)
>>>>> "-0": NumberFormatException (after patch)
>>>>>
>>>>> "0-0-0-0-": IllegalArgumentException: Invalid UUID string: 
>>>>> 0-0-0-0- (before patch)
>>>>> "0-0-0-0-": NumberFormatException (after patch)
>>>>>
>>>>> The following input (and similar) do throw NumberFormatException 
>>>>> as before, but messages are a little different. That's OK, I suppose.
>>>>>
>>>>> "0-0-0-x-0": NumberFormatException: For input string: "x" (before 
>>>>> patch)
>>>>> "0-0-0-x-0": NumberFormatException: Error at index 1 in: "x" 
>>>>> (after patch)
>>>>>
>>>>> "0-0-0--0": NumberFormatException: For input string: "" (before 
>>>>> patch)
>>>>> "0-0-0--0": NumberFormatException: (after patch)
>>>>>
>>>>>
>>>>> The 1st 3 examples could be fixed by checking that dash1,2,3,4 are 
>>>>> all > 0 and that dash4 + 1 < name.length()
>>>>>
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>
>>>>> On 07/21/2014 01:41 PM, Claes Redestad wrote:
>>>>>> On 07/19/2014 02:59 PM, Ivan Gerasimov wrote:
>>>>>>> This looks just beautiful!
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>> But why do you need the digits() function at all?
>>>>>>> In my opinion, using formatUnsignedLong directly would be no 
>>>>>>> less clearer.
>>>>>>
>>>>>> Sure!
>>>>>>
>>>>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.2/
>>>>>>
>>>>>> Small improvement with client compiler; no measurable change with 
>>>>>> tiered.
>>>>>>
>>>>>> /Claes
>>>>>>
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
>>>>>>>
>>>>>>> On 19.07.2014 8:59, Claes Redestad wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>  after recent changes, this patch has been revisited and 
>>>>>>>> improved slightly, primarily simplifying and speeding up the 
>>>>>>>> toString method slightly more:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~redestad/8006627/webrev.1/
>>>>>>>>
>>>>>>>>  /Claes
>>>>>>>>
>>>>>>>> On 2014-06-15 00:41, Claes Redestad wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> please review this patch to improve UUID performance, 
>>>>>>>>> originally proposed by Steven Schlansker, rebased to use the 
>>>>>>>>> allocation-free methods added in 
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8041972
>>>>>>>>>
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~redestad/8006627/webrev.0/
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8006627
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> /Claes
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>




More information about the core-libs-dev mailing list