Improving performance and reducing object allocations of java.util.UUID to/from string
Steven Schlansker
stevenschlansker at gmail.com
Sun Jan 20 00:54:03 UTC 2013
Thank you for the feedback!
On Jan 10, 2013, at 4:50 AM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> On 01/09/2013 09:51 PM, Steven Schlansker wrote:
>> Hello again,
>>
>> I sent this email a week ago and have received no replies. Is there
>> any step I have missed necessary to contribute to the JDK libraries?
>
> I think the crucial part is OCA, as per:
> http://openjdk.java.net/contribute/
I have now taken care of this :-)
The OCA is under the company name "Ness Computing" and was emailed yesterday.
>
>> I am very interested in making your lives easier, so please let me
>> know if I am in the wrong place or are otherwise misguided.
>
> You are at the correct place.
>
> On the first glance, the change looks good for the start. A few comments
> though:
> a) Do you need the masks before or-ing with most/leastSigBits?
So, they seem to exist to satisfy a few Apache Harmony test cases:
http://svn.apache.org/viewvc/harmony/enhanced/java/branches/java6/classlib/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/util/UUIDTest.java?revision=929252
For example,
uuid = UUID.fromString("123456789-0-0-0-0");
Assert.assertEquals(0x2345678900000000L, uuid.getMostSignificantBits());
Assert.assertEquals(0x0L, uuid.getLeastSignificantBits());
Without the masking, the values can smear together. It would be possible to avoid converting extra characters, but I believe that the mask is probably cheaper than an if to check for the presence of extra characters.
Were I writing the code from scratch I would throw an IllegalArgumentException on such a UUID but I assume that is not acceptable for compatibility reasons.
> b) Is there a more standard (and still performant) way to do the hex
> conversion? Look around JDK source, I think there should be something
> else needing the same kind of conversion.
I didn't see any candidates. I thought of at least the following places to look: Long, String, Formatter, BitSet, and did some general browsing around java.lang and java.util with no luck. The Long methods are in java.lang and so could not be used without making them public. I assume that having private helper methods for my class is preferable to introducing new public methods. It's entirely possible I missed something, but I'm not seeing it.
> c) I'd go for making utility methods a bit more generic. For one, I
> would rather make decodeHex(String str, int start, int end), and
> encodeHex(char[] dest, int offset, int value).
I can do this if you feel strongly, but I believe that this compromises the readability of the code. For example, right now you can see:
long mostSigBits = decode(str, dashPos, 0);
mostSigBits <<= 16;
mostSigBits |= decode(str, dashPos, 1);
…
whereas to make it use a "more general" helper it would have to look like this:
long mostSigBits = decode(str, dashPos[0]+1, dashPos[1]);
mostSigBits <<= 16;
mostSigBits |= decode(str, dashPos[1], dashPos[2]+1);
This is an increase from one "magic value" per invocation to three.
It's not a huge difference either way, but I'm not sure that your suggestion is better?
I suppose I could make it non-private helper method available to others, although I am not sure if this is considered a good practice in the JDK code. Let me know if you think I should do this.
>
> Microbenchmark glitches:
> a) % is the integer division, and at the scale of the operations you
> are measuring, it could incur significant costs; the usual practice is
> having power-of-2 size, and then (i % size) -> (i & (size - 1)).
I fixed this, but it made no meaningful change to the numbers.
> b) Not sure if you want to stick with random UUIDs for comparisons.
> While the law of large numbers is on your side, 1000 random UUIDs might
> be not random enough.
I don't believe that this has any bearing on the result. I've run the trials many, many times and the number rarely changes by even 1% from the values that I posted beforehand. If you think this is an actual problem, I can come up with an alternate methodology. I don't think it's a problem.
Thanks again for the review. Please let me know if you agree or believe I am mistaken in my analysis above.
More information about the core-libs-dev
mailing list