UUID creation performance
Roger Riggs
roger.riggs at oracle.com
Mon Mar 6 15:40:55 UTC 2023
Hi Brett,
You might be over-optimizing here, making the source more complex
than is useful or necessary.
I think it is likely that on most architectures both the big-endian and
the native-order VarHandles take advantage of appropriate machine
instructions and have the same performance.
I expect the performance difference to be barely noticeable; if that is
the case, keep it simple and direct.
Even so, a localized static cache in UUID of whichever VarHandle is
native would serve the purpose better than a runtime check.
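
A minimal sketch of that suggestion, assuming a standalone class rather
than java.util.UUID itself: the class name NativeOrderUuidSketch, the
field names, and the fromRandomBytes helper are hypothetical and are not
code from the JDK. The native-order VarHandle is resolved once in a
static final field, so no byte-order check happens per call.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical illustration class; not part of the JDK.
final class NativeOrderUuidSketch {

    // Resolved once at class initialization for the platform's native order.
    private static final VarHandle NATIVE_LONGS =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private static final SecureRandom RANDOM = new SecureRandom();

    // Reads two longs from a 16-byte array and applies the version 4 / IETF variant masks.
    static UUID fromRandomBytes(byte[] bytes) {
        long msb = (long) NATIVE_LONGS.get(bytes, 0);
        long lsb = (long) NATIVE_LONGS.get(bytes, 8);

        msb &= 0xFFFFFFFFFFFF0FFFL; // clear version
        msb |= 0x4000L;             // set to version 4

        lsb &= 0x3FFFFFFFFFFFFFFFL; // clear variant
        lsb |= 0x8000000000000000L; // set to IETF variant

        return new UUID(msb, lsb);
    }

    static UUID randomUUID() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return fromRandomBytes(bytes);
    }
}
```

Because the input bytes are random, which byte order the handle uses
only changes which random bits land where; the masks still fix the
version and variant fields.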
Regards, Roger
On 3/5/23 6:49 PM, Brett Okken wrote:
> The new ByteArray class works great for the nameUUIDFromBytes method,
> which must be in big endian.
> For randomUUID, byte order does not matter, so using native would be
> fastest, but there does not appear to be a utility class for that.
> Is there a preference between just having a native-order VarHandle to
> use in UUID, having a utility method that chooses which utility class
> to call based on the native order, or some other option?
>
> Thanks,
> Brett
>
> On Wed, Mar 1, 2023 at 9:08 AM Roger Riggs <roger.riggs at oracle.com> wrote:
>> Hi,
>>
>> That's an interesting idea. Recently VarHandle access methods were
>> created by JDK-8300236 [1] [2]
>> in the jdk.internal.util package. See the ByteArray and
>> ByteArrayLittleEndian classes.
>>
>> See how that would affect performance and leverage existing VarHandles.
>>
>> Thanks, Roger
>>
>> [1] https://bugs.openjdk.org/browse/JDK-8300236
>> [2] https://github.com/openjdk/jdk/pull/12076
>>
>> On 3/1/23 7:50 AM, Brett Okken wrote:
>>> Is there any interest in updating the static UUID.randomUUID() and
>>> UUID.nameUUIDFromBytes(byte[]) factory methods to use either a
>>> ByteBuffer or byteArrayViewVarHandle to convert the byte[] to 2 long
>>> values then do the bit twiddling?
>>> These methods are really dominated by time to create/populate the
>>> byte[], but this does reduce the time to create the 2 long values by
>>> at least half.
>>> It would also allow the removal of the private UUID(byte[] data).
>>>
>>> public static UUID randomUUID() {
>>>     SecureRandom ng = Holder.numberGenerator;
>>>
>>>     byte[] randomBytes = new byte[16];
>>>     ng.nextBytes(randomBytes);
>>>     final ByteBuffer bb = ByteBuffer.wrap(randomBytes);
>>>     bb.order(ByteOrder.nativeOrder());
>>>
>>>     long msb = bb.getLong();
>>>     long lsb = bb.getLong();
>>>
>>>     msb &= 0xFFFFFFFFFFFF0FFFL; /* clear version */
>>>     msb |= 0x4000L; /* set to version 4 */
>>>
>>>     lsb &= 0x3FFFFFFFFFFFFFFFL; /* clear variant */
>>>     lsb |= 0x8000000000000000L; /* set to IETF variant */
>>>
>>>     return new UUID(msb, lsb);
>>> }
>>>
>>> public static UUID nameUUIDFromBytes(byte[] name) {
>>>     MessageDigest md;
>>>     try {
>>>         md = MessageDigest.getInstance("MD5");
>>>     } catch (NoSuchAlgorithmException nsae) {
>>>         throw new InternalError("MD5 not supported", nsae);
>>>     }
>>>     byte[] md5Bytes = md.digest(name);
>>>
>>>     // default byte order is BIG_ENDIAN
>>>     final ByteBuffer bb = ByteBuffer.wrap(md5Bytes);
>>>
>>>     long msb = bb.getLong();
>>>     long lsb = bb.getLong();
>>>
>>>     msb &= 0xFFFFFFFFFFFF0FFFL; /* clear version */
>>>     msb |= 0x3000L; /* set to version 3 */
>>>
>>>     lsb &= 0x3FFFFFFFFFFFFFFFL; /* clear variant */
>>>     lsb |= 0x8000000000000000L; /* set to IETF variant */
>>>
>>>     return new UUID(msb, lsb);
>>> }
>>>
>>> Benchmark                    Mode  Cnt   Score    Error  Units
>>> UUIDBenchmark.jdk_name       avgt    3  11.885 ±  4.025  ns/op
>>> UUIDBenchmark.jdk_random     avgt    3  11.656 ±  0.987  ns/op
>>> UUIDBenchmark.longs          avgt    3   7.618 ±  1.047  ns/op
>>> UUIDBenchmark.longs_bb       avgt    3   7.755 ±  1.643  ns/op
>>> UUIDBenchmark.longs_name     avgt    3   8.467 ±  1.784  ns/op
>>> UUIDBenchmark.longs_name_bb  avgt    3   8.455 ±  1.662  ns/op
>>> UUIDBenchmark.randomBytes    avgt    3   6.132 ±  0.447  ns/op
>>>
>>>
>>> @BenchmarkMode(Mode.AverageTime)
>>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>>> @Warmup(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
>>> @Measurement(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
>>> @Fork(1)
>>> @State(Scope.Benchmark)
>>> public class UUIDBenchmark {
>>>
>>>     private static final VarHandle LONGS_ACCESS =
>>>             MethodHandles.byteArrayViewVarHandle(long[].class,
>>>                     ByteOrder.nativeOrder());
>>>
>>>     private static final VarHandle BE_LONGS_ACCESS =
>>>             MethodHandles.byteArrayViewVarHandle(long[].class,
>>>                     ByteOrder.BIG_ENDIAN);
>>>
>>>     @Benchmark
>>>     public byte[] randomBytes() {
>>>         final byte[] bytes = new byte[16];
>>>         randomBytes(bytes);
>>>         return bytes;
>>>     }
>>>
>>>     @Benchmark
>>>     public void jdk_random(Blackhole bh) {
>>>         final byte[] data = new byte[16];
>>>         randomBytes(data);
>>>         data[6] &= 0x0f; /* clear version */
>>>         data[6] |= 0x40; /* set to version 4 */
>>>         data[8] &= 0x3f; /* clear variant */
>>>         data[8] |= 0x80; /* set to IETF variant */
>>>         long msb = 0;
>>>         long lsb = 0;
>>>         assert data.length == 16 : "data must be 16 bytes in length";
>>>         for (int i=0; i<8; i++)
>>>             msb = (msb << 8) | (data[i] & 0xff);
>>>         for (int i=8; i<16; i++)
>>>             lsb = (lsb << 8) | (data[i] & 0xff);
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     @Benchmark
>>>     public void jdk_name(Blackhole bh) {
>>>         final byte[] md5Bytes = new byte[16];
>>>         randomBytes(md5Bytes);
>>>         md5Bytes[6] &= 0x0f; /* clear version */
>>>         md5Bytes[6] |= 0x30; /* set to version 3 */
>>>         md5Bytes[8] &= 0x3f; /* clear variant */
>>>         md5Bytes[8] |= 0x80; /* set to IETF variant */
>>>         long msb = 0;
>>>         long lsb = 0;
>>>         assert md5Bytes.length == 16 : "data must be 16 bytes in length";
>>>         for (int i=0; i<8; i++)
>>>             msb = (msb << 8) | (md5Bytes[i] & 0xff);
>>>         for (int i=8; i<16; i++)
>>>             lsb = (lsb << 8) | (md5Bytes[i] & 0xff);
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     @Benchmark
>>>     public void longs(Blackhole bh) {
>>>         final byte[] data = new byte[16];
>>>         randomBytes(data);
>>>
>>>         long msb = (long) LONGS_ACCESS.get(data, 0);
>>>         long lsb = (long) LONGS_ACCESS.get(data, 8);
>>>
>>>         msb &= 0xFFFFFFFFFFFF0FFFL;
>>>         msb |= 0x4000L;
>>>
>>>         lsb &= 0x3FFFFFFFFFFFFFFFL;
>>>         lsb |= 0x8000000000000000L;
>>>
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     @Benchmark
>>>     public void longs_name(Blackhole bh) {
>>>         final byte[] data = new byte[16];
>>>         randomBytes(data);
>>>
>>>         long msb = (long) BE_LONGS_ACCESS.get(data, 0);
>>>         long lsb = (long) BE_LONGS_ACCESS.get(data, 8);
>>>
>>>         msb &= 0xFFFFFFFFFFFF0FFFL;
>>>         msb |= 0x3000L;
>>>
>>>         lsb &= 0x3FFFFFFFFFFFFFFFL;
>>>         lsb |= 0x8000000000000000L;
>>>
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     @Benchmark
>>>     public void longs_bb(Blackhole bh) {
>>>         final byte[] data = new byte[16];
>>>         randomBytes(data);
>>>
>>>         final ByteBuffer bb = ByteBuffer.wrap(data);
>>>         bb.order(ByteOrder.nativeOrder());
>>>
>>>         long msb = bb.getLong();
>>>         long lsb = bb.getLong();
>>>
>>>         msb &= 0xFFFFFFFFFFFF0FFFL;
>>>         msb |= 0x4000L;
>>>
>>>         lsb &= 0x3FFFFFFFFFFFFFFFL;
>>>         lsb |= 0x8000000000000000L;
>>>
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     @Benchmark
>>>     public void longs_name_bb(Blackhole bh) {
>>>         final byte[] data = new byte[16];
>>>         randomBytes(data);
>>>
>>>         final ByteBuffer bb = ByteBuffer.wrap(data);
>>>         // bb.order(ByteOrder.BIG_ENDIAN);
>>>
>>>         long msb = bb.getLong();
>>>         long lsb = bb.getLong();
>>>
>>>         msb &= 0xFFFFFFFFFFFF0FFFL;
>>>         msb |= 0x3000L;
>>>
>>>         lsb &= 0x3FFFFFFFFFFFFFFFL;
>>>         lsb |= 0x8000000000000000L;
>>>
>>>         bh.consume(msb);
>>>         bh.consume(lsb);
>>>     }
>>>
>>>     static void randomBytes(byte[] bytes) {
>>>         ThreadLocalRandom tlr = ThreadLocalRandom.current();
>>>         LONGS_ACCESS.set(bytes, 0, tlr.nextLong());
>>>         LONGS_ACCESS.set(bytes, 8, tlr.nextLong());
>>>     }
>>> }
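
The claim in the thread that byte order does not matter for randomUUID
can be checked with a small sketch. The class name UuidMaskCheck and the
method version4From are hypothetical names introduced for illustration.
The sketch reads the same 16 bytes in either byte order and applies the
masks from the quoted proposal: the two UUIDs differ, but both report
version 4 and the IETF variant.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

// Hypothetical illustration class; not part of the JDK.
final class UuidMaskCheck {

    // Builds a version 4 UUID from 16 bytes, read as two longs in the given byte order.
    static UUID version4From(byte[] bytes, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.wrap(bytes).order(order);

        long msb = bb.getLong();
        long lsb = bb.getLong();

        msb &= 0xFFFFFFFFFFFF0FFFL; // clear version
        msb |= 0x4000L;             // set to version 4

        lsb &= 0x3FFFFFFFFFFFFFFFL; // clear variant
        lsb |= 0x8000000000000000L; // set to IETF variant

        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[16];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) i;
        }
        UUID big = version4From(bytes, ByteOrder.BIG_ENDIAN);
        UUID little = version4From(bytes, ByteOrder.LITTLE_ENDIAN);
        System.out.println(big + " version=" + big.version() + " variant=" + big.variant());
        System.out.println(little + " version=" + little.version() + " variant=" + little.variant());
    }
}
```

For nameUUIDFromBytes the byte order does matter, since the result must
be reproducible from the MD5 digest, which is why the proposal keeps
big-endian there.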