RFR: 8334015: Add Support for UUID Version 7 (UUIDv7) defined in RFC 9562 [v3]
Roger Riggs
rriggs at openjdk.org
Tue May 20 19:35:55 UTC 2025
On Tue, 20 May 2025 13:56:42 GMT, kieran-farrell <duke at openjdk.org> wrote:
>>> Can the sub-microsecond value just be truncated and avoid the expensive divide operation?'
>>
>> method 3 of secion 6.2 of https://www.rfc-editor.org/rfc/rfc9562.html#name-monotonicity-and-counters states
>>
>>> start with the portion of the timestamp expressed as a fraction of the clock's tick value (fraction of a millisecond for UUIDv7). Compute the count of possible values that can be represented in the available bit space, 4096 for the UUIDv7 rand_a field. Using floating point or scaled integer arithmetic, multiply this fraction of a millisecond value by 4096 and round down (toward zero) to an integer result to arrive at a number between 0 and the maximum allowed for the indicated bits, which sorts monotonically based on time. '
>>
>> so i think we might have to keep the division? though i re-shuffled the equation to
>>
>> `int nsBits = (int) ((nsTime % 1_000_000) / 1_000_000.0 * 4096);`
>>
>> which gives scaled integer division rather than floating point and gave a very slight imporved perfomance to 143.758 ± 2.135 ns/op
>
>> This could remove the allocation by composing the high and low longs using shifts and binary operations and ng.next().
>
> do you mean to create the UUID using most and least significant bytes? if so, I've tried out some variations, i found creating the 64 bit lsb with ng.nextLong() brings a large pefomance decrease over using the nextBytes method, but the below implemntation keeping with the nextByte(byte[]) api brings a performance increase to 121.128 ± 30.486 ns/op, though the code might appear a little roundabout.
>
>
> public static UUID timestampUUID() {
> long msTime = System.currentTimeMillis();
> long nsTime = System.nanoTime();
>
> // Scale sub-ms nanoseconds to a 12-bit value
> int nsBits = (int) ((nsTime % 1_000_000L) * 4096L / 1_000_000L);
>
> // Compose the 64 most significant bits: [48-bit msTime | 4-bit version | 12-bit nsBits]
> long mostSigBits =
> ((msTime & 0xFFFFFFFFFFFFL) << 16) |
> (0x7L << 12) |
> nsBits;
>
> // Generate 8 random bytes for least significant bits
> byte[] randomBytes = new byte[8];
> SecureRandom ng = UUID.Holder.numberGenerator;
> ng.nextBytes(randomBytes);
>
> long leastSigBits = 0;
> for (int i = 0; i < 8; i++) {
> leastSigBits = (leastSigBits << 8) | (randomBytes[i] & 0xFF);
> }
>
> // Set variant (bits 62–63) to '10'
> leastSigBits &= 0x3FFFFFFFFFFFFFFFL;
> leastSigBits |= 0x8000000000000000L;
>
> return new UUID(mostSigBits, leastSigBits);
> }
There's no (time-based) relationship between the currentTimeMillis() value and the nanoTime value.
They are independent clocks and are read separately and are un-correlated. They won't be usable as lsb of the millis value.
I'm surprised that the `nextBytes` is slower, since it looks like it calls `nextLong` and puts it in a newly allocated byte[8]. Normal perf measurements won't account for the gc overhead to recover it.
The nsBits computation looks odd, nanoTme returns nanoseconds (10^9), the remainder (% 1_000_000) is then milliseconds.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25303#discussion_r2098719719
More information about the core-libs-dev
mailing list