RFR: 8334015: Add Support for UUID Version 7 (UUIDv7) defined in RFC 9562 [v14]
Jaikiran Pai
jpai at openjdk.org
Tue Sep 23 14:21:41 UTC 2025
On Wed, 10 Sep 2025 19:46:12 GMT, Kieran Farrell <kfarrell at openjdk.org> wrote:
>> With the recent approval of UUIDv7 (https://datatracker.ietf.org/doc/rfc9562/), this PR aims to add a new static method UUID.timestampUUID() which constructs and returns a UUID in support of the new time generated UUID version.
>>
>> The specification requires embedding the current timestamp in milliseconds in bits 0–47 and the version number in bits 48–51; bits 52–63 are available for sub-millisecond precision or for pseudorandom data. The variant is set in bits 64–65. The remaining bits 66–127 are free to use for more pseudorandom data or to employ a counter-based approach for increased time precision (https://www.rfc-editor.org/rfc/rfc9562.html#name-uuid-version-7).
>>
>> The choice of implementation comes down to balancing performance against the ability to distinguish UUIDs created less than 1 ms apart. A test simulating a high-concurrency environment, with 4 threads generating 10000 UUIDv7 values in parallel, measured the collision rate of each implementation (the number of times the time-based portion of the UUID was not unique and entries could not be distinguished by time) and yielded the following results:
>>
>>
>> - random-byte-only - 99.8%
>> - higher-precision - 3.5%
>> - counter-based - 0%
>>
>>
>> Performance tests show a decrease in performance as expected with the counter based implementation due to the introduction of synchronization:
>>
>> - random-byte-only 143.487 ± 10.932 ns/op
>> - higher-precision 149.651 ± 8.438 ns/op
>> - counter-based 245.036 ± 2.943 ns/op
>>
>> The best balance here might be to employ a higher-precision implementation as the large increase in time sensitivity comes at a very slight performance cost.
>
> Kieran Farrell has updated the pull request incrementally with one additional commit since the last revision:
>
> update method name
An initial remark about the APIs being proposed in this PR. Reading through the motivation section of RFC-9562 https://www.rfc-editor.org/rfc/rfc9562.html#name-update-motivation, I think a few important things that we should consider for the API we are proposing are:
> Many things have changed in the time since UUIDs were originally created. Modern applications have a need to create and utilize UUIDs as the primary identifier for a variety of different items ...
> In such cases, "auto-increment" schemes that are often used by databases do not work well: the effort required to coordinate sequential numeric identifiers across a network can easily become a burden.
> The fact that UUIDs can be used to create unique, reasonably short values in distributed systems without requiring coordination makes them a good alternative, but UUID versions 1-5, which were originally defined by [RFC4122], lack certain other desirable characteristics...
>
> ...
> Due to the aforementioned issues, many widely distributed database applications and large application vendors have sought to solve the problem of creating a better time-based, sortable unique identifier for use as a database key. This has led to numerous implementations over the past 10+ years solving the same problem in slightly different ways ...
Then later, in sections 6.1 and 6.2 https://www.rfc-editor.org/rfc/rfc9562.html#section-6.1, it's further stated that:
> UUID timestamp source, precision, and length were topics of great debate while creating UUIDv7 for this specification. Choosing the right timestamp for your application is very important.
> ...
> Monotonicity (each subsequent value being greater than the last) is the backbone of time-based sortable UUIDs.
Given all this, I think the API we provide must try and achieve these primary motivations. That would then mean, not allowing arbitrary values to be passed by applications for generating a UUIDv7 `UUID` instance. So I think we shouldn't introduce the:
public static UUID epochMillis(long timestamp)
being proposed in this PR. The implementation of this method will have no control (unless we add some logic to keep track of each call) over what "timestamp" gets passed for subsequent calls and thus cannot guarantee that the generated UUIDv7 values are monotonic. Of course, we could expect applications to make sure they pass the right timestamp(s) for each call, but then that brings us back to what the RFC motivation stated - that several libraries do it differently. So I think having libraries/applications do the work of passing the right timestamp may not be a useful way to expose UUIDv7 generation.
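To make the concern concrete, here is a minimal sketch of what a timestamp-accepting factory might look like. The class `V7LayoutSketch` and the method `fromMillis` are hypothetical names used for illustration only; this is not the code in the PR. It simply packs the caller's value into the RFC 9562 UUIDv7 layout and fills the rest with random bits.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical illustration only; not the implementation proposed in the PR.
final class V7LayoutSketch {
    private static final SecureRandom RNG = new SecureRandom();

    // Packs a caller-supplied millisecond timestamp into the UUIDv7 layout.
    static UUID fromMillis(long timestamp) {
        long msb = (timestamp & 0xFFFF_FFFF_FFFFL) << 16      // bits 0-47: timestamp in ms
                 | 0x7000L                                     // bits 48-51: version 7
                 | (RNG.nextLong() & 0x0FFFL);                 // bits 52-63: random
        long lsb = (RNG.nextLong() & 0x3FFF_FFFF_FFFF_FFFFL)  // bits 66-127: random
                 | 0x8000_0000_0000_0000L;                     // bits 64-65: variant
        return new UUID(msb, lsb);
    }
}
```

With such a method nothing stops a caller from invoking `fromMillis(2000)` followed by `fromMillis(1000)`: both results are well-formed version 7 UUIDs, yet the second one sorts before the first, so ordering depends entirely on the caller's discipline.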
I think the other API being proposed in this PR:
public static UUID epochMillis()
is the only one we should introduce. I'm still reviewing the monotonicity implementation and discussion of this `epochMillis()` method in this PR and will reply separately on that.
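For comparison, a no-argument generator can own the clock and therefore enforce ordering itself. The following is only a sketch under my own assumptions, loosely in the spirit of the dedicated-counter approach discussed in section 6.2 of the RFC; it is not the implementation under review. `MonotonicV7Sketch`, `next` and `LAST` are hypothetical names, the 12 bits after the version are used as a counter, and the counter starts at zero in each new millisecond for simplicity.

```java
import java.security.SecureRandom;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the implementation proposed in the PR.
final class MonotonicV7Sketch {
    private static final SecureRandom RNG = new SecureRandom();
    // Last issued 60-bit value: 48-bit millisecond timestamp followed by a 12-bit counter.
    private static final AtomicLong LAST = new AtomicLong();

    static UUID next() {
        long candidate = System.currentTimeMillis() << 12;
        // Use the clock if it moved forward; otherwise bump the previous value.
        // A counter overflow carries into the timestamp bits.
        long tsAndCounter = LAST.updateAndGet(prev -> candidate > prev ? candidate : prev + 1);
        long msb = (tsAndCounter >>> 12) << 16      // bits 0-47: timestamp in ms
                 | 0x7000L                           // bits 48-51: version 7
                 | (tsAndCounter & 0x0FFFL);         // bits 52-63: counter
        long lsb = (RNG.nextLong() & 0x3FFF_FFFF_FFFF_FFFFL)  // bits 66-127: random
                 | 0x8000_0000_0000_0000L;                     // bits 64-65: variant
        return new UUID(msb, lsb);
    }
}
```

Because the shared state only ever increases, each value returned by `next()` within one JVM compares greater than the values returned before it, even if the system clock stalls or steps backwards. Whether that guarantee is worth the cost of shared state is exactly the trade-off being discussed in this PR.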
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25303#issuecomment-3324223138