RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
Claes Redestad
redestad at openjdk.java.net
Sun Dec 20 21:48:56 UTC 2020
On Sun, 20 Dec 2020 20:45:55 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>>> I have to say that introducing a ThreadLocal here seems like a step in the wrong direction. With a ThreadLocal, if I read this correctly, a MessageDigest will be cached with each thread that ever calls this API, and it won't be garbage collected until the thread dies. Some threads live indefinitely, e.g., the ones in the fork-join common pool. Is this what we want? Is UUID creation so frequent that each thread needs its own, or could thread safety be handled using a simple locking protocol?
>>
>> This is a good point. Significant effort has gone into recent releases to reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, JDK-8245302) so adding new usages is a disappointing. So I think this PR does need good justifications, and probably a lot more exploration of options.
>
> As Stuart argues a TL approach is likely to look better in a microbenchmark, but make things worse in other regards.
>
> I started exploring options to this in #1855 (not logged an RFE, yet), and I think there is potential to get most of the gains seen here without the introduction of a `ThreadLocal`. The current patch also speeds up MD5.digest for small chunks (+19% for 16 bytes, but less relative impact on larger chunks, down to being lost in the noise on 16Kb). Speed-up of `UUID.nameUUIDFromBytes` is somewhat modest right now (+4%, -17% allocations), but I think it can be improved further without complicating things too much. The MD5 intrinsic added by JDK-8250902 adds some constraints on the implementation that holds back some polish. This can be fixed, but requires some coordination.
A trick that could be used here instead of `ThreadLocal` is to store an instance of MD5 retrieved via MessageDigest.getInstance, but clone it before use. See [this commit](https://github.com/openjdk/jdk/pull/1855/commits/2f266316d62ca875c38e2f771412d02625414bf4).
This maintains thread safety, avoids the allure of TLs, and gives a substantial speed-up on the `UUIDBench.fromType3Bytes` micro (comparing it against the other optimizations that were already in #1855):
Benchmark (size) Mode Cnt Score Error Units
UUIDBench.fromType3Bytes 20000 thrpt 12 1.523 ± 0.066 ops/us
UUIDBench.fromType3Bytes:·gc.alloc.rate.norm 20000 thrpt 12 408.036 ± 0.003 B/op
Benchmark (size) Mode Cnt Score Error Units
UUIDBench.fromType3Bytes 20000 thrpt 12 2.186 ± 0.228 ops/us
UUIDBench.fromType3Bytes:·gc.alloc.rate.norm 20000 thrpt 12 264.023 ± 0.006 B/op
-------------
PR: https://git.openjdk.java.net/jdk/pull/1821
More information about the security-dev
mailing list