RFR: 8292914: Introduce a system property that enables stable names for lambda proxy classes [v6]
Strahinja Stanojevic
duke at openjdk.org
Mon Nov 21 14:31:45 UTC 2022
> This PR introduces an option to output stable names for the lambda classes in the JDK. A stable name consists of two parts: the first is the predefined value `$$Lambda$` appended to the name of the lambda capturing class, and the second is a 64-bit hash value. Thus, it looks like `lambdaCapturingClass$$Lambda$hashValue`.
> The parameters used to create a stable hash are a superset of the parameters used for lambda class archiving when the CDS dumping option is enabled. All the parameters shared between the two are kept in the same form as in the low-level implementation of the archiving process (`SystemDictionaryShared::add_lambda_proxy_class`).
> We decided to use a well-specified `CRC32` algorithm from the standard Java library. We created two 32-bit hashes from the parameters used to create stable names. Then, we combined those two 32-bit hashes into one 64-bit hash value.
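The combination step described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the class name `StableHashSketch`, the helper names, and the way the inputs are split into two strings are all assumptions made for the example. Only `java.util.zip.CRC32` and the `$$Lambda$` name layout come from the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: how two 32-bit CRC32 values can form one 64-bit hash.
class StableHashSketch {

    // CRC32 of the UTF-8 bytes of s, returned as an unsigned 32-bit value in a long.
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // First hash fills the upper 32 bits, second hash the lower 32 bits.
    // Which parameters feed each half is an assumption of this sketch.
    static long combine(String firstInput, String secondInput) {
        return (crc32(firstInput) << 32) | crc32(secondInput);
    }

    // Builds a name of the form capturingClass$$Lambda$hash.
    // Long.toHexString prints the value as unsigned hex without zero padding.
    static String stableName(String capturingClass, String firstInput, String secondInput) {
        return capturingClass + "$$Lambda$" + Long.toHexString(combine(firstInput, secondInput));
    }

    public static void main(String[] args) {
        // The same inputs always produce the same name, independent of creation order.
        System.out.println(stableName("TestStableLambdaNames", "123456789", "123456789"));
    }
}
```

Because the result depends only on the input strings, calling `stableName` twice with the same arguments yields the same name, which is the property the PR is after.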
> We chose `CRC32` because it is a well-specified hash function, and we don't need to write additional code in the JDK. `SHA-256`, `MD5`, and all other hash functions that rely on `MessageDigest` use lambdas in the implementation, so they are unsuitable for our purpose. We also considered a few different hash functions with a low collision rate. All these functions would require at least 100 lines of additional code in the JDK. The best alternative we found is 64-bit `MurmurHash2`: https://commons.apache.org/proper/commons-codec/jacoco/org.apache.commons.codec.digest/MurmurHash2.java.html. In case adding a new hash implementation (e.g., Murmur2) to the JDK is preferred, this PR can be easily modified.
> We found the post (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed/145633#145633) that compares different hash functions.
> We also tested the `CRC32` hash function against half a billion generated strings, and there were no collisions. Note that the capturing-class name is also part of the lambda class name, so potential collisions can only appear within a single class. Because the number of lambdas per class is relatively low, we do not expect name collisions. Every tool that uses this feature should handle potential collisions on its own.
> We found an overall approximation of the collision rate too. You can find it here: https://preshing.com/20110504/hash-collision-probabilities/.
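The approximation referred to above is the usual birthday bound, p ≈ 1 − exp(−k(k−1)/(2N)) for k values drawn from N possible hashes. The sketch below evaluates it; the class and method names are hypothetical, and it assumes the combined 64-bit value behaves like a uniformly distributed hash, which two concatenated CRC32 values do not guarantee.

```java
// Hypothetical sketch: birthday-bound estimate of the collision probability.
class CollisionEstimate {

    // Approximate probability that at least two of k uniformly random values
    // out of `buckets` possible values collide: 1 - exp(-k(k-1) / (2 * buckets)).
    static double collisionProbability(long k, double buckets) {
        double exponent = -((double) k * (k - 1)) / (2.0 * buckets);
        // -expm1(x) equals 1 - e^x and stays accurate for tiny probabilities.
        return -Math.expm1(exponent);
    }

    public static void main(String[] args) {
        double space64 = Math.pow(2, 64);
        // Estimate for 1000 lambdas in one capturing class.
        System.out.println(collisionProbability(1_000, space64));
    }
}
```

Since the capturing-class name is part of the lambda name, k here is the number of lambdas in a single class, not in the whole application.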
>
> The JDK currently appends the value of an atomic integer counter after `$$Lambda$`, so the names of the lambdas depend on their creation order. In `TestStableLambdaNames`, we generate all the lambdas twice. In the first run, the method `createPlainLambdas` generates the following lambdas:
>
> - TestStableLambdaNames$$Lambda$1/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$2/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$3/0x0000000800c01a38
>
> The same method in the second run generates lambdas with different names:
>
> - TestStableLambdaNames$$Lambda$1471/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1472/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$1473/0x0000000800d10470
>
> If we use the introduced flag, the generated lambdas are:
>
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800c01a38
>
> In the second run of the method, the generated lambdas are:
>
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800d10470
>
> We can see that the introduced hash value does not change between the two calls of the method `createPlainLambdas`, which was not the case in the JDK run without this change. These lambda names were taken directly from the test output.
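To observe the naming behavior on a given JDK, a small probe like the one below can print the runtime class name of a lambda. The class `LambdaNameProbe` is hypothetical and is not part of the PR or of `TestStableLambdaNames`. It does not set the new system property, and the exact suffix after `$$Lambda` differs between JDK releases.

```java
// Hypothetical probe: prints the runtime class name of a lambda proxy class.
class LambdaNameProbe {

    // Creates a lambda inside this class and returns its proxy class name.
    static String lambdaClassName() {
        Runnable r = () -> { };
        return r.getClass().getName();
    }

    public static void main(String[] args) {
        // The prefix is the capturing class followed by $$Lambda; the rest
        // depends on the JDK version and on options in effect.
        System.out.println(lambdaClassName());
    }
}
```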
Strahinja Stanojevic has updated the pull request incrementally with one additional commit since the last revision:
Create stable lambda names using CRC32 hash function
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/10024/files
- new: https://git.openjdk.org/jdk/pull/10024/files/6179915a..f5feb430
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=05
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=04-05
Stats: 82 lines in 1 file changed: 27 ins; 30 del; 25 mod
Patch: https://git.openjdk.org/jdk/pull/10024.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/10024/head:pull/10024
PR: https://git.openjdk.org/jdk/pull/10024