RFR: 8292914: Introduce a system property that enables stable names for lambda proxy classes [v5]

Strahinja Stanojevic duke at openjdk.org
Mon Nov 21 14:22:07 UTC 2022


> This PR introduces an option to output stable names for the lambda classes in the JDK. A stable name consists of two parts: the first is the name of the lambda capturing class followed by the predefined value `$$Lambda$`, and the second is a 64-bit hash. Thus, it looks like `lambdaCapturingClass$$Lambda$hashValue`.
> Parameters used to create a stable hash are a superset of the parameters used for lambda class archiving when the CDS dumping option is enabled. During this process, all the mutual parameters are in the same form as they are in the low-level implementation (`SystemDictionaryShared::add_lambda_proxy_class`) of the archiving process.
> We decided to use the well-specified `CRC32` algorithm from the standard Java library. We create two 32-bit hashes from the parameters used to create stable names, and then combine those two 32-bit hashes into one 64-bit hash value.
> We chose `CRC32` because it is a well-specified hash function, and we don't need to write additional code in the JDK. `SHA-256`, `MD5`, and all other hash functions that rely on `MessageDigest` use lambdas in the implementation, so they are unsuitable for our purpose. We also considered a few different hash functions with a low collision rate. All these functions would require at least 100 lines of additional code in the JDK. The best alternative we found is 64-bit `MurmurHash2`: https://commons.apache.org/proper/commons-codec/jacoco/org.apache.commons.codec.digest/MurmurHash2.java.html. In case adding a new hash implementation (e.g., Murmur2) to the JDK is preferred, this PR can be easily modified.
> We found the post (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed/145633#145633) that compares different hash functions.
> We also tested the `CRC32` hash function against half a billion generated strings, and there were no collisions. Note that the capturing-class name is also part of the lambda class name, so potential collisions can only appear within a single class. Because the number of lambdas per class is relatively low, we do not expect name collisions. Every tool that uses this feature should still handle potential collisions on its own.
> We found an overall approximation of the collision rate too. You can find it here: https://preshing.com/20110504/hash-collision-probabilities/.
> 
> The JDK currently appends an atomic integer after `$$Lambda$`, so the names of the lambdas depend on the creation order. In `TestStableLambdaNames`, we generate all the lambdas two times. In the first run, the method `createPlainLambdas` generates the following lambdas:
> 
> - TestStableLambdaNames$$Lambda$1/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$2/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$3/0x0000000800c01a38
> 
> The same method in the second run generates lambdas with different names:
> 
> - TestStableLambdaNames$$Lambda$1471/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1472/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$1473/0x0000000800d10470
> 
> If we use the introduced flag, generated lambdas are:
> 
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800c01a38
> 
> In the second run of the method, generated lambdas are:
> 
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800d10470
> 
> We can see that the introduced hash value does not change between the two calls of the method `createPlainLambdas`, which was not the case in the JDK run without this change. These names are taken directly from the test.
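
The naming scheme in the quoted description (two `CRC32` values combined into one 64-bit hash, appended after `$$Lambda$`) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: the class `StableNameSketch` and its methods are hypothetical, the description does not say how the parameters are divided between the two hashes (here they are simply passed in as two strings), and the shift-and-or combination is an assumed way of joining the halves.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class StableNameSketch {
    // CRC32 over the UTF-8 bytes of a string; the result fits in the low 32 bits.
    public static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Assumption: one hash per group of parameters, first group in the high 32 bits.
    public static long hash64(String firstGroup, String secondGroup) {
        return (crc32(firstGroup) << 32) | crc32(secondGroup);
    }

    // Builds a name of the form capturingClass$$Lambda$hashValue (hash in hex).
    public static String stableName(String capturingClass, String firstGroup, String secondGroup) {
        return capturingClass + "$$Lambda$" + Long.toHexString(hash64(firstGroup, secondGroup));
    }

    public static void main(String[] args) {
        // Hypothetical parameter strings; the real inputs are the CDS archiving parameters.
        String first = "run:()Ljava/lang/Runnable;";
        String second = "TestStableLambdaNames.lambda$createPlainLambdas$0:()V";
        // The same inputs give the same name regardless of creation order.
        System.out.println(stableName("TestStableLambdaNames", first, second));
        System.out.println(stableName("TestStableLambdaNames", first, second));
    }
}
```

Note that `Long.toHexString` does not zero-pad, so a hash whose top bits are zero would print with fewer than 16 hex digits in this sketch.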

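For the collision-rate approximation referenced in the description, a small sketch of the usual birthday-bound estimate, p ≈ 1 - exp(-k(k-1)/(2N)), may help put the numbers in perspective. It assumes an ideal, uniformly distributed hash with N = 2^bits possible values, which two combined `CRC32` values only approximate; the class `CollisionEstimate` is hypothetical.

```java
public class CollisionEstimate {
    // Birthday-bound approximation of the probability that at least two of
    // k values collide when each is drawn uniformly from 2^bits possibilities.
    public static double collisionProbability(double k, double bits) {
        double n = Math.pow(2.0, bits);
        // -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
        return -Math.expm1(-k * (k - 1) / (2.0 * n));
    }

    public static void main(String[] args) {
        // Collisions can only occur within one capturing class, so k is the
        // number of lambdas in that class rather than in the whole program.
        System.out.println("1000 lambdas, 64-bit hash: " + collisionProbability(1000, 64));
        System.out.println("1000 lambdas, 32-bit hash: " + collisionProbability(1000, 32));
    }
}
```

Under these assumptions, the estimate for a 64-bit hash stays negligible even for classes with thousands of lambdas, which is consistent with the expectation stated in the description.
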
Strahinja Stanojevic has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Calculate stable names for lambda classes using different hash function.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10024/files
  - new: https://git.openjdk.org/jdk/pull/10024/files/dd8e592d..6179915a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=03-04

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/10024.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/10024/head:pull/10024

PR: https://git.openjdk.org/jdk/pull/10024

