JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible
Jaikiran Pai
jai.forums2013 at gmail.com
Tue Oct 19 13:49:21 UTC 2021
Hello Alan,
On 19/10/21 6:59 pm, Alan Bateman wrote:
> On 19/10/2021 13:31, Jaikiran Pai wrote:
>> This relates to the intermittent failures in
>> tools/jlink/JLinkReproducibleTest.java test case which has been
>> ProblemListed for a while now. The root cause is
>> https://bugs.openjdk.java.net/browse/JDK-8275509. I couldn't find any
>> specific mailing lists for jlink tool and I remember seeing
>> jlink/jpackage related discussions on this mailing list previously,
>> so creating this discussion here.
>
> Reproducible builds have been a game of whack-a-mole. Many issues have
> been fixed in recent releases, so the JDK is a lot better than it used
> to be. As it happens, someone else interested in reproducible builds
> brought up the issue of the hashCode on jigsaw-dev a few weeks ago [1].
Ah! So this exact same investigation had already happened a few weeks
back then. I haven't subscribed to that list, so missed it. I see in one
of those messages this part:
"Off hand I can't think of any issues with the ModuleDescriptor
hashCode. It is computed at link time and should be deterministic. If I
were to guess then this may be something to do with the module
version recorded at compile-time as that is one of the components that
the hash is based on."
To be clear, is the ModuleDescriptor#hashCode() expected to return
reproducible (same) hashCode across multiple runs? What currently
changes the hashCode() across multiple runs is various components within
ModuleDescriptor's hashCode() implementation using the hashCode() of the
enums (specifically the various Modifier enums). For example here
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java#L330
(the mods.hashCode()). Since the hashCode() returned by enums is
literally a call to java.lang.Object#hashCode(), those hashCode()
values ended up changing across JVM runs in one of the setups I was
testing (which I didn't consider a surprise, since that's what the
Object#hashCode() documentation states).
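To illustrate the point above, here is a minimal sketch (not JDK code; the class and method names are made up for illustration). It relies on Enum#hashCode() being the identity hash and on Set#hashCode() being the sum of its elements' hash codes, so the first value can differ from one JVM run to the next while the ordinal-based value cannot:

```java
import java.lang.module.ModuleDescriptor.Requires.Modifier;
import java.util.Set;

// Hypothetical demo class, not part of the JDK or of the patch.
public class EnumHashDemo {

    // What mods.hashCode() amounts to today: Set#hashCode() sums the
    // element hash codes, and each enum constant's hash code is its
    // identity hash, which is not guaranteed to be stable across runs.
    static int identityBasedHash(Set<Modifier> mods) {
        return mods.hashCode();
    }

    // Run-independent alternative: depends only on declaration order.
    static int ordinalBasedHash(Set<Modifier> mods) {
        int h = 0;
        for (Modifier m : mods) {
            h += m.ordinal();
        }
        return h;
    }

    public static void main(String[] args) {
        Set<Modifier> mods = Set.of(Modifier.TRANSITIVE, Modifier.STATIC);
        // May print a different number on each JVM run.
        System.out.println("identity-based: " + identityBasedHash(mods));
        // Prints the same number on every run.
        System.out.println("ordinal-based:  " + ordinalBasedHash(mods));
    }
}
```

Running this a few times in fresh JVMs should show whether the first line varies on a given setup; it did not vary on every setup I tried, which matches the intermittent nature of the test failure.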
The other approach that I talked about in my previous mail of trying to
make ModuleDescriptor#hashCode() reproducible involved using the enum's
ordinal value as a part of the hashCode() computation instead of calling
the enum's hashCode() method. Very crudely, it looked like:
diff --git a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
index a412dd753cc..13c8cd04360 100644
--- a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
+++ b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
@@ -327,7 +327,7 @@ public class ModuleDescriptor
*/
@Override
public int hashCode() {
- int hash = name.hashCode() * 43 + mods.hashCode();
+ int hash = name.hashCode() * 43 + enumOrdinalHashCode(mods);
if (compiledVersion != null)
hash = hash * 43 + compiledVersion.hashCode();
if (rawCompiledVersion != null)
@@ -505,7 +505,7 @@ public class ModuleDescriptor
*/
@Override
public int hashCode() {
- int hash = mods.hashCode();
+ int hash = enumOrdinalHashCode(mods);
hash = hash * 43 + source.hashCode();
return hash * 43 + targets.hashCode();
}
@@ -708,7 +708,7 @@ public class ModuleDescriptor
*/
@Override
public int hashCode() {
- int hash = mods.hashCode();
+ int hash = enumOrdinalHashCode(mods);
hash = hash * 43 + source.hashCode();
return hash * 43 + targets.hashCode();
}
@@ -2261,7 +2261,7 @@ public class ModuleDescriptor
int hc = hash;
if (hc == 0) {
hc = name.hashCode();
- hc = hc * 43 + Objects.hashCode(modifiers);
+ hc = hc * 43 + enumOrdinalHashCode(modifiers);
hc = hc * 43 + requires.hashCode();
hc = hc * 43 + Objects.hashCode(packages);
hc = hc * 43 + exports.hashCode();
@@ -2546,6 +2546,21 @@ public class ModuleDescriptor
.collect(Collectors.joining(" "));
}
+ /**
+ * Generates and returns a hashcode for the enum instances. The returned hashcode
+ * is a sum of each of the enum instances' {@link Enum#ordinal() ordinal} value.
+ */
+ private static int enumOrdinalHashCode(final Iterable<? extends Enum<?>> enums) {
+ int h = 0;
+ for (final Enum<?> e : enums) {
+ if (e == null) {
+ continue;
+ }
+ h += e.ordinal();
+ }
+ return h;
+ }
+
private static <T extends Object & Comparable<? super T>>
int compare(T obj1, T obj2) {
if (obj1 != null) {
With this change (and this change only, no changes in
SystemModulesPlugin were needed) I was able to consistently run that
test without any failures. But I didn't pursue this effort further
because I thought making ModuleDescriptor#hashCode() return a
reproducible result wasn't a goal.
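For anyone who wants to try the helper outside the JDK sources, here is a standalone sketch of the same idea. The wrapper class and main method are only for illustration; the helper body mirrors the one in the patch above:

```java
import java.lang.module.ModuleDescriptor.Requires.Modifier;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical standalone wrapper around the helper from the patch.
public class OrdinalHashSketch {

    // Sums the ordinals of the given enum constants, skipping nulls.
    static int enumOrdinalHashCode(Iterable<? extends Enum<?>> enums) {
        int h = 0;
        for (Enum<?> e : enums) {
            if (e != null) {
                h += e.ordinal();
            }
        }
        return h;
    }

    public static void main(String[] args) {
        Set<Modifier> a = EnumSet.of(Modifier.STATIC, Modifier.MANDATED);
        Set<Modifier> b = Set.of(Modifier.MANDATED, Modifier.STATIC);
        // Addition is commutative, so iteration order does not matter.
        System.out.println(enumOrdinalHashCode(a) == enumOrdinalHashCode(b)); // true
    }
}
```

One thing to keep in mind with a plain sum of ordinals is that it collides easily: an empty set and a set holding only the constant with ordinal 0 both produce 0. That is harmless for correctness of hashCode(), but a cleaned-up version might want a slightly better mix.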
Do you think we should be making ModuleDescriptor#hashCode()
deterministic and reproducible across runs? If so, and if the above patch
looks reasonable, I can clean it up a bit and run some further tests.
-Jaikiran