RFR(S) 8241071 Generation of classes.jsa is not deterministic
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0... The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules. Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L. Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs. --- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents. Thanks - Ioi
Hi Ioi, On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ?? Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits: src/hotspot/share/oops/symbol.cpp typo: adresses -> addresses test/hotspot/jtreg/runtime/cds/DeterministicDump.java Should the test use @driver mode? 39 // use of SharedArchiveFile argument. s/argument/arguments/ 40 // methods to form command line to create/use shared archive. s/line/lines/ s/shared/the shared/ 71 if (n0 == 0) { Shouldn't this check come before the loop at line 63? Thanks, David -----
Thanks - Ioi
On Mon, Apr 27, 2020 at 8:23 AM David Holmes <david.holmes@oracle.com> wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents
in
the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
I think also this is improbable but still possible. ccs size is set to 1G when we dump which is usually oversized. But a theoretical case where a customer were to use a large portion of it and we put the Symbols atop of it may bring us over that limit.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L
is
initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Thanks, David -----
Thanks - Ioi
Hi David & Jiangli, Thanks for your comments. I thought about using a system property, but the user can also specify it like java -Djdk.xshare.dump.salt=0 MyProgram This would circumvent the calculation of ImmutableCollections.SALT32L for normal execution. I am not sure if this is desirable, or if there's a way to prevent the user from doing that. I'll see what the core lib folks suggest. Updated webrev: http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0... http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0... More comments below On 4/26/20 11:20 PM, David Holmes wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
The last version of the patch allocated using MetaspaceObj::ClassType. On 64-bit+compressed classes, this will allocate from the class space. But anyway, as Thomas commented, doing this is problematic.
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
Per Thomas's comment, I moved the allocation of Symbols in a separate virtual space. Please see my comments to Thomas's e-mail.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
Fixed
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
Fixed
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
I copied these comments from another test. They are not needed here so I deleted them in the new webrev.
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Fixed. Thanks - Ioi
Thanks, David -----
Thanks - Ioi
On 28/04/2020 9:37 am, Ioi Lam wrote:
Hi David & Jiangli,
Thanks for your comments.
I thought about using a system property, but the user can also specify it like
java -Djdk.xshare.dump.salt=0 MyProgram
There's a way to pass properties from the VM that the user can't override. I'll need to lookup the details. David
This would circumvent the calculation of ImmutableCollections.SALT32L for normal execution. I am not sure if this is desirable, or if there's a way to prevent the user from doing that. I'll see what the core lib folks suggest.
Updated webrev: http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
More comments below
On 4/26/20 11:20 PM, David Holmes wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
The last version of the patch allocated using MetaspaceObj::ClassType. On 64-bit+compressed classes, this will allocate from the class space.
But anyway, as Thomas commented, doing this is problematic.
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
Per Thomas's comment, I moved the allocation of Symbols in a separate virtual space. Please see my comments to Thomas's e-mail.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
Fixed
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
Fixed
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
I copied these comments from another test. They are not needed here so I deleted them in the new webrev.
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Fixed.
Thanks - Ioi
Thanks, David -----
Thanks - Ioi
On 28/04/2020 10:01 am, David Holmes wrote:
On 28/04/2020 9:37 am, Ioi Lam wrote:
Hi David & Jiangli,
Thanks for your comments.
I thought about using a system property, but the user can also specify it like
java -Djdk.xshare.dump.salt=0 MyProgram
There's a way to pass properties from the VM that the user can't override. I'll need to lookup the details.
It should suffice to define SystemProperty that is non-writeable and internal (for good measure). But it seems we have introduced a bug since Java 9 whereby a non-writeable property is actually being overwritten! :( David
David
This would circumvent the calculation of ImmutableCollections.SALT32L for normal execution. I am not sure if this is desirable, or if there's a way to prevent the user from doing that. I'll see what the core lib folks suggest.
Updated webrev: http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
More comments below
On 4/26/20 11:20 PM, David Holmes wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
The last version of the patch allocated using MetaspaceObj::ClassType. On 64-bit+compressed classes, this will allocate from the class space.
But anyway, as Thomas commented, doing this is problematic.
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
Per Thomas's comment, I moved the allocation of Symbols in a separate virtual space. Please see my comments to Thomas's e-mail.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
Fixed
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
Fixed
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
I copied these comments from another test. They are not needed here so I deleted them in the new webrev.
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Fixed.
Thanks - Ioi
Thanks, David -----
Thanks - Ioi
On 4/27/20 7:05 PM, David Holmes wrote:
On 28/04/2020 10:01 am, David Holmes wrote:
On 28/04/2020 9:37 am, Ioi Lam wrote:
Hi David & Jiangli,
Thanks for your comments.
I thought about using a system property, but the user can also specify it like
java -Djdk.xshare.dump.salt=0 MyProgram
There's a way to pass properties from the VM that the user can't override. I'll need to lookup the details.
It should suffice to define SystemProperty that is non-writeable and internal (for good measure). But it seems we have introduced a bug since Java 9 whereby a non-writeable property is actually being overwritten! :(
David
Hi David, I tried implementing the seed as a system property. However, ImmutableCollections.<clinit>() is executed very early during JDK bootstrap (about 500-th bytecode), much earlier than when System.props is initialized (about 43689-th bytecode). So attempts to read the system property would give a NullPointerException Error occurred during initialization of VM java.lang.ExceptionInInitializerError at java.util.ImmutableCollections$Set12.<init>(java.base/ImmutableCollections.java:626) at java.util.Set.of(java.base/Set.java:468) at java.lang.Module.<clinit>(java.base/Module.java:251) Caused by: java.lang.NullPointerException at java.lang.System.getProperty(java.base/System.java:834) at java.util.ImmutableCollections.<clinit>(java.base/ImmutableCollections.java:80) at java.util.ImmutableCollections$Set12.<init>(java.base/ImmutableCollections.java:626) at java.util.Set.of(java.base/Set.java:468) at java.lang.Module.<clinit>(java.base/Module.java:251) Because of this, I have to keep using the JVM_GetRandomSeedForCDSDump() native function. Thanks - Ioi
David
This would circumvent the calculation of ImmutableCollections.SALT32L for normal execution. I am not sure if this is desirable, or if there's a way to prevent the user from doing that. I'll see what the core lib folks suggest.
Updated webrev: http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
More comments below
On 4/26/20 11:20 PM, David Holmes wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
The last version of the patch allocated using MetaspaceObj::ClassType. On 64-bit+compressed classes, this will allocate from the class space.
But anyway, as Thomas commented, doing this is problematic.
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
Per Thomas's comment, I moved the allocation of Symbols in a separate virtual space. Please see my comments to Thomas's e-mail.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
Fixed
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
Fixed
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
I copied these comments from another test. They are not needed here so I deleted them in the new webrev.
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Fixed.
Thanks - Ioi
Thanks, David -----
Thanks - Ioi
Hi Ioi, On 5/05/2020 3:43 am, Ioi Lam wrote:
On 4/27/20 7:05 PM, David Holmes wrote:
On 28/04/2020 10:01 am, David Holmes wrote:
On 28/04/2020 9:37 am, Ioi Lam wrote:
Hi David & Jiangli,
Thanks for your comments.
I thought about using a system property, but the user can also specify it like
java -Djdk.xshare.dump.salt=0 MyProgram
There's a way to pass properties from the VM that the user can't override. I'll need to lookup the details.
It should suffice to define SystemProperty that is non-writeable and internal (for good measure). But it seems we have introduced a bug since Java 9 whereby a non-writeable property is actually being overwritten! :(
David
Hi David,
I tried implementing the seed as a system property. However, ImmutableCollections.<clinit>() is executed very early during JDK bootstrap (about 500-th bytecode), much earlier than when System.props is initialized (about 43689-th bytecode). So attempts to read the system property would give a NullPointerException
Error occurred during initialization of VM java.lang.ExceptionInInitializerError at java.util.ImmutableCollections$Set12.<init>(java.base/ImmutableCollections.java:626)
at java.util.Set.of(java.base/Set.java:468) at java.lang.Module.<clinit>(java.base/Module.java:251) Caused by: java.lang.NullPointerException at java.lang.System.getProperty(java.base/System.java:834) at java.util.ImmutableCollections.<clinit>(java.base/ImmutableCollections.java:80)
at java.util.ImmutableCollections$Set12.<init>(java.base/ImmutableCollections.java:626)
at java.util.Set.of(java.base/Set.java:468) at java.lang.Module.<clinit>(java.base/Module.java:251)
Because of this, I have to keep using the JVM_GetRandomSeedForCDSDump() native function.
Okay - thanks for trying the alternative. Initialization is always tricky. David
Thanks - Ioi
David
This would circumvent the calculation of ImmutableCollections.SALT32L for normal execution. I am not sure if this is desirable, or if there's a way to prevent the user from doing that. I'll see what the core lib folks suggest.
Updated webrev: http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
More comments below
On 4/26/20 11:20 PM, David Holmes wrote:
Hi Ioi,
On 27/04/2020 3:22 pm, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only).
I'm not seeing what restricts this to 64-bit only ??
The last version of the patch allocated using MetaspaceObj::ClassType. On 64-bit+compressed classes, this will allocate from the class space.
But anyway, as Thomas commented, doing this is problematic.
Will changing the allocation of symbols cause any problems with the amount of "class space" we need?
Per Thomas's comment, I moved the allocation of Symbols in a separate virtual space. Please see my comments to Thomas's e-mail.
See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
I can't say I like adding in the VM call for this. Another alternative would be a VM defined system property. But I'll let core-libs folk make the call here.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Generally seems okay. A couple of minor nits:
src/hotspot/share/oops/symbol.cpp
typo: adresses -> addresses
Fixed
test/hotspot/jtreg/runtime/cds/DeterministicDump.java
Should the test use @driver mode?
Fixed
39 // use of SharedArchiveFile argument.
s/argument/arguments/
40 // methods to form command line to create/use shared archive.
s/line/lines/
s/shared/the shared/
I copied these comments from another test. They are not needed here so I deleted them in the new webrev.
71 if (n0 == 0) {
Shouldn't this check come before the loop at line 63?
Fixed.
Thanks - Ioi
Thanks, David -----
Thanks - Ioi
Hi Ioi, Please don't do this :) First off, how would this work when dumping with UseCompressedClassPointers off? In that case allocation would be relegated to non-class metaspace which cannot guarantee that kind of address stability. Even in class space, I do not think you can guarantee addresses growing monotonously. Class unloading could happens during dump time, so space may be returned to class freelist and later reused. Metadata can be prematurely deallocated, e.g. if a class load error occurs or byte code is rewritten by some agent. Remainder of Metachunks are used up in a delayed fashion. All these cases will present you with pointers which are not growing monotonously. I also believe this problem of non-deterministic placement is not limited to Symbols, but that you should see it for Klass structures too, albeit rarely. I believe the fact that you do not see this is an accident, or we are not looking that closely. E.g. if we were to change the frequency at which we retrieve MetaBlocks from the freeblocklist (see http://hg.openjdk.java.net/jdk/jdk/file/534855a30ef5/src/hotspot/share/memor...), you would get more reuse of deallocated blocks and would certainly see more volatility in the addresses. This is all true with the current implementation; the upcoming new one uses a buddy style allocator behind the scenes where it is by no means guaranteed that the first chunks get used first. I think this is what still happens, by sheer accident, but I am hesitant to promise such a behavior in the future. It removes freedom from the implementation in a lot of ways. Small examples, we might want to shepherd certain allocations into separate parts of class space (e.g. Klass structures from hidden classes) to minimize fragmentation. Or add a mode, for testing, where we would allocate Klass at the very end of ccs, or at certain "round" addresses, to shake loose errors in the calling layers which rely too much on how Klass pointers look like. Bottomline I think the assumption that ccs allocates in monotonously ascending order is not even correct today, and may break very easily, and we should not rely on this. I think instead of misusing ccs for this, it would be cleaner to just allocate a large C heap area as backing storage for the symbols? How much space are we talking about? If memory is a concern, we could just reserve a range and commit it manually as we go. Or could we not order the placement of Klass and Symbol at dump time? Dump time is not that time critical, no? Thanks & Sorry, Thomas On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi.lam@oracle.com> wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
Rethinking this a bit more I realize you need not addresses growing monotonously but deterministic allocation: given a sequence of Metaspace allocation operations (Metaspace::allocate(), Metaspace::deallocate(), and collection of class loaders), the pointers returned by Metaspace::allocate() should come in the same order each time that sequence is repeated for a new VM. This invalidates some of my arguments in my last mail, but not all. I also thought about restrictions this places on the callers. - class loader collection are triggered by GCs. Can be be sure that this happens at exactly the same point at each run? Some GCs do class unloading concurrently, which adds a nondeterministic timing factor. - classes may be loaded concurrently by multiple threads, adding a timing factor. - You may have classes which are implicitly created like hidden classes for lambdas, or reflection glue classes. Their creation may not be deterministic. Even though they are not put into the archive, they live in class space too and their allocation mixes up things. Also, requiring Metaspace allocation to be deterministic requires each part of it being deterministic (e.g. the deallocation block management). E.g. we never could base any decision on the numerical form of an address, which is location dependent and can vary between VM runs. I really think reproducable builds are valuable, but my fear is that relying on Metaspace for deterministic allocation would be too fragile. Thanks again, Thomas On Mon, Apr 27, 2020 at 9:58 AM Thomas Stüfe <thomas.stuefe@gmail.com> wrote:
Hi Ioi,
Please don't do this :)
First off, how would this work when dumping with UseCompressedClassPointers off? In that case allocation would be relegated to non-class metaspace which cannot guarantee that kind of address stability.
Even in class space, I do not think you can guarantee addresses growing monotonously. Class unloading could happens during dump time, so space may be returned to class freelist and later reused. Metadata can be prematurely deallocated, e.g. if a class load error occurs or byte code is rewritten by some agent. Remainder of Metachunks are used up in a delayed fashion. All these cases will present you with pointers which are not growing monotonously.
I also believe this problem of non-deterministic placement is not limited to Symbols, but that you should see it for Klass structures too, albeit rarely. I believe the fact that you do not see this is an accident, or we are not looking that closely. E.g. if we were to change the frequency at which we retrieve MetaBlocks from the freeblocklist (see http://hg.openjdk.java.net/jdk/jdk/file/534855a30ef5/src/hotspot/share/memor...), you would get more reuse of deallocated blocks and would certainly see more volatility in the addresses.
This is all true with the current implementation; the upcoming new one uses a buddy style allocator behind the scenes where it is by no means guaranteed that the first chunks get used first. I think this is what still happens, by sheer accident, but I am hesitant to promise such a behavior in the future. It removes freedom from the implementation in a lot of ways.
Small examples, we might want to shepherd certain allocations into separate parts of class space (e.g. Klass structures from hidden classes) to minimize fragmentation. Or add a mode, for testing, where we would allocate Klass at the very end of ccs, or at certain "round" addresses, to shake loose errors in the calling layers which rely too much on how Klass pointers look like.
Bottomline I think the assumption that ccs allocates in monotonously ascending order is not even correct today, and may break very easily, and we should not rely on this.
I think instead of misusing ccs for this, it would be cleaner to just allocate a large C heap area as backing storage for the symbols? How much space are we talking about? If memory is a concern, we could just reserve a range and commit it manually as we go.
Or could we not order the placement of Klass and Symbol at dump time? Dump time is not that time critical, no?
Thanks & Sorry,
Thomas
On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi.lam@oracle.com> wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
Hi Thomas, Thanks for looking into this in detail. Actually the problem is not as bad as you think (I hope ....). First of all, this patch is intended for -Xshare:dump only (aka static CDS archive). Dynamic CDS archive (-XX:ArchiveClassesAtExit) is much harder to make deterministic because of concurrent thread execution. I've updated the RFE title to "Generation of classes.jsa with -Xshare:dump is not deterministic". Also, the objects are not archived at the address allocated in the Metaspace. Instead, we copy them linearly into the archive and we have control on the order of copying: 1. All classes are loaded by a single thread. JIT is also disabled (so it cannot trigger constant pool resolution, etc, in the background) 2. Classes are loaded in deterministic order as specified in SharedClassListFile. 3. As as result of 2, Symbols are also created in a deterministic order So, as long as I can guarantee that Symbols are allocated in monotonically increasing addresses, I can write them into the archive at deterministic locations. All MetaspaceObjs are copied into the archive via a depth-first search inside ArchiveCompactor::iterate_roots(). The objects are copied as they are walked, e,g. InstanceKlass -> methods -> methodA, methodB, .... Currently the VM lays out the classes deterministically. (For example, InstanceKlass::methods() is sorted by ascending names. Fields are ordered according to the field type and their order of appearance in the classfile.) This means that ArchiveCompactor::iterate_roots() will always walk the MetaspaceObjs in the same order. In essence, iterate_roots() doesn't care where the source MetaspaceObjs are. It will probably work if every MetaspaceObjs is allocated at a random address. So hopefully this will not tie your hands in your Metaspace work. The only restriction is the ordering of Symbols, which is the main fix in this patch. Now I allocate them in a VirtualSpace that's incrementally committed. It doesn't matter where this VistualSpace is located. I've added some comments in iterate_roots() to explain what's happening, but I didn't want to wring a long essay. Do you think the comments are sufficient? I updated the test case for uncompressed oops/klasses. I also run with a smaller -XX:MetaspaceSize which has the side effect of triggering GC. This found a problem with the archived heap objects, which I also fixed. http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0... http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0... Thanks - Ioi On 4/27/20 2:14 AM, Thomas Stüfe wrote:
Rethinking this a bit more I realize you need not addresses growing monotonously but deterministic allocation: given a sequence of Metaspace allocation operations (Metaspace::allocate(), Metaspace::deallocate(), and collection of class loaders), the pointers returned by Metaspace::allocate() should come in the same order each time that sequence is repeated for a new VM. This invalidates some of my arguments in my last mail, but not all.
I also thought about restrictions this places on the callers. - class loader collection are triggered by GCs. Can be be sure that this happens at exactly the same point at each run? Some GCs do class unloading concurrently, which adds a nondeterministic timing factor. - classes may be loaded concurrently by multiple threads, adding a timing factor. - You may have classes which are implicitly created like hidden classes for lambdas, or reflection glue classes. Their creation may not be deterministic. Even though they are not put into the archive, they live in class space too and their allocation mixes up things.
Also, requiring Metaspace allocation to be deterministic requires each part of it being deterministic (e.g. the deallocation block management). E.g. we never could base any decision on the numerical form of an address, which is location dependent and can vary between VM runs.
I really think reproducable builds are valuable, but my fear is that relying on Metaspace for deterministic allocation would be too fragile.
Thanks again, Thomas
On Mon, Apr 27, 2020 at 9:58 AM Thomas Stüfe <thomas.stuefe@gmail.com <mailto:thomas.stuefe@gmail.com>> wrote:
Hi Ioi,
Please don't do this :)
First off, how would this work when dumping with UseCompressedClassPointers off? In that case allocation would be relegated to non-class metaspace which cannot guarantee that kind of address stability.
Even in class space, I do not think you can guarantee addresses growing monotonously. Class unloading could happens during dump time, so space may be returned to class freelist and later reused. Metadata can be prematurely deallocated, e.g. if a class load error occurs or byte code is rewritten by some agent. Remainder of Metachunks are used up in a delayed fashion. All these cases will present you with pointers which are not growing monotonously.
I also believe this problem of non-deterministic placement is not limited to Symbols, but that you should see it for Klass structures too, albeit rarely. I believe the fact that you do not see this is an accident, or we are not looking that closely. E.g. if we were to change the frequency at which we retrieve MetaBlocks from the freeblocklist (see http://hg.openjdk.java.net/jdk/jdk/file/534855a30ef5/src/hotspot/share/memor...), you would get more reuse of deallocated blocks and would certainly see more volatility in the addresses.
This is all true with the current implementation; the upcoming new one uses a buddy style allocator behind the scenes where it is by no means guaranteed that the first chunks get used first. I think this is what still happens, by sheer accident, but I am hesitant to promise such a behavior in the future. It removes freedom from the implementation in a lot of ways.
Small examples, we might want to shepherd certain allocations into separate parts of class space (e.g. Klass structures from hidden classes) to minimize fragmentation. Or add a mode, for testing, where we would allocate Klass at the very end of ccs, or at certain "round" addresses, to shake loose errors in the calling layers which rely too much on how Klass pointers look like.
Bottomline I think the assumption that ccs allocates in monotonously ascending order is not even correct today, and may break very easily, and we should not rely on this.
I think instead of misusing ccs for this, it would be cleaner to just allocate a large C heap area as backing storage for the symbols? How much space are we talking about? If memory is a concern, we could just reserve a range and commit it manually as we go.
Or could we not order the placement of Klass and Symbol at dump time? Dump time is not that time critical, no?
Thanks & Sorry,
Thomas
On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi.lam@oracle.com <mailto:ioi.lam@oracle.com>> wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
Thanks a lot Ioi for explaining! I completely missed the fact that this was for Xshare=dump only, that removes a lot of the complexities I was worried about. I like the comments, well written and not too much :) I will look at the coding closer in the coming days, but for now I am happy, thank you. ..Thomas On Tue, Apr 28, 2020 at 2:09 AM Ioi Lam <ioi.lam@oracle.com> wrote:
Hi Thomas,
Thanks for looking into this in detail.
Actually the problem is not as bad as you think (I hope ....). First of all, this patch is intended for -Xshare:dump only (aka static CDS archive). Dynamic CDS archive (-XX:ArchiveClassesAtExit) is much harder to make deterministic because of concurrent thread execution. I've updated the RFE title to "Generation of classes.jsa with -Xshare:dump is not deterministic".
Also, the objects are not archived at the address allocated in the Metaspace. Instead, we copy them linearly into the archive and we have control on the order of copying:
1. All classes are loaded by a single thread. JIT is also disabled (so it cannot trigger constant pool resolution, etc, in the background) 2. Classes are loaded in deterministic order as specified in SharedClassListFile. 3. As as result of 2, Symbols are also created in a deterministic order
So, as long as I can guarantee that Symbols are allocated in monotonically increasing addresses, I can write them into the archive at deterministic locations.
All MetaspaceObjs are copied into the archive via a depth-first search inside ArchiveCompactor::iterate_roots(). The objects are copied as they are walked, e,g. InstanceKlass -> methods -> methodA, methodB, ....
Currently the VM lays out the classes deterministically. (For example, InstanceKlass::methods() is sorted by ascending names. Fields are ordered according to the field type and their order of appearance in the classfile.) This means that ArchiveCompactor::iterate_roots() will always walk the MetaspaceObjs in the same order.
In essence, iterate_roots() doesn't care where the source MetaspaceObjs are. It will probably work if every MetaspaceObjs is allocated at a random address. So hopefully this will not tie your hands in your Metaspace work.
The only restriction is the ordering of Symbols, which is the main fix in this patch. Now I allocate them in a VirtualSpace that's incrementally committed. It doesn't matter where this VistualSpace is located.
I've added some comments in iterate_roots() to explain what's happening, but I didn't want to wring a long essay. Do you think the comments are sufficient?
I updated the test case for uncompressed oops/klasses. I also run with a smaller -XX:MetaspaceSize which has the side effect of triggering GC. This found a problem with the archived heap objects, which I also fixed.
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
Thanks - Ioi
On 4/27/20 2:14 AM, Thomas Stüfe wrote:
Rethinking this a bit more I realize you need not addresses growing monotonously but deterministic allocation: given a sequence of Metaspace allocation operations (Metaspace::allocate(), Metaspace::deallocate(), and collection of class loaders), the pointers returned by Metaspace::allocate() should come in the same order each time that sequence is repeated for a new VM. This invalidates some of my arguments in my last mail, but not all.
I also thought about restrictions this places on the callers. - class loader collection are triggered by GCs. Can be be sure that this happens at exactly the same point at each run? Some GCs do class unloading concurrently, which adds a nondeterministic timing factor. - classes may be loaded concurrently by multiple threads, adding a timing factor. - You may have classes which are implicitly created like hidden classes for lambdas, or reflection glue classes. Their creation may not be deterministic. Even though they are not put into the archive, they live in class space too and their allocation mixes up things.
Also, requiring Metaspace allocation to be deterministic requires each part of it being deterministic (e.g. the deallocation block management). E.g. we never could base any decision on the numerical form of an address, which is location dependent and can vary between VM runs.
I really think reproducable builds are valuable, but my fear is that relying on Metaspace for deterministic allocation would be too fragile.
Thanks again, Thomas
On Mon, Apr 27, 2020 at 9:58 AM Thomas Stüfe <thomas.stuefe@gmail.com> wrote:
Hi Ioi,
Please don't do this :)
First off, how would this work when dumping with UseCompressedClassPointers off? In that case allocation would be relegated to non-class metaspace which cannot guarantee that kind of address stability.
Even in class space, I do not think you can guarantee addresses growing monotonously. Class unloading could happens during dump time, so space may be returned to class freelist and later reused. Metadata can be prematurely deallocated, e.g. if a class load error occurs or byte code is rewritten by some agent. Remainder of Metachunks are used up in a delayed fashion. All these cases will present you with pointers which are not growing monotonously.
I also believe this problem of non-deterministic placement is not limited to Symbols, but that you should see it for Klass structures too, albeit rarely. I believe the fact that you do not see this is an accident, or we are not looking that closely. E.g. if we were to change the frequency at which we retrieve MetaBlocks from the freeblocklist (see http://hg.openjdk.java.net/jdk/jdk/file/534855a30ef5/src/hotspot/share/memor...), you would get more reuse of deallocated blocks and would certainly see more volatility in the addresses.
This is all true with the current implementation; the upcoming new one uses a buddy style allocator behind the scenes where it is by no means guaranteed that the first chunks get used first. I think this is what still happens, by sheer accident, but I am hesitant to promise such a behavior in the future. It removes freedom from the implementation in a lot of ways.
Small examples, we might want to shepherd certain allocations into separate parts of class space (e.g. Klass structures from hidden classes) to minimize fragmentation. Or add a mode, for testing, where we would allocate Klass at the very end of ccs, or at certain "round" addresses, to shake loose errors in the calling layers which rely too much on how Klass pointers look like.
Bottomline I think the assumption that ccs allocates in monotonously ascending order is not even correct today, and may break very easily, and we should not rely on this.
I think instead of misusing ccs for this, it would be cleaner to just allocate a large C heap area as backing storage for the symbols? How much space are we talking about? If memory is a concern, we could just reserve a range and commit it manually as we go.
Or could we not order the placement of Klass and Symbol at dump time? Dump time is not that time critical, no?
Thanks & Sorry,
Thomas
On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi.lam@oracle.com> wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
Hi Ioi, On 2020-04-27 07:22, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
I can't really comment on the code changes, but I'd like to thank you for the effort of getting a deterministic classes.jsa. This is a big step towards the goal of getting deterministic, reproducible builds of the JDK. /Magnus
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
Hi Ioi, I only took a quick look and this looked okay generally. I will read it more closely. I like David's alternative suggestion of using a VM defined system property. What's your thought on that? Thank you for the effort on improving the .jsa deterministic. This will help improve CDS usability longer term and also improve build process in some cases. Best, Jiangli On Sun, Apr 26, 2020 at 10:33 PM Ioi Lam <ioi.lam@oracle.com> wrote:
https://bugs.openjdk.java.net/browse/JDK-8241071 http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v0...
The goal is to for "java -Xshare:dump" to produce deterministic contents in the CDS archive that depend only on the contents of $JAVA_HOME/lib/modules.
Problems: [1] Symbols in the CDS archive may have non-deterministic order because Arena allocation is non-deterministic. [2] The contents of the CDS shared heap region may be randomized due to ImmutableCollections.SALT32L.
Fixes: [1] With -Xshare:dump, allocate Symbols from the class space (64-bit only). See changes in symbol.cpp for details. [2] When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is initialized with a deterministic seed. NOTE: this affects ONLY when the VM is running with the special flag -Xshare:dump to dump the CDS archive. It does NOT affect normal execution of Java programs.
--- I also cleaned up the -Xlog:cds output and print out the CRC of each CDS region, to help diagnose why two CDS archives may have different contents.
Thanks - Ioi
participants (5)
-
David Holmes
-
Ioi Lam
-
Jiangli Zhou
-
Magnus Ihse Bursie
-
Thomas Stüfe