From per.liden at oracle.com Wed Dec 6 14:20:33 2017
From: per.liden at oracle.com (Per Liden)
Date: Wed, 6 Dec 2017 15:20:33 +0100
Subject: Welcome!
Message-ID:

Hi,

The ZGC mailing list is now up and running! Project Committers and Reviewers are automatically subscribed. We're working on getting the rest of the project infrastructure up and running.

You might have noticed that a project repository has been created (http://hg.openjdk.java.net/zgc/zgc), but it currently only contains the initial seed from jdk/hs, and not ZGC itself. We hope to commit the ZGC patch set shortly, and we will of course keep people posted when that happens.

cheers,
Per

From shade at redhat.com Thu Dec 7 19:42:10 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 7 Dec 2017 20:42:10 +0100
Subject: Binary builds and workspaces
Message-ID: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com>

Hey,

Even though the zgc/zgc forest has no ZGC sources yet, our CI now builds it nevertheless, like it builds Shenandoah and Epsilon.

This gives us two artifacts:

*) OpenJDK workspace:
   https://builds.shipilev.net/workspaces/zgc-zgc.tar.xz

   This should be helpful for Europe-residing folks, because cloning the entire monorepo workspace would take a while otherwise (>1 hr, by my last measurement). The tarball is rebuilt every night, and a trivial "hg pull" / "hg up" will bring it to the most up-to-date version. One-liners are in:
   https://builds.shipilev.net/workspaces/README.txt

*) Binary builds:
   https://builds.shipilev.net/openjdk-zgc/

   x86_64 is built natively, other targets are cross-compiled. Builds are updated every night, build frequency might change in future, in both directions. I would try to make build logs available too, for easier debugging.

Cheers,
-Aleksey

From per.liden at oracle.com Fri Dec 8 12:06:16 2017
From: per.liden at oracle.com (Per Liden)
Date: Fri, 8 Dec 2017 13:06:16 +0100
Subject: Binary builds and workspaces
In-Reply-To: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com>
References: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com>
Message-ID: <903568e7-cfa7-f506-650e-93c392d038da@oracle.com>

Hi Aleksey,

On 2017-12-07 20:42, Aleksey Shipilev wrote:
> Hey,
>
> Even though the zgc/zgc forest has no ZGC sources yet, our CI now builds it nevertheless, like it builds Shenandoah and Epsilon.

Cool!

>
> This gives us two artifacts:
>
> *) OpenJDK workspace:
>    https://builds.shipilev.net/workspaces/zgc-zgc.tar.xz
>
>    This should be helpful for Europe-residing folks, because cloning the entire monorepo workspace would take a while otherwise (>1 hr, by my last measurement). The tarball is rebuilt every night, and a trivial "hg pull" / "hg up" will bring it to the most up-to-date version. One-liners are in:
>    https://builds.shipilev.net/workspaces/README.txt
>
> *) Binary builds:
>    https://builds.shipilev.net/openjdk-zgc/
>
>    x86_64 is built natively, other targets are cross-compiled. Builds are updated every night, build frequency might change in future, in both directions. I would try to make build logs available too, for easier debugging.

I noticed that you have aarch64 and IA32 builds there. Just a heads up, the ZGC code base only supports Linux/x86_64 and Solaris/Sparc at this time.

cheers,
Per

>
> Cheers,
> -Aleksey
>

From per.liden at oracle.com Fri Dec 8 17:33:17 2017
From: per.liden at oracle.com (Per Liden)
Date: Fri, 8 Dec 2017 18:33:17 +0100
Subject: ZGC is now open source!
Message-ID: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>

Hi,

I'm happy to announce that ZGC is now open source and that the source code is available in the project repository:

http://hg.openjdk.java.net/zgc/zgc/

We've had some issues getting the project Web/Wiki in place, so in the meantime I've uploaded a temporary "Getting Started" document here:

http://cr.openjdk.java.net/~pliden/zgc/

cheers,
Per

From stefan.karlsson at oracle.com Fri Dec 8 17:34:31 2017
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 8 Dec 2017 18:34:31 +0100
Subject: ZGC is now open source!
In-Reply-To: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
References: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
Message-ID: <874c05e7-de30-0ba1-56e1-d1fbd6ff1286@oracle.com>

Wohoo!

StefanK

On 2017-12-08 18:33, Per Liden wrote:
> Hi,
>
> I'm happy to announce that ZGC is now open source and that the source code is available in the project repository:
>
> http://hg.openjdk.java.net/zgc/zgc/
>
> We've had some issues getting the project Web/Wiki in place, so in the meantime I've uploaded a temporary "Getting Started" document here:
>
> http://cr.openjdk.java.net/~pliden/zgc/
>
> cheers,
> Per

From rkennke at redhat.com Fri Dec 8 17:38:12 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 8 Dec 2017 18:38:12 +0100
Subject: ZGC is now open source!
In-Reply-To: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
References: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
Message-ID: <507e66c5-4df4-914b-2fe6-1d08ad7a3812@redhat.com>

Yippie!

Cheers, Roman

> Hi,
>
> I'm happy to announce that ZGC is now open source and that the source code is available in the project repository:
>
> http://hg.openjdk.java.net/zgc/zgc/
>
> We've had some issues getting the project Web/Wiki in place, so in the meantime I've uploaded a temporary "Getting Started" document here:
>
> http://cr.openjdk.java.net/~pliden/zgc/
>
> cheers,
> Per

From stefan.johansson at oracle.com Fri Dec 8 19:16:32 2017
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 8 Dec 2017 20:16:32 +0100
Subject: ZGC is now open source!
In-Reply-To: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
References: <279a4d84-c3f5-9901-35cf-fb07dfa4844e@oracle.com>
Message-ID: <5A2AE510.2030406@oracle.com>

Yeah! Great work guys!

Cheers,
Stefan

On 2017-12-08 18:33, Per Liden wrote:
> Hi,
>
> I'm happy to announce that ZGC is now open source and that the source code is available in the project repository:
>
> http://hg.openjdk.java.net/zgc/zgc/
>
> We've had some issues getting the project Web/Wiki in place, so in the meantime I've uploaded a temporary "Getting Started" document here:
>
> http://cr.openjdk.java.net/~pliden/zgc/
>
> cheers,
> Per

From shade at redhat.com Sat Dec 9 00:09:17 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Sat, 9 Dec 2017 01:09:17 +0100
Subject: Binary builds and workspaces
In-Reply-To: <903568e7-cfa7-f506-650e-93c392d038da@oracle.com>
References: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com> <903568e7-cfa7-f506-650e-93c392d038da@oracle.com>
Message-ID: <10526d96-e61e-5111-f119-8556674a1898@redhat.com>

On 12/08/2017 01:06 PM, Per Liden wrote:
> On 2017-12-07 20:42, Aleksey Shipilev wrote:
>> *) Binary builds:
>>    https://builds.shipilev.net/openjdk-zgc/
>>
>>    x86_64 is built natively, other targets are cross-compiled. Builds are updated every night, build frequency might change in future, in both directions. I would try to make build logs
>>    available too, for easier debugging.
>
> I noticed that you have aarch64 and IA32 builds there. Just a heads up, the ZGC code base only supports Linux/x86_64 and Solaris/Sparc at this time.

That's not a problem. Shenandoah currently supports Linux/x86_64 and Linux/aarch64, but we build other platforms anyway, because sometimes changes in shared code may break the other platform's builds. It also helps to check that unsupported configurations fail predictably and reliably.

-Aleksey

From shade at redhat.com Mon Dec 11 08:44:19 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 09:44:19 +0100
Subject: Binary builds and workspaces
In-Reply-To: <10526d96-e61e-5111-f119-8556674a1898@redhat.com>
References: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com> <903568e7-cfa7-f506-650e-93c392d038da@oracle.com> <10526d96-e61e-5111-f119-8556674a1898@redhat.com>
Message-ID: <5e2e9fba-6409-0830-87ac-7349432569c5@redhat.com>

On 12/09/2017 01:09 AM, Aleksey Shipilev wrote:
> On 12/08/2017 01:06 PM, Per Liden wrote:
>> On 2017-12-07 20:42, Aleksey Shipilev wrote:
>>> *) Binary builds:
>>>    https://builds.shipilev.net/openjdk-zgc/
>>>
>>>    x86_64 is built natively, other targets are cross-compiled. Builds are updated every night, build frequency might change in future, in both directions. I would try to make build logs available too, for easier debugging.
>>
>> I noticed that you have aarch64 and IA32 builds there. Just a heads up, the ZGC code base only supports Linux/x86_64 and Solaris/Sparc at this time.
>
> That's not a problem. Shenandoah currently supports Linux/x86_64 and Linux/aarch64, but we build other platforms anyway, because sometimes changes in shared code may break the other platform's builds. It also helps to check that unsupported configurations fail predictably and reliably.

For example, aarch64 build fails with:

/pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/share/gc/z/zGlobals.hpp:29:33: fatal error: zGlobals_linux_aarch64.hpp: No such file or directory
 #include OS_CPU_HEADER(zGlobals)

And i386 build fails with multiple errors like these:

/pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/os_cpu/linux_x86/zGlobals_linux_x86.hpp:83:65: error: left shift count >= width of type [-Werror=shift-count-overflow]
 const uintptr_t ZPlatformAddressSpaceStart = (uintptr_t)1 << ZPlatformAddressOffsetBits;
                                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
/pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/os_cpu/linux_x86/zGlobals_linux_x86.hpp:84:66: error: left shift count >= width of type [-Werror=shift-count-overflow]
 const uintptr_t ZPlatformAddressSpaceSize = ((uintptr_t)1 << ZPlatformAddressOffsetBits) * 4;
                                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~

We were at the same position at some point, but the requirements for shipping Shenandoah with 8u forced us to maintain buildable workspace even on platforms Shenandoah does not support.

Thanks,
-Aleksey

From shade at redhat.com Mon Dec 11 08:55:53 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 09:55:53 +0100
Subject: ZGC heap size and RSS counters
Message-ID:

Hi there,

I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:

VmPeak:   18256721392 kB
VmSize:   18256721392 kB
VmLck:            0 kB
VmPin:            0 kB
VmHWM:     50729036 kB
VmRSS:     50729036 kB
RssAnon:     369700 kB
RssFile:      27688 kB
RssShmem:  50331648 kB

Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
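
The triple accounting is easy to reproduce outside the JVM. Below is a minimal standalone C++ sketch (illustrative only, not part of the original message) that backs one physical region with memfd_create(2) and maps it at three virtual addresses, mirroring ZGC's three heap views; VmRSS/RssShmem then report roughly three times the backing size:

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  const size_t size = 64 * 1024 * 1024;              // 64M standing in for the heap
  int fd = memfd_create("heap", 0);                  // one physical backing (Linux >= 3.17, glibc >= 2.27)
  if (fd < 0 || ftruncate(fd, (off_t)size) != 0) return 1;
  for (int i = 0; i < 3; i++) {                      // three views, like ZGC's heap mappings
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    memset(p, i + 1, size);                          // touch every page through this view
  }
  char cmd[80];
  snprintf(cmd, sizeof(cmd), "grep -E 'VmRSS|RssShmem' /proc/%d/status", getpid());
  return system(cmd);                                // typically shows ~192M resident, not 64M
}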
Thanks,
-Aleksey

From per.liden at oracle.com Mon Dec 11 08:57:35 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 09:57:35 +0100
Subject: Binary builds and workspaces
In-Reply-To: <5e2e9fba-6409-0830-87ac-7349432569c5@redhat.com>
References: <776aa4f0-78f0-6e99-c37f-512f2136aa58@redhat.com> <903568e7-cfa7-f506-650e-93c392d038da@oracle.com> <10526d96-e61e-5111-f119-8556674a1898@redhat.com> <5e2e9fba-6409-0830-87ac-7349432569c5@redhat.com>
Message-ID:

On 2017-12-11 09:44, Aleksey Shipilev wrote:
> On 12/09/2017 01:09 AM, Aleksey Shipilev wrote:
>> On 12/08/2017 01:06 PM, Per Liden wrote:
>>> On 2017-12-07 20:42, Aleksey Shipilev wrote:
>>>> *) Binary builds:
>>>>    https://builds.shipilev.net/openjdk-zgc/
>>>>
>>>>    x86_64 is built natively, other targets are cross-compiled. Builds are updated every night, build frequency might change in future, in both directions. I would try to make build logs available too, for easier debugging.
>>>
>>> I noticed that you have aarch64 and IA32 builds there. Just a heads up, the ZGC code base only supports Linux/x86_64 and Solaris/Sparc at this time.
>>
>> That's not a problem. Shenandoah currently supports Linux/x86_64 and Linux/aarch64, but we build other platforms anyway, because sometimes changes in shared code may break the other platform's builds. It also helps to check that unsupported configurations fail predictably and reliably.
>
> For example, aarch64 build fails with:
>
> /pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/share/gc/z/zGlobals.hpp:29:33: fatal error: zGlobals_linux_aarch64.hpp: No such file or directory
>  #include OS_CPU_HEADER(zGlobals)
>
> And i386 build fails with multiple errors like these:
>
> /pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/os_cpu/linux_x86/zGlobals_linux_x86.hpp:83:65: error: left shift count >= width of type [-Werror=shift-count-overflow]
>  const uintptr_t ZPlatformAddressSpaceStart = (uintptr_t)1 << ZPlatformAddressOffsetBits;
>                                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~
> /pool/buildbot/slaves/sobornost/zgc/build/src/hotspot/os_cpu/linux_x86/zGlobals_linux_x86.hpp:84:66: error: left shift count >= width of type [-Werror=shift-count-overflow]
>  const uintptr_t ZPlatformAddressSpaceSize = ((uintptr_t)1 << ZPlatformAddressOffsetBits) * 4;
>                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~

Going forward, I'm hoping we can get to a point where we don't need "fake" platform-specific files, just to make stuff build. Instead I'm thinking we should evolve the build system to never attempt to build a GC if it's not supported on that platform. Think of it as a more fine-grained INCLUDE_ALL_GCS thing, but maybe implemented differently.

/Per

> We were at the same position at some point, but the requirements for shipping Shenandoah with 8u forced us to maintain buildable workspace even on platforms Shenandoah does not support.
>
> Thanks,
> -Aleksey

From shade at redhat.com Mon Dec 11 09:00:31 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 10:00:31 +0100
Subject: Bug: ZGC does not support NMT
Message-ID: <5279aa25-2f2f-d39e-9149-36b3c0dbec56@redhat.com>

Trying to run ZGC with NMT support, and there is no "Java Heap" section. I guess mtJavaHeap tag is missing in heap allocation code within ZGC.
$ java -XX:+UseZGC -Xms8g -Xmx8g -XX:NativeMemoryTracking=summary Hello

$ jcmd 57529 VM.native_memory

Total: reserved=310485KB, committed=66769KB
- Class (reserved=8303KB, committed=4463KB)
    (classes #461)
    (malloc=111KB #603)
    (mmap: reserved=8192KB, committed=4352KB)
    ( Metadata: )
    ( reserved=8192KB, committed=4352KB)
    ( used=3703KB)
    ( free=649KB)
    ( waste=0KB =0.00%)

- Thread (reserved=34978KB, committed=34978KB)
    (thread #35)
    (stack: reserved=34892KB, committed=34892KB)
    (malloc=47KB #187)
    (arena=40KB #56)

- Code (reserved=247726KB, committed=7850KB)
    (malloc=42KB #474)
    (mmap: reserved=247684KB, committed=7808KB)

- GC (reserved=16921KB, committed=16921KB)
    (malloc=537KB #51)
    (mmap: reserved=16384KB, committed=16384KB)

- Compiler (reserved=149KB, committed=149KB)
    (malloc=18KB #53)
    (arena=131KB #15)

- Internal (reserved=317KB, committed=317KB)
    (malloc=285KB #1058)
    (mmap: reserved=32KB, committed=32KB)

- Symbol (reserved=1727KB, committed=1727KB)
    (malloc=1111KB #1051)
    (arena=616KB #1)

- Native Memory Tracking (reserved=89KB, committed=89KB)
    (malloc=5KB #56)
    (tracking overhead=84KB)

- Arena Chunk (reserved=185KB, committed=185KB)
    (malloc=185KB)

- Logging (reserved=8KB, committed=8KB)
    (malloc=8KB #158)

- Arguments (reserved=17KB, committed=17KB)
    (malloc=17KB #463)

- Module (reserved=57KB, committed=57KB)
    (malloc=57KB #1129)

- Unknown (reserved=8KB, committed=8KB)
    (mmap: reserved=8KB, committed=8KB)

Thanks,
-Aleksey

From per.liden at oracle.com Mon Dec 11 09:14:42 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 10:14:42 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To:
References:
Message-ID: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>

Hi,

On 2017-12-11 09:55, Aleksey Shipilev wrote:
> Hi there,
>
> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>
> VmPeak:   18256721392 kB
> VmSize:   18256721392 kB
> VmLck:            0 kB
> VmPin:            0 kB
> VmHWM:     50729036 kB
> VmRSS:     50729036 kB
> RssAnon:     369700 kB
> RssFile:      27688 kB
> RssShmem:  50331648 kB
>
> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?

The kernel's RSS accounting is flaky at best, and varies depending on if you're using small or large pages (and it can also vary depending on which kernel version you're using).

On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll typically see that the same physical page will incorrectly be accounted for three times instead of once. On the other hand, when using large pages, you'll typically see a different behavior, as it's accounted to the hugetlbfs inode and not the process.

In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.

/Per

>
> Thanks,
> -Aleksey
>

From shade at redhat.com Mon Dec 11 09:35:01 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 10:35:01 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
Message-ID: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com>

Hi,

One of our tests (sorry, we can't share it, alas :() fails with a weird ZGC crash.
Running without JMH forking -- because I suspected the bug is triggered by serialization in the host-forked VM link -- reveals AbstractMethodError:

# Detecting actual CPU count: 16 detected
# JMH version: 1.19
# VM version: JDK 10-internal, VM 10-internal+0-adhoc.shade.zgc-zgc
# VM invoker: /home/shade/trunks/zgc-zgc/build/baseline/bin/java
# VM options: -Xmx8g -Xms8g -XX:+AlwaysPreTouch -XX:+UseZGC
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 16 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: ...

# Run progress: 0.00% complete, ETA 00:00:55
# Fork: N/A, test runs in the host VM
# *** WARNING: Non-forked runs may silently omit JVM options, mess up profilers, disable compiler hints, etc. ***
# *** WARNING: Use non-forked runs only for debugging purposes, not for actual performance runs. ***
# Warmup Iteration 1:

Exception in thread "main" java.lang.AbstractMethodError: java.lang.Throwable.printStackTrace(Ljava/io/PrintWriter;)V
	at org.openjdk.jmh.util.Utils.throwableToString(Utils.java:162)
	at org.openjdk.jmh.runner.BaseRunner.doSingle(BaseRunner.java:150)
	at org.openjdk.jmh.runner.BaseRunner.runBenchmarksEmbedded(BaseRunner.java:111)
	at org.openjdk.jmh.runner.Runner.runBenchmarks(Runner.java:550)
	at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:313)
	at org.openjdk.jmh.runner.Runner.run(Runner.java:206)
	at org.openjdk.jmh.Main.main(Main.java:71)

This does not happen with UseSerialGC. This does not happen with -XX:TieredStopAtLevel=1. This does not happen with -Xint. What's more frustrating, the issue is only reproducible with release bits, not with fastdebug.

It seems as if something has gone awry with C2? Do you have a hunch what that might be? And, how to diagnose this better?

The workload does lots of floating point math (which probably means XMM/YMM usage), that might be a clue.

Thanks,
-Aleksey

From shade at redhat.com Mon Dec 11 09:36:29 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 10:36:29 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>
Message-ID:

On 12/11/2017 10:14 AM, Per Liden wrote:
> On 2017-12-11 09:55, Aleksey Shipilev wrote:
>> Hi there,
>>
>> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>>
>> VmPeak:   18256721392 kB
>> VmSize:   18256721392 kB
>> VmLck:            0 kB
>> VmPin:            0 kB
>> VmHWM:     50729036 kB
>> VmRSS:     50729036 kB
>> RssAnon:     369700 kB
>> RssFile:      27688 kB
>> RssShmem:  50331648 kB
>>
>> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
>
> The kernel's RSS accounting is flaky at best, and varies depending on if you're using small or large pages (and it can also vary depending on which kernel version you're using).
>
> On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll typically see that the same physical page will incorrectly be accounted for three times instead of once. On the other hand, when using large pages, you'll typically see a different behavior, as it's accounted to the hugetlbfs inode and not the process.
>
> In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.

Understood, that's what I thought.
Do you think that is the problem in light of pervasive use of containers that allocate/limit resources based on RSS?

Thanks,
-Aleksey

From per.liden at oracle.com Mon Dec 11 09:39:47 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 10:39:47 +0100
Subject: Bug: ZGC does not support NMT
In-Reply-To: <5279aa25-2f2f-d39e-9149-36b3c0dbec56@redhat.com>
References: <5279aa25-2f2f-d39e-9149-36b3c0dbec56@redhat.com>
Message-ID: <76cda0f6-aaf0-9434-093e-1a82da57bb86@oracle.com>

We haven't added support for that yet. ZGC is not using the normal os::reserve/commit_memory paths for the heap memory so we don't get that "for free", as other GCs do.

/Per

On 2017-12-11 10:00, Aleksey Shipilev wrote:
> Trying to run ZGC with NMT support, and there is no "Java Heap" section. I guess mtJavaHeap tag is missing in heap allocation code within ZGC.
>
> $ java -XX:+UseZGC -Xms8g -Xmx8g -XX:NativeMemoryTracking=summary Hello
>
> $ jcmd 57529 VM.native_memory
>
> Total: reserved=310485KB, committed=66769KB
> - Class (reserved=8303KB, committed=4463KB)
>     (classes #461)
>     (malloc=111KB #603)
>     (mmap: reserved=8192KB, committed=4352KB)
>     ( Metadata: )
>     ( reserved=8192KB, committed=4352KB)
>     ( used=3703KB)
>     ( free=649KB)
>     ( waste=0KB =0.00%)
>
> - Thread (reserved=34978KB, committed=34978KB)
>     (thread #35)
>     (stack: reserved=34892KB, committed=34892KB)
>     (malloc=47KB #187)
>     (arena=40KB #56)
>
> - Code (reserved=247726KB, committed=7850KB)
>     (malloc=42KB #474)
>     (mmap: reserved=247684KB, committed=7808KB)
>
> - GC (reserved=16921KB, committed=16921KB)
>     (malloc=537KB #51)
>     (mmap: reserved=16384KB, committed=16384KB)
>
> - Compiler (reserved=149KB, committed=149KB)
>     (malloc=18KB #53)
>     (arena=131KB #15)
>
> - Internal (reserved=317KB, committed=317KB)
>     (malloc=285KB #1058)
>     (mmap: reserved=32KB, committed=32KB)
>
> - Symbol (reserved=1727KB, committed=1727KB)
>     (malloc=1111KB #1051)
>     (arena=616KB #1)
>
> - Native Memory Tracking (reserved=89KB, committed=89KB)
>     (malloc=5KB #56)
>     (tracking overhead=84KB)
>
> - Arena Chunk (reserved=185KB, committed=185KB)
>     (malloc=185KB)
>
> - Logging (reserved=8KB, committed=8KB)
>     (malloc=8KB #158)
>
> - Arguments (reserved=17KB, committed=17KB)
>     (malloc=17KB #463)
>
> - Module (reserved=57KB, committed=57KB)
>     (malloc=57KB #1129)
>
> - Unknown (reserved=8KB, committed=8KB)
>     (mmap: reserved=8KB, committed=8KB)
>
> Thanks,
> -Aleksey

From per.liden at oracle.com Mon Dec 11 09:59:51 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 10:59:51 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To:
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>
Message-ID:

On 2017-12-11 10:36, Aleksey Shipilev wrote:
> On 12/11/2017 10:14 AM, Per Liden wrote:
>> On 2017-12-11 09:55, Aleksey Shipilev wrote:
>>> Hi there,
>>>
>>> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>>>
>>> VmPeak:   18256721392 kB
>>> VmSize:   18256721392 kB
>>> VmLck:            0 kB
>>> VmPin:            0 kB
>>> VmHWM:     50729036 kB
>>> VmRSS:     50729036 kB
>>> RssAnon:     369700 kB
>>> RssFile:      27688 kB
>>> RssShmem:  50331648 kB
>>>
>>> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
>>
>> The kernel's RSS accounting is flaky at best, and varies depending on if you're using small or large pages (and it can also vary depending on which kernel version you're using).
>>
>> On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll typically see that the same physical page will incorrectly be accounted for three times instead of once. On the other hand, when using large pages, you'll typically see a different behavior, as it's accounted to the hugetlbfs inode and not the process.
>>
>> In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.
>
> Understood, that's what I thought. Do you think that is the problem in light of pervasive use of containers that allocate/limit resources based on RSS?

If RSS limits are used in a container, then I'd argue that the kernel better get the accounting right, otherwise those limits are fairly useless, wouldn't you say? In the kernel's defense, it is gradually getting better in this area.

/Per

>
> Thanks,
> -Aleksey
>

From shade at redhat.com Mon Dec 11 10:12:56 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 11:12:56 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To:
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>
Message-ID:

On 12/11/2017 10:59 AM, Per Liden wrote:
> On 2017-12-11 10:36, Aleksey Shipilev wrote:
>> On 12/11/2017 10:14 AM, Per Liden wrote:
>>> On 2017-12-11 09:55, Aleksey Shipilev wrote:
>>>> Hi there,
>>>>
>>>> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>>>>
>>>> VmPeak:   18256721392 kB
>>>> VmSize:   18256721392 kB
>>>> VmLck:            0 kB
>>>> VmPin:            0 kB
>>>> VmHWM:     50729036 kB
>>>> VmRSS:     50729036 kB
>>>> RssAnon:     369700 kB
>>>> RssFile:      27688 kB
>>>> RssShmem:  50331648 kB
>>>>
>>>> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
>>>
>>> The kernel's RSS accounting is flaky at best, and varies depending on if you're using small or large pages (and it can also vary depending on which kernel version you're using).
>>>
>>> On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll typically see that the same physical page will incorrectly be accounted for three times instead of once. On the other hand, when using large pages, you'll typically see a different behavior, as it's accounted to the hugetlbfs inode and not the process.
>>>
>>> In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.
>>
>> Understood, that's what I thought. Do you think that is the problem in light of pervasive use of containers that allocate/limit resources based on RSS?
>
> If RSS limits are used in a container, then I'd argue that the kernel better get the accounting right, otherwise those limits are fairly useless, wouldn't you say? In the kernel's defense, it is gradually getting better in this area.

I agree that's the kernel's job to account this properly. But I am also concerned about the practicalities with real deployments on current kernels :( Shenandoah is also about to do double-mapping for related reasons, and it would gradually come to the same trouble. I was wondering if you have observed problems with ZGC running in containers that shed more light on this concern.
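
A practical note for readers worried about the container case: what a memory-limited container actually enforces is the cgroup charge, and the memory cgroup controller charges each physical page only once, however many times it is mapped, so it is usually saner than per-process VmRSS here. The two views can be compared directly; paths below assume cgroup v1, and exact field names vary by kernel and cgroup version:

$ grep -E 'VmRSS|RssShmem' /proc/<pid>/status
$ grep -E '^(rss|cache|shmem)' /sys/fs/cgroup/memory/memory.stat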
Thanks,
-Aleksey

From per.liden at oracle.com Mon Dec 11 10:17:05 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 11:17:05 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com>
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com>
Message-ID: <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>

Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.

/Per

On 2017-12-11 10:35, Aleksey Shipilev wrote:
> Hi,
>
> One of our tests (sorry, we can't share it, alas :() fails with a weird ZGC crash. Running without JMH forking -- because I suspected the bug is triggered by serialization in the host-forked VM link -- reveals AbstractMethodError:
>
> # Detecting actual CPU count: 16 detected
> # JMH version: 1.19
> # VM version: JDK 10-internal, VM 10-internal+0-adhoc.shade.zgc-zgc
> # VM invoker: /home/shade/trunks/zgc-zgc/build/baseline/bin/java
> # VM options: -Xmx8g -Xms8g -XX:+AlwaysPreTouch -XX:+UseZGC
> # Warmup: 5 iterations, 1 s each
> # Measurement: 5 iterations, 10 s each
> # Timeout: 10 min per iteration
> # Threads: 16 threads, will synchronize iterations
> # Benchmark mode: Throughput, ops/time
> # Benchmark: ...
>
> # Run progress: 0.00% complete, ETA 00:00:55
> # Fork: N/A, test runs in the host VM
> # *** WARNING: Non-forked runs may silently omit JVM options, mess up profilers, disable compiler hints, etc. ***
> # *** WARNING: Use non-forked runs only for debugging purposes, not for actual performance runs. ***
> # Warmup Iteration 1:
>
> Exception in thread "main" java.lang.AbstractMethodError: java.lang.Throwable.printStackTrace(Ljava/io/PrintWriter;)V
> 	at org.openjdk.jmh.util.Utils.throwableToString(Utils.java:162)
> 	at org.openjdk.jmh.runner.BaseRunner.doSingle(BaseRunner.java:150)
> 	at org.openjdk.jmh.runner.BaseRunner.runBenchmarksEmbedded(BaseRunner.java:111)
> 	at org.openjdk.jmh.runner.Runner.runBenchmarks(Runner.java:550)
> 	at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:313)
> 	at org.openjdk.jmh.runner.Runner.run(Runner.java:206)
> 	at org.openjdk.jmh.Main.main(Main.java:71)
>
> This does not happen with UseSerialGC. This does not happen with -XX:TieredStopAtLevel=1. This does not happen with -Xint. What's more frustrating, the issue is only reproducible with release bits, not with fastdebug.
>
> It seems as if something has gone awry with C2? Do you have a hunch what that might be? And, how to diagnose this better?
>
> The workload does lots of floating point math (which probably means XMM/YMM usage), that might be a clue.
>
> Thanks,
> -Aleksey

From shade at redhat.com Mon Dec 11 10:23:39 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 11:23:39 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To: <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>
Message-ID:

On 12/11/2017 11:17 AM, Per Liden wrote:
> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.

Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0. Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.

What is puzzling is that only release bits are failing, not fastdebug.
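
For anyone wanting to chase a release-only C2 miscompile like this, one standard narrowing trick (a suggestion from the editor, not something proposed in the thread) is to push suspect methods back to the interpreter one at a time and watch whether the failure moves, starting from the method implicated by the AbstractMethodError or its callers:

$ java -XX:+UseZGC -XX:CompileCommand=exclude,java.lang.Throwable::printStackTrace ...

If one exclusion makes the error vanish, dumping that method's generated code with -XX:CompileCommand=print (which needs the hsdis disassembler plugin) in both release and fastdebug builds gives something concrete to diff.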
-Aleksey

From per.liden at oracle.com Mon Dec 11 11:19:17 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 12:19:17 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To:
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com>
Message-ID: <7e2d4feb-685a-4624-cb3d-e5cea3d64d3f@oracle.com>

On 2017-12-11 11:12, Aleksey Shipilev wrote:
> On 12/11/2017 10:59 AM, Per Liden wrote:
>> On 2017-12-11 10:36, Aleksey Shipilev wrote:
>>> On 12/11/2017 10:14 AM, Per Liden wrote:
>>>> On 2017-12-11 09:55, Aleksey Shipilev wrote:
>>>>> Hi there,
>>>>>
>>>>> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>>>>>
>>>>> VmPeak:   18256721392 kB
>>>>> VmSize:   18256721392 kB
>>>>> VmLck:            0 kB
>>>>> VmPin:            0 kB
>>>>> VmHWM:     50729036 kB
>>>>> VmRSS:     50729036 kB
>>>>> RssAnon:     369700 kB
>>>>> RssFile:      27688 kB
>>>>> RssShmem:  50331648 kB
>>>>>
>>>>> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
>>>>
>>>> The kernel's RSS accounting is flaky at best, and varies depending on if you're using small or large pages (and it can also vary depending on which kernel version you're using).
>>>>
>>>> On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll typically see that the same physical page will incorrectly be accounted for three times instead of once. On the other hand, when using large pages, you'll typically see a different behavior, as it's accounted to the hugetlbfs inode and not the process.
>>>>
>>>> In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.
>>>
>>> Understood, that's what I thought. Do you think that is the problem in light of pervasive use of containers that allocate/limit resources based on RSS?
>>
>> If RSS limits are used in a container, then I'd argue that the kernel better get the accounting right, otherwise those limits are fairly useless, wouldn't you say? In the kernel's defense, it is gradually getting better in this area.
>
> I agree that's the kernel's job to account this properly. But I am also concerned about the practicalities with real deployments on current kernels :( Shenandoah is also about to do double-mapping for related

Interesting. Do you also plan to do some kind of colored pointers or do you have some other use case for multi-mapping in Shenandoah?

> reasons, and it would gradually come to the same trouble. I was wondering if you have observed problems with ZGC running in containers that shed more light on this concern.

We haven't observed problems so far, but in all honesty ZGC has had limited exposure to deployments outside of Oracle and Intel, so I'm not sure to what degree we've just been lucky so far.

Should it be a real problem, there are various solutions and workarounds to pick from. For example, in such environments ZGC could run in what we call "colorless roots"-mode (not yet implemented), which would remove the need for multi-mapping altogether at the expense of an extra "and reg,imm8" instruction in the load barrier.
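
To make the trade-off concrete, here is a heavily simplified C++ sketch of the two schemes; the names, masks and slow path below are made up for illustration and are not ZGC's actual code:

#include <cstdint>

typedef void* oop;
static const uintptr_t kBadMask   = uintptr_t(3) << 42;   // illustrative "bad color" bits
static const uintptr_t kColorMask = uintptr_t(0xf) << 42; // illustrative color field

oop slow_path(oop* addr, oop o) { return o; }             // stub: mark/relocate/remap, heal *addr

// Multi-mapped heap: a good colored pointer is directly dereferenceable.
inline oop load_barrier(oop* addr) {
  oop o = *addr;
  if (reinterpret_cast<uintptr_t>(o) & kBadMask)          // one test plus a rarely taken branch
    o = slow_path(addr, o);
  return o;                                               // usable as-is, no masking needed
}

// "Colorless" variant: strip the color on every load, the extra "and" Per mentions.
inline oop load_barrier_colorless(oop* addr) {
  oop o = load_barrier(addr);
  return reinterpret_cast<oop>(reinterpret_cast<uintptr_t>(o) & ~kColorMask);
}

The multi-mapped flavor buys its shorter fast path by reserving one virtual heap view per color, which is exactly what inflates the RSS counters discussed above.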
cheers,
Per

>
> Thanks, -Aleksey
>
>

From per.liden at oracle.com Mon Dec 11 11:24:43 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 12:24:43 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To:
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>
Message-ID:

On 2017-12-11 11:23, Aleksey Shipilev wrote:
> On 12/11/2017 11:17 AM, Per Liden wrote:
>> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.
>
> Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0. Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.
>
> What is puzzling is that only release bits are failing, not fastdebug.

Ok, thanks! Nils will try to look into this. Is there anything more about this test you could share? Or can the test be trimmed down and pruned of any sensitive information to make it shareable?

thanks,
Per

>
> -Aleksey
>

From per.liden at oracle.com Mon Dec 11 12:31:47 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 13:31:47 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To:
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>
Message-ID: <2ad536ef-ff9c-f391-bb93-22d9a25fe498@oracle.com>

Hi,

On 2017-12-11 12:24, Per Liden wrote:
> On 2017-12-11 11:23, Aleksey Shipilev wrote:
>> On 12/11/2017 11:17 AM, Per Liden wrote:
>>> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.
>>
>> Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0. Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.
>>
>> What is puzzling is that only release bits are failing, not fastdebug.
>
> Ok, thanks! Nils will try to look into this. Is there anything more about this test you could share? Or can the test be trimmed down and pruned of any sensitive information to make it shareable?

Would it be possible to do two additional runs with your test? One with:

-XX:+UseBasicLoadBarrier

and another with:

-XX:-UseR15TestInLoadBarrier

That might help us narrow this down a bit.

thanks,
Per

>
> thanks,
> Per
>
>>
>> -Aleksey
>>

From shade at redhat.com Mon Dec 11 12:39:07 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 13:39:07 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To: <2ad536ef-ff9c-f391-bb93-22d9a25fe498@oracle.com>
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com> <2ad536ef-ff9c-f391-bb93-22d9a25fe498@oracle.com>
Message-ID:

On 12/11/2017 01:31 PM, Per Liden wrote:
> Would it be possible to do two additional runs with your test? One with:
>
> -XX:+UseBasicLoadBarrier

Fails.

> and another with:
>
> -XX:-UseR15TestInLoadBarrier

Fails.

-Aleksey

From per.liden at oracle.com Mon Dec 11 12:40:38 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 13:40:38 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To:
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com> <2ad536ef-ff9c-f391-bb93-22d9a25fe498@oracle.com>
Message-ID:

Thanks a lot!
/Per

On 2017-12-11 13:39, Aleksey Shipilev wrote:
> On 12/11/2017 01:31 PM, Per Liden wrote:
>> Would it be possible to do two additional runs with your test? One with:
>>
>> -XX:+UseBasicLoadBarrier
>
> Fails.
>
>> and another with:
>>
>> -XX:-UseR15TestInLoadBarrier
>
> Fails.
>
>
> -Aleksey
>

From shade at redhat.com Mon Dec 11 12:47:10 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 13:47:10 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To:
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com>
Message-ID: <2a7698ee-9d1c-f678-a0ef-c3ccc2739b6c@redhat.com>

On 12/11/2017 12:24 PM, Per Liden wrote:
> On 2017-12-11 11:23, Aleksey Shipilev wrote:
>> On 12/11/2017 11:17 AM, Per Liden wrote:
>>> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.
>>
>> Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0. Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.
>>
>> What is puzzling is that only release bits are failing, not fastdebug.
>
> Ok, thanks! Nils will try to look into this. Is there anything more about this test you could share? Or can the test be trimmed down and pruned of any sensitive information to make it shareable?

That workload is the variant of SPECjvm2008:mpegaudio. My previous attempts to reproduce it failed, but now I can reliably fail it with:

$ ~/trunks/zgc-zgc/build/linux-x86_64-normal-server-release/jdk/bin/java -Xmx1g -Xms1g -XX:+AlwaysPreTouch -XX:-TieredCompilation -XX:+UseZGC -jar SPECjvm2008.jar -ikv -ict mpegaudio -bt 1

SPECjvm2008 Base
Properties file: none
Benchmarks: mpegaudio

WARNING: Run will not be compliant.
Property specjvm.run.checksum.validation must be true for publication.
Not a compliant sequence of benchmarks for publication.
Property specjvm.run.initial.check must be true for publication.

--- --- --- --- --- --- --- --- ---

Benchmark: mpegaudio
Run mode: timed run
Test type: multi
Threads: 1
Warmup: 120s
Iterations: 1
Run length: 240s

Warmup (120s) begins: Mon Dec 11 13:45:06 CET 2017
Warmup (120s) ends: Mon Dec 11 13:45:06 CET 2017
Warmup (120s) result: **NOT VALID**

Errors in benchmark: mpegaudio
[warmup] Harness interruped during measurement period.
[warmup][bt:1|op:1] java.lang.AbstractMethodError: java.lang.Exception.printStackTrace(Ljava/io/PrintStream;)V
Score on mpegaudio: **NOT VALID**

Benchmark mpegaudio failed. Aborting run.

Thanks,
-Aleksey

From stefan.karlsson at oracle.com Mon Dec 11 12:51:08 2017
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 11 Dec 2017 13:51:08 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To: <2a7698ee-9d1c-f678-a0ef-c3ccc2739b6c@redhat.com>
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com> <2a7698ee-9d1c-f678-a0ef-c3ccc2739b6c@redhat.com>
Message-ID:

Hi Aleksey,

On 2017-12-11 13:47, Aleksey Shipilev wrote:
> On 12/11/2017 12:24 PM, Per Liden wrote:
>> On 2017-12-11 11:23, Aleksey Shipilev wrote:
>>> On 12/11/2017 11:17 AM, Per Liden wrote:
>>>> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.
>>>
>>> Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0.
>>> Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.
>>>
>>> What is puzzling is that only release bits are failing, not fastdebug.
>>
>> Ok, thanks! Nils will try to look into this. Is there anything more about this test you could share? Or can the test be trimmed down and pruned of any sensitive information to make it shareable?
>
> That workload is the variant of SPECjvm2008:mpegaudio. My previous attempts to reproduce it failed, but now I can reliably fail it with:
>
> $ ~/trunks/zgc-zgc/build/linux-x86_64-normal-server-release/jdk/bin/java -Xmx1g -Xms1g -XX:+AlwaysPreTouch -XX:-TieredCompilation -XX:+UseZGC -jar SPECjvm2008.jar -ikv -ict mpegaudio -bt 1

Thanks! I can reproduce the problem here. We do run SPECjvm2008, but not with this configuration.

StefanK

>
> SPECjvm2008 Base
> Properties file: none
> Benchmarks: mpegaudio
>
> WARNING: Run will not be compliant.
> Property specjvm.run.checksum.validation must be true for publication.
> Not a compliant sequence of benchmarks for publication.
> Property specjvm.run.initial.check must be true for publication.
>
> --- --- --- --- --- --- --- --- ---
>
> Benchmark: mpegaudio
> Run mode: timed run
> Test type: multi
> Threads: 1
> Warmup: 120s
> Iterations: 1
> Run length: 240s
>
> Warmup (120s) begins: Mon Dec 11 13:45:06 CET 2017
> Warmup (120s) ends: Mon Dec 11 13:45:06 CET 2017
> Warmup (120s) result: **NOT VALID**
>
> Errors in benchmark: mpegaudio
> [warmup] Harness interruped during measurement period.
> [warmup][bt:1|op:1] java.lang.AbstractMethodError: java.lang.Exception.printStackTrace(Ljava/io/PrintStream;)V
> Score on mpegaudio: **NOT VALID**
>
> Benchmark mpegaudio failed. Aborting run.
>
> Thanks,
> -Aleksey

From per.liden at oracle.com Mon Dec 11 12:54:43 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 11 Dec 2017 13:54:43 +0100
Subject: Bug: ZGC crashes with AbstractMethodError
In-Reply-To:
References: <6f038a70-71ac-6396-9fbf-48198ad619d0@redhat.com> <43ee8afe-dbca-e572-61fc-03bb015974d1@oracle.com> <2a7698ee-9d1c-f678-a0ef-c3ccc2739b6c@redhat.com>
Message-ID: <5a36fc8f-0462-c8f0-0a5c-4f47f3df1a82@oracle.com>

On 2017-12-11 13:51, Stefan Karlsson wrote:
> Hi Aleksey,
>
> On 2017-12-11 13:47, Aleksey Shipilev wrote:
>> On 12/11/2017 12:24 PM, Per Liden wrote:
>>> On 2017-12-11 11:23, Aleksey Shipilev wrote:
>>>> On 12/11/2017 11:17 AM, Per Liden wrote:
>>>>> Thanks for reporting! Are you by any chance running on a recent AMD machine? We know about a C2 bug only provoked on those CPUs. Nils is working on a patch for that.
>>>>
>>>> Nope, this is i7-7820X on Linux 4.10.0-38-generic x86_64, compiled with gcc version 5.4.0. Also fails on i7-4790K on Linux 4.9.0-4-amd64 compiled with gcc version 6.3.0.
>>>>
>>>> What is puzzling is that only release bits are failing, not fastdebug.
>>>
>>> Ok, thanks! Nils will try to look into this. Is there anything more about this test you could share? Or can the test be trimmed down and pruned of any sensitive information to make it shareable?
>>
>> That workload is the variant of SPECjvm2008:mpegaudio. My previous attempts to reproduce it failed, but now I can reliably fail it with:
>>
>> $ ~/trunks/zgc-zgc/build/linux-x86_64-normal-server-release/jdk/bin/java -Xmx1g -Xms1g -XX:+AlwaysPreTouch -XX:-TieredCompilation -XX:+UseZGC -jar SPECjvm2008.jar -ikv -ict mpegaudio -bt 1
>
> Thanks! I can reproduce the problem here. We do run SPECjvm2008, but not with this configuration.
Awesome! Thanks Aleksey!

/Per

>
> StefanK
>
>>
>> SPECjvm2008 Base
>> Properties file: none
>> Benchmarks: mpegaudio
>>
>> WARNING: Run will not be compliant.
>> Property specjvm.run.checksum.validation must be true for publication.
>> Not a compliant sequence of benchmarks for publication.
>> Property specjvm.run.initial.check must be true for publication.
>>
>> --- --- --- --- --- --- --- --- ---
>>
>> Benchmark: mpegaudio
>> Run mode: timed run
>> Test type: multi
>> Threads: 1
>> Warmup: 120s
>> Iterations: 1
>> Run length: 240s
>>
>> Warmup (120s) begins: Mon Dec 11 13:45:06 CET 2017
>> Warmup (120s) ends: Mon Dec 11 13:45:06 CET 2017
>> Warmup (120s) result: **NOT VALID**
>>
>> Errors in benchmark: mpegaudio
>> [warmup] Harness interruped during measurement period.
>> [warmup][bt:1|op:1] java.lang.AbstractMethodError: java.lang.Exception.printStackTrace(Ljava/io/PrintStream;)V
>> Score on mpegaudio: **NOT VALID**
>>
>> Benchmark mpegaudio failed. Aborting run.
>>
>> Thanks,
>> -Aleksey

From rkennke at redhat.com Mon Dec 11 13:26:39 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 11 Dec 2017 14:26:39 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To: <7e2d4feb-685a-4624-cb3d-e5cea3d64d3f@oracle.com>
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com> <7e2d4feb-685a-4624-cb3d-e5cea3d64d3f@oracle.com>
Message-ID: <06793e93-eacc-bc7b-ca7a-6831a4b53b5c@redhat.com>

>> I agree that's the kernel's job to account this properly. But I am also concerned about the practicalities with real deployments on current kernels :( Shenandoah is also about to do double-mapping for related
>
> Interesting. Do you also plan to do some kind of colored pointers or do you have some other use case for multi-mapping in Shenandoah?

We intend to use it for failure handling. If a thread T1 fails to evacuate an object in a write-barrier (e.g. because it runs OOM), it flips a bit in the object's forwarding pointer, and thus prevents any future CASes on it (i.e. evacuations by other threads). It can then safely carry on to write into the object, even though it hasn't been evacuated. (Subsequent full-gc sorts out the mess left behind by this.)

However, it would require masking out that bit in all read-barriers (what you call load-barriers) and write-barriers, and thus impact performance in the very common paths (oom-during-evac basically never happens. except when it does).
When we double-map the heap, we can flip the bit such that the fwd ptr now points to the alias mapping, and that aliased fwd pointer can still be used for memory addressing.

Nice to see other uses of aliased memory mapping, and that we're not the only folks that need an os.hpp API for this ;-)

Cheers, Roman

From shade at redhat.com Mon Dec 11 13:50:21 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 14:50:21 +0100
Subject: ZGC heap size and RSS counters
In-Reply-To: <06793e93-eacc-bc7b-ca7a-6831a4b53b5c@redhat.com>
References: <0cceeb2e-2e24-5359-1121-f98b6ac55bfb@oracle.com> <7e2d4feb-685a-4624-cb3d-e5cea3d64d3f@oracle.com> <06793e93-eacc-bc7b-ca7a-6831a4b53b5c@redhat.com>
Message-ID: <4e4b01a4-ce73-1559-0e5f-042c745941f5@redhat.com>

On 12/11/2017 02:26 PM, Roman Kennke wrote:
> However, it would require masking out that bit in all read-barriers (what you call load-barriers) and write-barriers, and thus impact performance in the very common paths (oom-during-evac basically never happens. except when it does). When we double-map the heap, we can flip the bit such that the fwd ptr now points to the alias mapping, and that aliased fwd pointer can still be used for memory addressing.

That's actually the thing that pushed us towards aliased heap. We measured up to 5% additional overhead if we do simple masking on RB paths, making this a no-go throughput-budget-wise. And doing aliased heap raises RSS accounting questions, as we see with ZGC and current kernels. So, we have two imperfect solutions on our hands :)

-Aleksey

From shade at redhat.com Mon Dec 11 16:52:48 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 17:52:48 +0100
Subject: RFE: -Xlog:gc to show pause/concurrent phases duration and heap occupancy
Message-ID: <4ea0f50e-49dc-cad4-489f-7bbb2fd98ef1@redhat.com>

Current -Xlog:gc log with ZGC is rather opaque:

[19.964s][info][gc] GC(0) Garbage Collection (Warmup) 1646M(10%)->596M(4%)
[31.690s][info][gc] GC(1) Garbage Collection (Warmup) 3304M(20%)->384M(2%)
[38.666s][info][gc] GC(2) Garbage Collection (Warmup) 5004M(31%)->1428M(9%)
[46.724s][info][gc] GC(3) Garbage Collection (Allocation Rate) 9198M(56%)->8132M(50%)
[49.010s][info][gc] GC(4) Garbage Collection (Allocation Rate) 8596M(52%)->9786M(60%)
[51.373s][info][gc] GC(5) Garbage Collection (Allocation Rate) 10348M(63%)->9800M(60%)
[53.698s][info][gc] GC(6) Garbage Collection (Allocation Rate) 10002M(61%)->10724M(65%)
[55.434s][info][gc] GC(7) Garbage Collection (Allocation Rate) 10778M(66%)->7956M(49%)
[57.446s][info][gc] GC(8) Garbage Collection (Allocation Rate) 8420M(51%)->10896M(67%)

The real pause work hides under -Xlog:gc+phases:

[197.362s][info][gc,phases] GC(108) Pause Mark Start 0.697ms
[198.119s][info][gc,phases] GC(108) Concurrent Mark 757.226ms
[198.135s][info][gc,phases] GC(108) Pause Mark End 0.802ms
[198.136s][info][gc,phases] GC(108) Concurrent References Processing 1.461ms
[198.149s][info][gc,phases] GC(108) Concurrent Reset Relocation Set 12.285ms
[198.154s][info][gc,phases] GC(108) Concurrent Destroy Detached Pages 0.001ms
[198.158s][info][gc,phases] GC(108) Concurrent Select Relocation Set 3.239ms
[198.332s][info][gc,phases] GC(108) Concurrent Prepare Relocation Set 173.931ms
[198.333s][info][gc,phases] GC(108) Pause Relocate Start 1.010ms
[198.819s][info][gc,phases] GC(108) Concurrent Relocate 485.961ms

Can/should we redo this to match what e.g. Shenandoah is doing: phases to be at "gc" tag, and concurrent phases to report how heap had changed while phases were running? This should help new users to understand what collector is doing better, and also allow better comparison against other OpenJDK collectors, that report heap occupancy and pauses under "gc" tag.

So ZGC output would look like:

[info][gc] GC(108) Pause Mark Start 0.697ms
[info][gc] GC(108) Concurrent Mark xxxxM->xxxxM 757.226ms
[info][gc] GC(108) Pause Mark End 0.802ms
[info][gc] GC(108) Concurrent References Processing xxxxM->xxxxM 1.461ms
[info][gc] GC(108) Concurrent Prepare Relocate xxxxM->xxxxM 12.285ms
[info][gc] GC(108) Pause Relocate Start 1.010ms
[info][gc] GC(108) Concurrent Relocate xxxxM->xxxxM 485.961ms

"Prepare Relocate" combines all Relocation Set activities -- those can be under gc+phases if anyone wants it.
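
As a side note for readers following along (not part of the RFE itself): unified logging can already select both levels at once, so the combined view is available today without any re-tagging, using standard -Xlog selector syntax; the application name below is a placeholder:

$ java -XX:+UseZGC -Xlog:gc,gc+phases -jar app.jar          # one-line summaries plus phase durations
$ java -XX:+UseZGC -Xlog:gc*:file=gc.log:time -jar app.jar  # every gc-related tag set, decorated, to a file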
Thanks,
-Aleksey

From per.liden at oracle.com Tue Dec 12 09:22:55 2017
From: per.liden at oracle.com (Per Liden)
Date: Tue, 12 Dec 2017 10:22:55 +0100
Subject: RFE: -Xlog:gc to show pause/concurrent phases duration and heap occupancy
In-Reply-To: <4ea0f50e-49dc-cad4-489f-7bbb2fd98ef1@redhat.com>
References: <4ea0f50e-49dc-cad4-489f-7bbb2fd98ef1@redhat.com>
Message-ID:

Hi Aleksey,

The logging code in ZGC is in the middle of a transition from the "old" style to a "new" style. The transition isn't complete yet, which is why it might not appear optimal at this time. In the "old" style, we had pretty much what you propose, which is also sort of similar to what G1 and CMS are doing. However, we believe we have a great opportunity to re-think and improve things, which is why we started this transition. Once completed, I think it will serve both users and GC devs better.

In the "new" style, we're aiming for the following:

* -Xlog:gc - Enables basic logging. Basically one single line per GC, non-verbose, giving you only the most important information. This is something you would typically enable when you just want to keep an eye on what the GC does, to get a feeling for the heap size, pause times, etc. It's not intended for anyone who wants to tune or dig deeper. Rather, something you might use once all tuning is completed, but you still want some information. A user that isn't very familiar with (or doesn't care about) how ZGC works internally should still find this information useful. The line we print today is missing some time-related information. This line will in the future look something like this:

GC(0) Garbage Collection (Warmup) 1646M(10%)->596M(4%), 1.0ms/1.1ms/1.2ms

Where the last three numbers represent the pause times (mark start, mark end, reloc start) during that GC cycle. We might also want to add some notion of the total GC cycle time here.

* -Xlog:gc* - Enables verbose logging. This is the go-to logging option when you want to tune, debug or dig deeper. Obviously much more verbose, making use of appropriate log tags so that the output can be refined/filtered as needed. Interpreting information here typically requires some level of understanding of ZGC. Many of the log tag combinations have additional information on the debug/trace level, normally only useful for GC devs debugging ZGC itself. The output you see today in this mode is also in the middle of our transition, and hence incomplete. We plan to move much of the information currently printed in the gc+stats table under more specific tags and let it be printed once per GC.

While there's some value in having different GCs do logging in a similar way, I see an even greater value in providing really good and useful logging. In both ZGC and Shenandoah we have the rare opportunity to make things better, without having to deal with old baggage or old ways of doing things, which we're trying to embrace.
cheers,
Per

On 2017-12-11 17:52, Aleksey Shipilev wrote:
> Current -Xlog:gc log with ZGC is rather opaque:
>
> [19.964s][info][gc] GC(0) Garbage Collection (Warmup) 1646M(10%)->596M(4%)
> [31.690s][info][gc] GC(1) Garbage Collection (Warmup) 3304M(20%)->384M(2%)
> [38.666s][info][gc] GC(2) Garbage Collection (Warmup) 5004M(31%)->1428M(9%)
> [46.724s][info][gc] GC(3) Garbage Collection (Allocation Rate) 9198M(56%)->8132M(50%)
> [49.010s][info][gc] GC(4) Garbage Collection (Allocation Rate) 8596M(52%)->9786M(60%)
> [51.373s][info][gc] GC(5) Garbage Collection (Allocation Rate) 10348M(63%)->9800M(60%)
> [53.698s][info][gc] GC(6) Garbage Collection (Allocation Rate) 10002M(61%)->10724M(65%)
> [55.434s][info][gc] GC(7) Garbage Collection (Allocation Rate) 10778M(66%)->7956M(49%)
> [57.446s][info][gc] GC(8) Garbage Collection (Allocation Rate) 8420M(51%)->10896M(67%)
>
> The real pause work hides under -Xlog:gc+phases:
>
> [197.362s][info][gc,phases] GC(108) Pause Mark Start 0.697ms
> [198.119s][info][gc,phases] GC(108) Concurrent Mark 757.226ms
> [198.135s][info][gc,phases] GC(108) Pause Mark End 0.802ms
> [198.136s][info][gc,phases] GC(108) Concurrent References Processing 1.461ms
> [198.149s][info][gc,phases] GC(108) Concurrent Reset Relocation Set 12.285ms
> [198.154s][info][gc,phases] GC(108) Concurrent Destroy Detached Pages 0.001ms
> [198.158s][info][gc,phases] GC(108) Concurrent Select Relocation Set 3.239ms
> [198.332s][info][gc,phases] GC(108) Concurrent Prepare Relocation Set 173.931ms
> [198.333s][info][gc,phases] GC(108) Pause Relocate Start 1.010ms
> [198.819s][info][gc,phases] GC(108) Concurrent Relocate 485.961ms
>
> Can/should we redo this to match what e.g. Shenandoah is doing: phases to be at "gc" tag, and concurrent phases to report how heap had changed while phases were running? This should help new users to understand what collector is doing better, and also allow better comparison against other OpenJDK collectors, that report heap occupancy and pauses under "gc" tag.
>
> So ZGC output would look like:
>
> [info][gc] GC(108) Pause Mark Start 0.697ms
> [info][gc] GC(108) Concurrent Mark xxxxM->xxxxM 757.226ms
> [info][gc] GC(108) Pause Mark End 0.802ms
> [info][gc] GC(108) Concurrent References Processing xxxxM->xxxxM 1.461ms
> [info][gc] GC(108) Concurrent Prepare Relocate xxxxM->xxxxM 12.285ms
> [info][gc] GC(108) Pause Relocate Start 1.010ms
> [info][gc] GC(108) Concurrent Relocate xxxxM->xxxxM 485.961ms
>
> "Prepare Relocate" combines all Relocation Set activities -- those can be under gc+phases if anyone wants it.
>
> Thanks,
> -Aleksey

From per.liden at oracle.com Tue Dec 12 10:42:43 2017
From: per.liden at oracle.com (Per Liden)
Date: Tue, 12 Dec 2017 11:42:43 +0100
Subject: RFR: Add NMT support for Java heap
Message-ID: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com>

As Aleksey noticed, we don't register the Java heap with the native memory tracker. Here's a patch to do that.

http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/

cheers,
Per

From shade at redhat.com Tue Dec 12 10:50:09 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 12 Dec 2017 11:50:09 +0100
Subject: RFR: Add NMT support for Java heap
In-Reply-To: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com>
References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com>
Message-ID:

On 12/12/2017 11:42 AM, Per Liden wrote:
> As Aleksey noticed, we don't register the Java heap with the native memory tracker. Here's a patch to do that.
> > http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ Patch looks good, but the NMT "reserved" data is off the charts: Total: reserved=17180280855KB, committed=17143487KB - Java Heap (reserved=17179869184KB, committed=16777216KB) (mmap: reserved=17179869184KB, committed=16777216KB) I guess this should not pass ZAddressSpaceSize, and instead tell the reserved space of the first mapping? + // Register address space with native memory tracker + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); Thanks, -Aleksey From per.liden at oracle.com Tue Dec 12 10:42:47 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Dec 2017 11:42:47 +0100 Subject: RFR: Remove unused ZLoadBarrierMediumPath option Message-ID: Just a cleanup patch to remove the ZLoadBarrierMediumPath option, which isn't used anymore. We'll add it again if the need arises. http://cr.openjdk.java.net/~pliden/zgc/remove_loadbarriermediumpath_option/webrev.0/ cheers, Per From shade at redhat.com Tue Dec 12 10:57:38 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Dec 2017 11:57:38 +0100 Subject: RFR: Remove unused ZLoadBarrierMediumPath option In-Reply-To: References: Message-ID: On 12/12/2017 11:42 AM, Per Liden wrote: > Just a cleanup patch to remove the ZLoadBarrierMediumPath option, which isn't used anymore. We'll > add it again if the need arises. > > http://cr.openjdk.java.net/~pliden/zgc/remove_loadbarriermediumpath_option/webrev.0/ Looks good. -Aleksey From per.liden at oracle.com Tue Dec 12 11:20:26 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Dec 2017 12:20:26 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> Message-ID: On 2017-12-12 11:50, Aleksey Shipilev wrote: > On 12/12/2017 11:42 AM, Per Liden wrote: >> As Aleksey noticed, we don't register the Java heap with the native memory tracker. Here's a patch >> to do that. >> >> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ > > Patch looks good, but the NMT "reserved" data is off the charts: > > Total: reserved=17180280855KB, committed=17143487KB > - Java Heap (reserved=17179869184KB, committed=16777216KB) > (mmap: reserved=17179869184KB, committed=16777216KB) > > I guess this should not pass ZAddressSpaceSize, and instead tell the reserved space of the first > mapping? > > + // Register address space with native memory tracker > + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); I think this is correct actually. But it depends on how one views things I guess. As I see it, I want to be able to look in /proc/../maps and with NMT see what the different mappings correlate to. If we only registered the first heap view, then there would be a big mysterious reservation that would go unaccounted. That doesn't sound right to me, but I'm open to hearing other opinions on this. The big number there covers all addresses for all heap views/mappings (i.e. the actual address space that is reserved). It should be noted that, in ZGC, the heap address space doesn't have a 1:1 relation with max heap size.
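To illustrate what such a registration could look like, here is a minimal sketch assuming HotSpot's MemTracker API of that era; the nmt_reserve name comes from the patch hunk quoted above, but the body shown here is an assumption, not the actual webrev contents:

  #include "services/memTracker.hpp"

  // Hypothetical helper: report the whole reserved ZGC address range
  // (covering all heap views) to NMT under the Java heap category.
  static void nmt_reserve(uintptr_t start, size_t size) {
    // CALLER_PC records the call site, so NMT's detail mode can
    // attribute this reservation to a stack frame.
    MemTracker::record_virtual_memory_reserve((void*)start, size, CALLER_PC, mtJavaHeap);
  }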
cheers, Per > > Thanks, > -Aleksey > From shade at redhat.com Tue Dec 12 11:28:30 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Dec 2017 12:28:30 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> Message-ID: On 12/12/2017 12:20 PM, Per Liden wrote: > On 2017-12-12 11:50, Aleksey Shipilev wrote: >> On 12/12/2017 11:42 AM, Per Liden wrote: >>> As Aleksey noticed, we don't register the Java heap with the native memory tracker. Here's a patch >>> to do that. >>> >>> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >> >> Patch looks good, but the NMT "reserved" data is off the charts: >> >> Total: reserved=17180280855KB, committed=17143487KB >> - Java Heap (reserved=17179869184KB, committed=16777216KB) >> (mmap: reserved=17179869184KB, committed=16777216KB) >> >> I guess this should not pass ZAddressSpaceSize, and instead tell the reserved space of the first >> mapping? >> >> + // Register address space with native memory tracker >> + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); > > I think this is correct actually. But it depends on how one views things I guess. As I see it, I > want to be able to look in /proc/../maps and with NMT see what the different mappings correlate to. > If we only registered the first heap view, then there would be a big mysterious reservation that > would go unaccounted. That doesn't sound right to me, but I'm open to hearing other opinions on > this. The big number there covers all addresses for all heap views/mappings (i.e. the actual address > space that is reserved). It should be noted that, in ZGC, the heap address space doesn't have a 1:1 > relation with max heap size. In single-mapping GCs with -Xmx100g, I would expect to see reserved=100G for Java heap. In multi-mapping GCs with -Xmx100g, I would expect to see either reserved=100G, or reserved=N*100G, where N is the number of mappings. Looking at /proc for ZGC, it seems we reserve the entire bulk from the "lo" of the first mapping to the "hi" of the last mapping for the Java heap? VmPeak: 18256719348 kB VmSize: 18256719348 kB Oh wow. So NMT is not lying there. But this looks like an overly pessimistic thing to do. If there are multiple mappings that differ in upper bits, that means there is enough unused space between the mappings, and we don't actually have to reserve it? The way I look at it, "reserve" is something that helps to diagnose memory handling problems, and over-reservation masks most of that. It probably makes sense to commit this NMT patch, and then figure out if we want to reserve less? Thanks, -Aleksey From per.liden at oracle.com Tue Dec 12 11:27:25 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Dec 2017 12:27:25 +0100 Subject: RFR: Remove unused ZLoadBarrierMediumPath option In-Reply-To: References: Message-ID: On 2017-12-12 11:57, Aleksey Shipilev wrote: > On 12/12/2017 11:42 AM, Per Liden wrote: >> Just a cleanup patch to remove the ZLoadBarrierMediumPath option, which isn't used anymore. We'll >> add it again if the need arises. >> >> http://cr.openjdk.java.net/~pliden/zgc/remove_loadbarriermediumpath_option/webrev.0/ > > Looks good. Thanks!
/Per From per.liden at oracle.com Tue Dec 12 12:06:31 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Dec 2017 13:06:31 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> Message-ID: <5c13dd46-553e-545b-e552-c4fa82ecaaf3@oracle.com> On 2017-12-12 12:28, Aleksey Shipilev wrote: > On 12/12/2017 12:20 PM, Per Liden wrote: >> On 2017-12-12 11:50, Aleksey Shipilev wrote: >>> On 12/12/2017 11:42 AM, Per Liden wrote: >>>> As Aleksey noticed, we don't register the Java heap with the native memory tracker. Here's a patch >>>> to do that. >>>> >>>> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >>> >>> Patch looks good, but the NMT "reserved" data is off the charts: >>> >>> Total: reserved=17180280855KB, committed=17143487KB >>> - Java Heap (reserved=17179869184KB, committed=16777216KB) >>> (mmap: reserved=17179869184KB, committed=16777216KB) >>> >>> I guess this should not pass ZAddressSpaceSize, and instead tell the reserved space of the first >>> mapping? >>> >>> + // Register address space with native memory tracker >>> + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); >> >> I think this is correct actually. But it depends on how one views things I guess. As I see it, I >> want to be able to look in /proc/../maps and with NMT see what the different mappings correlate to. >> If we only registered the first heap view, then there would be a big mysterious reservation that >> would go unaccounted. That doesn't sound right to me, but I'm open to hearing other opinions on >> this. The big number there covers all addresses for all heap views/mappings (i.e. the actual address >> space that is reserved). It should be noted that, in ZGC, the heap address space doesn't have a 1:1 >> relation with max heap size. > > In single-mapping GCs with -Xmx100g, I would expect to see reserved=100G for Java heap. > > In multi-mapping GCs with -Xmx100g, I would expect to see either reserved=100G, or reserved=N*100G, > where N is the number of mappings. > > Looking at /proc for ZGC, it seems we reserve the entire bulk from the "lo" of the first mapping to > the "hi" of the last mapping for the Java heap? > > VmPeak: 18256719348 kB > VmSize: 18256719348 kB > > Oh wow. So NMT is not lying there. > > But this looks like an overly pessimistic thing to do. If there are multiple mappings that differ in > upper bits, that means there is enough unused space between the mappings, and we don't actually have > to reserve it? We actually do want to reserve it anyway, for two reasons. 1) In ZGC, by having a heap address space much larger than the heap size we are essentially immune to address space fragmentation. I.e. we can always find a hole big enough for any allocation, without the need to first compact the heap. 2) CollectedHeap::is_in_reserved() is often used as an inexpensive check of whether something points into the heap. By reserving all addresses in all heap views this check remains inexpensive, as we know that some other random mmap() call in the VM didn't end up in-between two heap views. This is also very useful when debugging. For example, when dumping memory or looking at stack traces you can easily see if something is an oop or not (oops always start with 0x00001..., 0x000008... or 0x000004...). > The way I look at it, "reserve" is something that helps to diagnose memory handling problems, and > over-reservation masks most of that. I agree that this would also be a reasonable way of looking at this.
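To make the two reasons above concrete, here is a small illustrative sketch (not ZGC source; the names are hypothetical, and the bit positions are chosen only to match the 0x000004/0x000008/0x00001 prefixes mentioned above):

  #include <stdint.h>

  // Reason 2: a pure address-range test, no dereference needed. This only
  // stays cheap and correct if no foreign mmap() can land between the views.
  static bool is_in_reserved(uintptr_t p, uintptr_t start, uintptr_t end) {
    return p >= start && p < end;
  }

  // Debugging aid: with all views reserved, the upper bits alone tell you
  // whether an address could be an oop.
  static bool looks_like_oop(uintptr_t addr) {
    const uintptr_t marked0  = (uintptr_t)1 << 42;  // 0x000004...
    const uintptr_t marked1  = (uintptr_t)1 << 43;  // 0x000008...
    const uintptr_t remapped = (uintptr_t)1 << 44;  // 0x00001...
    const uintptr_t view = addr & (marked0 | marked1 | remapped);
    return view == marked0 || view == marked1 || view == remapped;
  }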
> It probably makes sense to commit this NMT patch, and then > figure out if we want to reserve less? I'll give others a chance to have an opinion before committing. Thanks for reviewing! cheers, Per > > Thanks, > -Aleksey > From shade at redhat.com Tue Dec 12 12:29:55 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Dec 2017 13:29:55 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <5c13dd46-553e-545b-e552-c4fa82ecaaf3@oracle.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> <5c13dd46-553e-545b-e552-c4fa82ecaaf3@oracle.com> Message-ID: <62016ede-dea6-89c9-63c3-f92f499de005@redhat.com> On 12/12/2017 01:06 PM, Per Liden wrote: > On 2017-12-12 12:28, Aleksey Shipilev wrote: >> But this looks like an overly pessimistic thing to do. If there are multiple mappings that differ in >> upper bits, that means there is enough unused space between the mappings, and we don't actually have >> to reserve it? What I meant is: instead of allocating [0x0001...0x0005], we can reserve [0x00010...0, 0x00010050..0], [0x00020...0, 0x00020050..0], [0x00030...0, 0x00030050..0], etc, where the size of each reserved chunk is -Xmx. > We actually do want to reserve it anyway, for two reasons. > > 1) In ZGC, by having a heap address space much larger than the heap size we are essentially immune > to address space fragmentation. I.e. we can always find a hole big enough for any allocation, > without the need to first compact the heap. Okay. I am quite a bit blurry on the compaction that ZGC does. Does this mean the relocation phase temporarily allocates target pages outside of the currently reserved slice taken by the Java heap? And you want to have enough spare so that you always have contiguous virtual memory to allocate that temporary excess at? If so, you can allocate another -Xmx-sized chunk, without always reserving terabytes of space, right? > 2) CollectedHeap::is_in_reserved() is often used as an inexpensive check of whether something points into the > heap. By reserving all addresses in all heap views this check remains inexpensive, as we know that > some other random mmap() call in the VM didn't end up in-between two heap views. is_in_reserved() is the compelling argument. Although you can probably just mask out the high bits and land in the first mapping. The mappings being of the same size would make this work. > This is also very useful when debugging. For example, when dumping memory or looking at stack > traces you can easily see if something is an oop or not (oops always start with 0x00001..., > 0x000008... or 0x000004...). Yes, this does not go away: the start of each reserved chunk stays the same. Thanks, -Aleksey From erik.osterlund at oracle.com Tue Dec 12 13:29:20 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 12 Dec 2017 14:29:20 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> Message-ID: <5A2FD9B0.9070202@oracle.com> Hi Per, I suppose this does indeed depend on how you view things. To me, when there are different interpretations of what should be reported, I try to understand why we are reporting it. If we report a number, then there should be some kind of question to which the reported number provides an answer. The number you propose now is accurate in terms of what is called by mmap, but I do not know what useful question it answers. It will seemingly report ~17 TB always regardless of potentially interesting things such as, e.g.
how close we are to hitting the roof of max memory that may be committed. In other words, I am not sure what useful question there is to which the answer is how much virtual address space has been reserved by the Java heap. Thanks, /Erik On 2017-12-12 12:20, Per Liden wrote: > On 2017-12-12 11:50, Aleksey Shipilev wrote: >> On 12/12/2017 11:42 AM, Per Liden wrote: >>> As Aleksey noticed, we don't register the Java heap with the native >>> memory tracker. Here's a patch >>> to do that. >>> >>> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >> >> Patch looks good, but the NMT "reserved" data is off the charts: >> >> Total: reserved=17180280855KB, committed=17143487KB >> - Java Heap (reserved=17179869184KB, >> committed=16777216KB) >> (mmap: reserved=17179869184KB, >> committed=16777216KB) >> >> I guess this should not pass ZAddressSpaceSize, and instead tell the >> reserved space of the first >> mapping? >> >> + // Register address space with native memory tracker >> + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); > > I think this is correct actually. But it depends on how one views > things I guess. As I see it, I want to be able to look in > /proc/../maps and with NMT see what the different mappings correlate > to. If we only registered the first heap view, then there would be a > big mysterious reservation that would go unaccounted. That doesn't > sound right to me, but I'm open for hearing other opinions on this. > The big number there covers all addresses for all heap views/mappings > (i.e. the actual address space that is reserved). It should be noted > that, in ZGC, the heap address space doesn't have a 1:1 relation with > max heap size. > > cheers, > Per > >> >> Thanks, >> -Aleksey >> From per.liden at oracle.com Tue Dec 12 13:35:19 2017 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Dec 2017 14:35:19 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <62016ede-dea6-89c9-63c3-f92f499de005@redhat.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> <5c13dd46-553e-545b-e552-c4fa82ecaaf3@oracle.com> <62016ede-dea6-89c9-63c3-f92f499de005@redhat.com> Message-ID: <4ada443c-efad-fa41-819d-bd799c37d920@oracle.com> On 2017-12-12 13:29, Aleksey Shipilev wrote: > On 12/12/2017 01:06 PM, Per Liden wrote: >> On 2017-12-12 12:28, Aleksey Shipilev wrote: >>> But, this does look overly pessimistic thing to do. If there are multiple mappings that differ in >>> upper bits, that means there is enough unused space between the mappings, and we don't actually have >>> to reserve it? > > Whaat I meant is: instead of allocating [0x0001...0x0005], we can reserve [0x00010...0, > 0x00010050..0], [0x00020...0, 0x00020050..0], [0x00030...0, 0x00030050..0], etc, where the size of > each reserved chunk is -Xmx. > >> We actually do want to reserve it anyway, for two reasons. >> >> 1) In ZGC, by having a heap address spare much larger than the heap size we are essentially immune >> to address space fragmentation. I.e. we can always find a hole big enough for any allocation, >> without the need to first compact the heap. > > Okay. I am quite a bit blurry on compaction that ZGC does. Does this mean the relocation phase > temporarily allocates target pages outside of currently reserved slice taken by Java heap? And you > want to have enough spare so that you always have contiguous virtual memory to allocate that > temporary excess at? 
What I meant above was just that in GCs which have a 1:1 mapping between heap address space and max heap size, you can run into a situation where you have say 32M of free memory (or address space, same thing in that context) on the heap, but it's scattered across say 8 chunks of 4M each, which means you'd need to compact before you can allocate a 32M object. In ZGC, we can instead just map (or remap) that 32M to some other location in the heap address space. About compaction. ZGC relocates/compacts to pages inside the heap. The algorithm allows for in-place compaction, in which case no additional pages are needed. However, in the current implementation, we typically need one free page per GC worker thread to do relocation. When all objects in a given page have been relocated, that page immediately becomes reusable, either for new Java thread allocations, or for further GC worker relocation allocations. Btw, when I say "page" here I mean ZPage, which is conceptually similar to a HeapRegion in G1/Shenandoah. > If so, you can allocate another -Xmx-sized chunk, without always reserving > terabytes of space, right? > >> 2) CollectedHeap::is_in_reserved() is often used as an inexpensive check of whether something points into the >> heap. By reserving all addresses in all heap views this check remains inexpensive, as we know that >> some other random mmap() call in the VM didn't end up in-between two heap views. > > is_in_reserved() is the compelling argument. Although you can probably just mask out the high bits and > land in the first mapping. The mappings being of the same size would make this work. > >> This is also very useful when debugging. For example, when dumping memory or looking at stack >> traces you can easily see if something is an oop or not (oops always start with 0x00001..., >> 0x000008... or 0x000004...). > > Yes, this does not go away: the start of each reserved chunk stays the same. Unless all of that address space is reserved you can no longer be sure that those addresses are actually oops. They could just as well be native pointers, because some other part of the JVM could have mmap'ed something there. cheers, Per > > Thanks, > -Aleksey > > > > From stefan.karlsson at oracle.com Tue Dec 12 14:32:15 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 12 Dec 2017 15:32:15 +0100 Subject: RFR: Remove unused ZLoadBarrierMediumPath option In-Reply-To: References: Message-ID: <3405f3bb-2232-fbb1-31de-f24edb0e0d11@oracle.com> Looks good. StefanK On 2017-12-12 11:42, Per Liden wrote: > Just a cleanup patch to remove the ZLoadBarrierMediumPath option, > which isn't used anymore. We'll add it again if the need arises. > > http://cr.openjdk.java.net/~pliden/zgc/remove_loadbarriermediumpath_option/webrev.0/ > > > cheers, > Per From erik.osterlund at oracle.com Tue Dec 12 15:17:47 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 12 Dec 2017 16:17:47 +0100 Subject: RFR: Remove unused ZLoadBarrierMediumPath option In-Reply-To: References: Message-ID: <5A2FF31B.9020505@oracle.com> Hi Per, Looks good. /Erik On 2017-12-12 11:42, Per Liden wrote: > Just a cleanup patch to remove the ZLoadBarrierMediumPath option, > which isn't used anymore. We'll add it again if the need arises. > > http://cr.openjdk.java.net/~pliden/zgc/remove_loadbarriermediumpath_option/webrev.0/ > > > cheers, > Per From gil at azul.com Wed Dec 13 06:19:12 2017 From: gil at azul.com (Gil Tene) Date: Wed, 13 Dec 2017 06:19:12 +0000 Subject: Architectural Comparison with C4/Pauseless?
Message-ID: <1D58ACFC-6AC6-44A8-8FEC-F4B07A45EE4F@azul.com> To Per and the rest of ZGC team: Congratulations on having the ZGC project and sources up! It is great to see another concurrent compacting collector being actively developed. As I start digging into some of the ZGC details, it seems (at least at first pass) that we are looking at a main mechanism that is very similar to C4 (2011 ISMM paper here: https://www.azul.com/files/c4_paper_acm.pdf), or the single-generation Pauseless collector that preceded it (2005 VEE/Usenix paper here: https://www.usenix.org/legacy/events/vee05/full_papers/p46-click.pdf). This suggests that much of the JVM infrastructure and related design needs will end up being similar as well, and we can both benefit from understanding those similarities and comparing notes. Can we go through a quick stab at mapping the mechanisms and terminologies? Some high level similarities I've noted so far:

- The use of a barrier-at-reference-load that determines whether or not an action is required purely based on the contents of the reference being loaded and some expected values for that contents (as opposed to considering data that would require de-referencing through the pointer): This seems equivalent to what the C4 LVB does [are we looking at the same actions? i.e.: queue to collector if not-yet-marked-through, fixup/remap to point to actual target address if points-to-relocated-object, and relocate object if points-to-needs-to-be-relocated-but-not-yet-relocated object?].

- Colored pointers in ZGC (which appear to encode metadata in the pointer): These seem similar to the concept of using metadata information in the reference in C4 and pauseless [A combination of NMT state and page numbers or ranges], including the use of similar triggering reasons (not-yet-marked-through, points-to-relocated-object, and points-to-object-that-needs-to-be-relocated) and their use in marking, compaction, and eventual fixup. [Do you see fundamental differences here that go beyond representation of the metadata in the pointer field? Are there any key differences in the triggering reasons or in how they are used to create the concurrent mark, relocate, and eventual fixup passes (or their equivalent terms in ZGC if there is a logical match)? ]

- Use of a common barrier test for both concurrent marking and concurrent compaction in ZGC: Same as C4 LVB. [Basically the "only leave the fast path if metadata shows that there is something to be done" test].

- "In-place compaction" in ZGC: Same as C4's Quick Release. (Basically releases and recycles compacted-from pages[/regions] before fixing up references to objects in those pages, by maintaining forwarding information outside of the object body and page)

These obviously lead me into a trap of assumptions (since I keep thinking in C4 terms). Question about expected invariants and reference fixup:

- Do the colored pointers and the related read barrier provide invariants similar to those described in C4 (section 2.1)? Can similar assumptions be made about mutator visible references?

- Does ZGC fold a "fixup phase" (aka "remap phase" in C4) of references to compacted pages into the next Mark phase (delaying the complete fixup until then)? Or does it perform a separate fixup pass after relocation?

- "Self healing": I didn't see mention of a C4/Pauseless "self healing" equivalent thus far in text, and have not followed the ZGC code far enough to determine if there is one.
Does "ZGC" perform self-healing on references that were found to need attention by the barrier? Questions about pages/regions/boundaries/mapping:

- Is ZGC "regional" in the sense that, except for large objects (that span multiple dedicated regions), objects cannot span region boundaries? [This is the case in C4, Shenandoah, and G1], or does it handle the heap in some more fluid way?

- Does ZGC use regions of fixed size ("ZPage"?) when those regions are not dedicated to a single (large) object?

- Does ZGC use virtual memory remapping to relocate "large" (larger than a single page) objects?

- Does ZGC use virtual memory remapping on "normal" (non-large-object) regions as part of preparing or performing marking? Compaction?

Question about object lifecycle and pipeline: In gaining an understanding of pretty much any collector, I usually find that understanding what happens to an object and references to it in two main (fairly simple) scenarios really helps. To help with that, can you describe the high level pipeline in these two simple cases [let's focus on "non-large" objects, assuming larger-than-one-page objects differ in some way (which may be a wrong assumption on my part)]

- For a "stays alive through one collection" object: From allocation (presumably in a TLAB), through preparation for marking and marking, and then through preparation for relocation and the relocation, and then through having references to the object fixed up.

- For a "died quickly and gets allocation in the next collection" object: From allocation (presumably in a TLAB), [assume death here] through preparation for marking and marking, and then through preparation for relocation of other objects allocated in the same region, the relocation of those objects, and the eventual "release" [freeing/recycling/whatever] of the region that the dead object used to be in.

Anyway, the above is plenty (probably too much) for a single mailing list thread, so I'll stop there. -- Gil. From gil at azul.com Wed Dec 13 06:21:27 2017 From: gil at azul.com (Gil Tene) Date: Wed, 13 Dec 2017 06:21:27 +0000 Subject: Architectural Comparison with C4/Pauseless? In-Reply-To: <1D58ACFC-6AC6-44A8-8FEC-F4B07A45EE4F@azul.com> References: <1D58ACFC-6AC6-44A8-8FEC-F4B07A45EE4F@azul.com> Message-ID: <18F7EBEF-D46A-4226-BD05-6C7BDDD4E2C1@azul.com> Re-sending in plain-text form (so it will hopefully show up with better line-wrapping on the archive browser):
From per.liden at oracle.com Thu Dec 14 14:31:57 2017 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Dec 2017 15:31:57 +0100 Subject: Architectural Comparison with C4/Pauseless? In-Reply-To: <1D58ACFC-6AC6-44A8-8FEC-F4B07A45EE4F@azul.com> References: <1D58ACFC-6AC6-44A8-8FEC-F4B07A45EE4F@azul.com> Message-ID: <00d94bfb-6ba0-a85b-e7dd-bf5c30a482b2@oracle.com> Hi Gil, On 2017-12-13 07:19, Gil Tene wrote: > To Per and the rest of ZGC team: Congratulations on having the ZGC project and sources up! It is great to see another concurrent compacting collector being actively developed. Thanks! > > As I start digging into some of the ZGC details, it seems (at least at first pass) that we are looking at a main mechanism that is very similar to C4 (2011 ISMM paper here: https://www.azul.com/files/c4_paper_acm.pdf), or the single-generation Pauseless collector that preceded it (2005 VEE/Usenix paper here: https://www.usenix.org/legacy/events/vee05/full_papers/p46-click.pdf). This suggests that much of the JVM infrastructure and related design needs will end up being similar as well, and we can both benefit from understanding those similarities and comparing notes. > > Can we go through a quick stab at mapping the mechanisms and terminologies? > > Some high level similarities I've noted so far: > > - The use of a barrier-at-reference-load that determines whether or not an action is required purely based on the contents of the reference being loaded and some expected values for that contents (as opposed to considering data that would require de-referencing through the pointer): This seems equivalent to what the C4 LVB does [are we looking at the same actions? i.e.: queue to collector if not-yet-marked-through, fixup/remap to point to actual target address if points-to-relocated-object, and relocate object if points-to-needs-to-be-relocated-but-not-yet-relocated object?]. In ZGC we currently have the following main reasons why some action needs to be taken: 1) During marking - "Points to an object that is not known to be strongly marked". 2) Between end of marking and start of relocation - "Points to a final-reachable object".
3) Between end of marking and end of concurrent reference processing - "Attempt to load weak/phantom oop pointing to an unmarked object". 4) During relocation - "Points to an object that is not known to not be part of the collection set". There are a number of different actions that can then follow, depending on the above reason, the oop state, etc. For example:

- Mark an unmarked object as strongly-reachable
- Mark a final-reachable object as strongly-reachable
- Remap pointer and mark an unmarked object as strongly-reachable
- Remap pointer and mark a final-reachable object as strongly-reachable
- Resurrect an unmarked or finalizable-marked object pointed to by a "weak" or "phantom" oop
- Prevent resurrection of an unmarked or finalizable-marked object pointed to by a "weak" or "phantom" oop
- Remap pointer pointing to an object that is not part of the collection set
- Remap pointer pointing to an object that is part of the collection set
- Remap pointer and relocate an object that is part of the collection set

In ZGC terminology, a "remapped" pointer means it has the "remapped" metadata bit set, which in turn means we know it doesn't point into the collection set. "Finalizable-marked" means it's been marked via the referent in a Finalizer object. > > - Colored pointers in ZGC (which appear to encode metadata in the pointer): These seem similar to the concept of using metadata information in the reference in C4 and pauseless [A combination of NMT state and page numbers or ranges], including the use of similar triggering reasons (not-yet-marked-through, points-to-relocated-object, and points-to-object-that-needs-to-be-relocated) and their use in marking, compaction, and eventual fixup. > > [Do you see fundamental differences here that go beyond representation of the metadata in the pointer field? Are there any key differences in the triggering reasons or in how they are used to create the concurrent mark, relocate, and eventual fixup passes (or their equivalent terms in ZGC if there is a logical match)? ] Given that C4 isn't available to study in detail, my understanding of how it works is limited. I'm thinking that once you've had time to study the ZGC source, you will be in a better position than me to call out the differences. > > - Use of a common barrier test for both concurrent marking and concurrent compaction in ZGC: Same as C4 LVB. [Basically the "only leave the fast path if metadata shows that there is something to be done" test]. > > - "In-place compaction" in ZGC: Same as C4's Quick Release. (Basically releases and recycles compacted-from pages[/regions] before fixing up references to objects in those pages, by maintaining forwarding information outside of the object body and page) It's my understanding (correct me if I'm wrong) that C4's notion of "Quick Release" means quick release/reclaim/reuse of physical memory pages, but not quick release/reclaim/reuse of the virtual address space those pages occupy. In ZGC, the physical memory pages and the address space they occupy have the same life-cycle and both can be immediately released/reclaimed/reused as a unit. From an "in-place compaction" point of view, the end result is the same, except that ZGC doesn't need to remap memory. > > These obviously lead me into a trap of assumptions (since I keep thinking in C4 terms). > > Question about expected invariants and reference fixup: > > - Do the colored pointers and the related read barrier provide invariants similar to those described in C4 (section 2.1)?
Can similar assumptions be made about mutator visible references? ZGC has a strong "to-space" invariant. What other invariants you get depends on which barrier type was applied (we call them strong and weak barriers, where weak maps to AS_NO_KEEPALIVE in the new Access API), and what reference type is being accessed (strong/weak/phantom oop, which maps to ON_STRONG/WEAK/PHANTOM/_OOP_REF). Please see zBarrier.* for more details here. > > - Does ZGC fold a "fixup phase" (aka "remap phase" in C4) of references to compacted pages into the next Mark phase (delaying the complete fixup until then)? Or does it perform a separate fixup pass after relocation? ZGC does "lazy fixup", in the sense that whoever loads an oop after relocation will do the fixup. It could be a mutator or a GC worker doing that. From an algorithm point of view, we can choose to fold or not to fold. > > - "Self healing": I didn't see mention of a C4/Pauseless "self healing" equivalent thus far in text, and have not followed the ZGC code far enough to determine if there is one. Does "ZGC" perform self-healing on references that were found to need attention by the barrier? A load barrier in ZGC can heal/repair an oop, but if, when and how it happens depends, again, on the barrier type used and the reference type being accessed. Looking at the code in zBarrier.* should help if you're interested in the details here. For example, a weak barrier never heals the oop to the "marked" state, and a barrier on a weak/phantom oop never heals if "resurrection" is blocked. > > Questions about pages/regions/boundaries/mapping: > > - Is ZGC "regional" in the sense that, except for large objects (that span multiple dedicated regions), objects cannot span region boundaries? [This is the case in C4, Shenandoah, and G1], or does it handle the heap in some more fluid way? ZGC is regional, in the sense that the heap is represented by a number of ZPages, where each ZPage represents some contiguous chunk of memory. Objects (large or small) never span across ZPage boundaries. A ZPage is sized such that this never happens. If we, for example, allocate a 100M object, then we create a single 100M ZPage for that. When that ZPage is later reclaimed, we can decide to reuse that ZPage as is (e.g. if we have a new request to allocate another 100M object), or we can throw away that ZPage and just reuse the heap memory it represented to back other ZPages with different size configurations. > > - Does ZGC use regions of fixed size ("ZPage"?) when those regions are not dedicated to a single (large) object? A ZPage can have any size (subject to alignment requirements). Some ZPages only contain a single object (typically a very large object), and some contain many objects (typically smaller objects). A ZPage also belongs to a size group (currently we have three groups, small/medium/large), where each group has different size and alignment requirements. > > - Does ZGC use virtual memory remapping to relocate "large" (larger than a single page) objects? > > - Does ZGC use virtual memory remapping on "normal" (non-large-object) regions as part of preparing or performing marking? Compaction? Virtual memory remapping is not part of the marking or compaction mechanism in ZGC. And we typically don't relocate large objects at all when compacting. Large objects might in the future be relocated for some other reason, like move to "cold-storage".
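As a rough illustration of the size-group selection described above (the thresholds are invented for the example; the real values and alignment rules live in the ZGC sources):

  #include <stddef.h>

  enum ZPageSizeGroup { Small, Medium, Large };

  // Small and medium ZPages hold many objects; a large ZPage is created
  // per object and sized (subject to alignment) to fit that one object.
  static ZPageSizeGroup size_group_for(size_t object_size,
                                       size_t small_max,    // hypothetical threshold
                                       size_t medium_max) { // hypothetical threshold
    if (object_size <= small_max)  return Small;
    if (object_size <= medium_max) return Medium;
    return Large;
  }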
The only type of virtual memory mapping trick we do is mapping the heap in multiple locations on platforms that don't have support for "Virtual Address Masking/Tagging". x86 would be an example of such a platform. On platforms which do support this (e.g. SPARC and Aarch64) we map the heap in only one location. (Note that we don't have an Aarch64 port of ZGC at this time, it's just used as an example here). > > Question about object lifecycle and pipeline: > > In gaining an understanding of pretty much any collector, I usually find that understanding what happens to an object and references to it in two main (fairly simple) scenarios really helps. To help with that, can you describe the high level pipeline in these two simple cases [let's focus on "non-large" objects, assuming larger-than-one-page objects differ in some way (which may be a wrong assumption on my part)] > > - For a "stays alive through one collection" object: From allocation (presumably in a TLAB), through preparation for marking and marking, and then through preparation for relocation and the relocation, and then through having references to the object fixed up. > > - For a "died quickly and gets allocation in the next collection" object: From allocation (presumably in a TLAB), [assume death here] through preparation for marking and marking, and then through preparation for relocation of other objects allocated in the same region, the relocation of those objects, and the eventual "release" [freeing/recycling/whatever] of the region that the dead object used to be in. Quick example, which I think should answer for both scenarios above.

1) Global "expected metadata bits" is set to the "remapped" state.
2) Object gets allocated in a TLAB. That TLAB is in turn allocated in some suitable ZPage.
3) The returned reference inherits the current global "expected metadata bits", i.e. the "remapped" state in this example.
4) GC cycle starts. TLABs and ZPages containing TLABs are retired, making them candidates for compaction. Global "expected metadata bits" is set to the "marked0" state.
5) During marking, either a mutator or a GC worker stumbles on the reference, detecting that its state (remapped state) doesn't match the global expected state (marked0). Adds the reference to the queue of references to mark. If it's in the "marked1" state (previous mark state) a check is made to see if this object was part of the previous collection set. If so, look up forwarding information and update the reference with the new location. Adjusts the reference to have the "marked0" state.
6) Marking ends.
7) A collection set (a set of ZPages) is selected.
8) Relocation starts. Global "expected metadata bits" is set to the "remapped" state.
9) During relocation, GC workers walk through the collection set and relocate objects, freeing/releasing ZPages as they become empty and making them immediately reusable. Relocation/forwarding information is installed in an off-heap table. If a mutator loads an oop, it will detect that it's not in the "remapped" state, check to see if the object is part of the collection set, adjust the reference to have the "remapped" state, and if needed adjust it so that it points to the new location (helping out to relocate the object if that's not already done).
10) GC cycle ends.

cheers, Per
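Reading steps 1-10 above as code, the load-barrier fast path can be sketched roughly like this (an illustration only, not the actual implementation; global_bad_mask and slow_path are hypothetical names, and the real logic is in zBarrier.*):

  #include <stdint.h>

  typedef uintptr_t zpointer;          // stand-in for an oop with metadata bits

  extern uintptr_t global_bad_mask;    // flips between phases (steps 1, 4 and 8)
  zpointer slow_path(volatile zpointer* p, zpointer addr);  // mark/remap/relocate

  // A loaded reference whose metadata bits match the globally expected
  // state needs no action; anything else takes the slow path, which acts
  // on the object as needed and then heals the reference in place.
  static zpointer load_barrier(volatile zpointer* p) {
    zpointer addr = *p;
    if ((addr & global_bad_mask) == 0) {
      return addr;                     // common case: expected color, no action
    }
    return slow_path(p, addr);
  }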
> From per.liden at oracle.com Thu Dec 14 14:46:15 2017 From: per.liden at oracle.com (per.liden at oracle.com) Date: Thu, 14 Dec 2017 14:46:15 +0000 Subject: hg: zgc/zgc: ZGC: Remove unused ZLoadBarrierMediumPath option Message-ID: <201712141446.vBEEkFDo022303@aojmv0008.oracle.com> Changeset: b97ac01a1c93 Author: pliden Date: 2017-12-11 14:00 +0100 URL: http://hg.openjdk.java.net/zgc/zgc/rev/b97ac01a1c93 ZGC: Remove unused ZLoadBarrierMediumPath option ! src/hotspot/share/runtime/globals.hpp From per.liden at oracle.com Thu Dec 14 15:22:45 2017 From: per.liden at oracle.com (Per Liden) Date: Thu, 14 Dec 2017 16:22:45 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <5A2FD9B0.9070202@oracle.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> <5A2FD9B0.9070202@oracle.com> Message-ID: <6b056400-f229-08ab-890c-f65fac764f27@oracle.com> Hi, On 2017-12-12 14:29, Erik Österlund wrote: > Hi Per, > > I suppose this does indeed depend on how you view things. To me, when > there are different interpretations of what should be reported, I try to > understand why we are reporting it. If we report a number, then there > should be some kind of question to which the reported number provides an > answer. > > The number you propose now is accurate in terms of what is called by > mmap, but I do not know what useful question it answers. It will > seemingly report ~17 TB always regardless of potentially interesting > things such as, e.g. how close we are to hitting the roof of max memory > that may be committed. In other words, I am not sure what useful > question there is to which the answer is how much virtual address space > has been reserved by the Java heap. Given how NMT records/displays data, especially in "detail" mode where it's not just one number but ranges of reserved and committed, I think we have two options here: 1) Register all heap views as reserved. 2) Register only one heap view as reserved. Only registering "max heap size" as reserved wouldn't really work, again, since it's a range and we don't necessarily later commit memory within that reserved range. None of the alternatives above are optimal, but at this time I think #1 is slightly better, since at least it can answer two questions truthfully (the usefulness of those questions is debatable): 1) What part of the address space does the GC actually reserve for the Java heap. Could be useful to know if you want to e.g. set VA limits (think ulimit -v). 2) And, as I mentioned before, how does this or that line in /proc/.../maps correlate to the JVM memory usage? So, I'll suggest we do #1 for now. We'll change it if we find a good reason to do so. cheers, Per > > Thanks, > /Erik > > On 2017-12-12 12:20, Per Liden wrote: >> On 2017-12-12 11:50, Aleksey Shipilev wrote: >>> On 12/12/2017 11:42 AM, Per Liden wrote: >>>> As Aleksey noticed, we don't register the Java heap with the native >>>> memory tracker. Here's a patch >>>> to do that. >>>> >>>> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >>> >>> Patch looks good, but the NMT "reserved" data is off the charts: >>> >>> Total: reserved=17180280855KB, committed=17143487KB >>> - Java Heap (reserved=17179869184KB, >>> committed=16777216KB) >>> (mmap: reserved=17179869184KB, >>> committed=16777216KB) >>> >>> I guess this should not pass ZAddressSpaceSize, and instead tell the >>> reserved space of the first >>> mapping?
>>> >>> + // Register address space with native memory tracker >>> + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); >> >> I think this is correct actually. But it depends on how one views >> things I guess. As I see it, I want to be able to look in >> /proc/../maps and with NMT see what the different mappings correlate >> to. If we only registered the first heap view, then there would be a >> big mysterious reservation that would go unaccounted. That doesn't >> sound right to me, but I'm open for hearing other opinions on this. >> The big number there covers all addresses for all heap views/mappings >> (i.e. the actual address space that is reserved). It should be noted >> that, in ZGC, the heap address space doesn't have a 1:1 relation with >> max heap size. >> >> cheers, >> Per >> >>> >>> Thanks, >>> -Aleksey >>> > From erik.osterlund at oracle.com Thu Dec 14 15:45:26 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 14 Dec 2017 16:45:26 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <6b056400-f229-08ab-890c-f65fac764f27@oracle.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> <5A2FD9B0.9070202@oracle.com> <6b056400-f229-08ab-890c-f65fac764f27@oracle.com> Message-ID: <5A329C96.1040600@oracle.com> Hi Per, I agree. Thanks, /Erik On 2017-12-14 16:22, Per Liden wrote: > Hi, > > On 2017-12-12 14:29, Erik ?sterlund wrote: >> Hi Per, >> >> I suppose this does indeed depend on how you view things. To me, when >> there are different interpretations what should be reported, I try to >> understand why we are reporting it. If we report a number, then there >> should be some kind of question to which the reported number provides an >> answer. >> >> The number you propose now is accurate in terms of what is called by >> mmap, but I do not know what useful question it answers. It will >> seemingly report ~17 TB always regardless of potentially interesting >> things such as, e.g. how close we are to hitting the roof of max memory >> that may be committed. In other words, I am not sure what useful >> question there is to which the answer is how much virtual address space >> has been reserved by the Java heap. > > Given how NMT records/displays data, especially in "detail" mode where > it's not just one number but a ranges of reserved and committed, I > think we have two options here: > > 1) Register all heap views as reserved. > 2) Register only one heap view as reserved. > > Only registering "max heap size" as reserved wouldn't really work, > again, since it's a range and we don't necessarily later commit memory > within that reserved range. > > None of the alternatives above are optimal, but at this time I think > #1 is slightly better, since at least it can answer two questions > truthfully (the usefulness of those questions is debatable): > > 1) What part of the address spaces does the GC actually reserve for > the Java heap. Could be useful to know if you want to e.g. set VA > limits (think ulimit -v). > 2) And, as I mentioned before, how does this or that line in > /proc/.../maps correlate to the JVM memory usage? > > So, I'll suggest we do #1 for now. We'll change it if we find a good > reason to do so. > > cheers, > Per > >> >> Thanks, >> /Erik >> >> On 2017-12-12 12:20, Per Liden wrote: >>> On 2017-12-12 11:50, Aleksey Shipilev wrote: >>>> On 12/12/2017 11:42 AM, Per Liden wrote: >>>>> As Aleksey noticed, we don't register the Java heap with the native >>>>> memory tracker. Here's a patch >>>>> to do that. 
>>>>> >>>>> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >>>> >>>> Patch looks good, but the NMT "reserved" data is off the charts: >>>> >>>> Total: reserved=17180280855KB, committed=17143487KB >>>> - Java Heap (reserved=17179869184KB, >>>> committed=16777216KB) >>>> (mmap: reserved=17179869184KB, >>>> committed=16777216KB) >>>> >>>> I guess this should not pass ZAddressSpaceSize, and instead tell the >>>> reserved space of the first >>>> mapping? >>>> >>>> + // Register address space with native memory tracker >>>> + nmt_reserve(ZAddressSpaceStart, ZAddressSpaceSize); >>> >>> I think this is correct actually. But it depends on how one views >>> things I guess. As I see it, I want to be able to look in >>> /proc/../maps and with NMT see what the different mappings correlate >>> to. If we only registered the first heap view, then there would be a >>> big mysterious reservation that would go unaccounted. That doesn't >>> sound right to me, but I'm open for hearing other opinions on this. >>> The big number there covers all addresses for all heap views/mappings >>> (i.e. the actual address space that is reserved). It should be noted >>> that, in ZGC, the heap address space doesn't have a 1:1 relation with >>> max heap size. >>> >>> cheers, >>> Per >>> >>>> >>>> Thanks, >>>> -Aleksey >>>> >> From per.liden at oracle.com Fri Dec 15 09:16:07 2017 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Dec 2017 10:16:07 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <3a57d976-9fca-8634-e658-c6c6f47a3cf7@oracle.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> <3a57d976-9fca-8634-e658-c6c6f47a3cf7@oracle.com> Message-ID: On 2017-12-15 10:08, Stefan Karlsson wrote: > Looks good to me. Thanks! > > Two issues that we discussed off-line, that we might want to deal with > in the future: > > 1) nmt_commit and friends can be inlined, which would result in the > wrong frame being extracted by CALLER_PC. > 2) We don't report the memory for our mark stacks. I agree. cheers, Per > > Thanks, > StefanK > > On 2017-12-12 11:42, Per Liden wrote: >> As Aleksey noticed, we don't register the Java heap with the native >> memory tracker. Here's a patch to do that. >> >> http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ >> >> cheers, >> Per > > From stefan.karlsson at oracle.com Fri Dec 15 09:08:19 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 15 Dec 2017 10:08:19 +0100 Subject: RFR: Add NMT support for Java heap In-Reply-To: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> References: <34db8ea6-fee7-9675-1ecf-a0d4be67e0a3@oracle.com> Message-ID: <3a57d976-9fca-8634-e658-c6c6f47a3cf7@oracle.com> Looks good to me. Two issues that we discussed off-line, that we might want to deal with in the future: 1) nmt_commit and friends can be inlined, which would result in the wrong frame being extracted by CALLER_PC. 2) We don't report the memory for our mark stacks. Thanks, StefanK On 2017-12-12 11:42, Per Liden wrote: > As Aleksey noticed, we don't register the Java heap with the native > memory tracker. Here's a patch to do that. 
> > http://cr.openjdk.java.net/~pliden/zgc/nmt_java_heap/webrev.0/ > > cheers, > Per From per.liden at oracle.com Fri Dec 15 10:05:07 2017 From: per.liden at oracle.com (per.liden at oracle.com) Date: Fri, 15 Dec 2017 10:05:07 +0000 Subject: hg: zgc/zgc: ZGC: Add NMT support for Java heap Message-ID: <201712151005.vBFA57k0006526@aojmv0008.oracle.com> Changeset: a5b85019f190 Author: pliden Date: 2017-12-14 15:43 +0100 URL: http://hg.openjdk.java.net/zgc/zgc/rev/a5b85019f190 ZGC: Add NMT support for Java heap ! src/hotspot/os_cpu/linux_x86/zPhysicalMemoryBacking_linux_x86.cpp ! src/hotspot/os_cpu/linux_x86/zPhysicalMemoryBacking_linux_x86.hpp ! src/hotspot/os_cpu/solaris_sparc/zPhysicalMemoryBacking_solaris_sparc.cpp ! src/hotspot/os_cpu/solaris_sparc/zPhysicalMemoryBacking_solaris_sparc.hpp ! src/hotspot/share/gc/z/zPhysicalMemory.cpp ! src/hotspot/share/gc/z/zPhysicalMemory.hpp ! src/hotspot/share/gc/z/zVirtualMemory.cpp ! src/hotspot/share/gc/z/zVirtualMemory.hpp From stefan.karlsson at oracle.com Fri Dec 15 10:09:20 2017 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 15 Dec 2017 11:09:20 +0100 Subject: RFR: Clean up some platform dependent code in shared directories Message-ID: <7b0c51a7-e5c3-56a7-94e6-fe748cdfeb5d@oracle.com> Hi all, Please review this small patch to get rid of some define(linux) in shared code and fix places where clang complains. This patch makes it slightly easier to experiment with support for other platforms than Linux x64. http://cr.openjdk.java.net/~stefank/zgc/cleanup/webrev/open.changeset Thanks, StefanK From per.liden at oracle.com Fri Dec 15 10:49:56 2017 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Dec 2017 11:49:56 +0100 Subject: RFR: Clean up some platform dependent code in shared directories In-Reply-To: <7b0c51a7-e5c3-56a7-94e6-fe748cdfeb5d@oracle.com> References: <7b0c51a7-e5c3-56a7-94e6-fe748cdfeb5d@oracle.com> Message-ID: <99a8068f-22e6-eee8-c381-698b22a0fa08@oracle.com> Looks good! /Per On 2017-12-15 11:09, Stefan Karlsson wrote: > Hi all, > > Please review this small patch to get rid of some define(linux) in > shared code and fix places where clang complains. This patch makes it > slightly easier to experiment with support for other platforms than > Linux x64. > http://cr.openjdk.java.net/~stefank/zgc/cleanup/webrev/open.changeset > > Thanks, > StefanK From per.liden at oracle.com Fri Dec 15 12:38:15 2017 From: per.liden at oracle.com (Per Liden) Date: Fri, 15 Dec 2017 13:38:15 +0100 Subject: RFR: Enable C2 loop strip mining by default Message-ID: Patch to enable loop strip mining by default when using ZGC. I also noticed that the file had an incorrect header, so I fixed that too. http://cr.openjdk.java.net/~pliden/zgc/c2_loop_strip_mining_by_default/webrev.0/ cheers, Per From shade at redhat.com Fri Dec 15 21:24:03 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 15 Dec 2017 22:24:03 +0100 Subject: RFR: Enable C2 loop strip mining by default In-Reply-To: References: Message-ID: <4c911404-b50e-1c5f-7947-f66802e0b614@redhat.com> On 12/15/2017 01:38 PM, Per Liden wrote: > Patch to enable loop strip mining by default when using ZGC. I also noticed that the file had an > incorrect header, so I fixed that too. > > http://cr.openjdk.java.net/~pliden/zgc/c2_loop_strip_mining_by_default/webrev.0/ Yup. It worked very well for Shenandoah. But, the relevant code block from Shenandoah code is: #ifdef COMPILER2 // Shenandoah cares more about pause times, rather than raw throughput. 
From shade at redhat.com Fri Dec 15 21:24:03 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 Dec 2017 22:24:03 +0100
Subject: RFR: Enable C2 loop strip mining by default
In-Reply-To: 
References: 
Message-ID: <4c911404-b50e-1c5f-7947-f66802e0b614@redhat.com>

On 12/15/2017 01:38 PM, Per Liden wrote:
> Patch to enable loop strip mining by default when using ZGC. I also
> noticed that the file had an incorrect header, so I fixed that too.
>
> http://cr.openjdk.java.net/~pliden/zgc/c2_loop_strip_mining_by_default/webrev.0/

Yup. It worked very well for Shenandoah.

But, the relevant code block from Shenandoah code is:

#ifdef COMPILER2
  // Shenandoah cares more about pause times, rather than raw throughput.
  if (FLAG_IS_DEFAULT(UseCountedLoopSafepoints)) {
    FLAG_SET_DEFAULT(UseCountedLoopSafepoints, true);
  }
  if (UseCountedLoopSafepoints && FLAG_IS_DEFAULT(LoopStripMiningIter)) {
    FLAG_SET_DEFAULT(LoopStripMiningIter, 1000);
  }
#ifdef ASSERT

...which is slightly different from what you are suggesting for ZGC.
Don't you want to enable LoopStripMiningIter when the user explicitly
sets -XX:+UseCountedLoopSafepoints (which, I guess, is what most users
concerned with TTSP-related latency do)?

Thanks,
-Aleksey

From shade at redhat.com Fri Dec 15 21:32:54 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 Dec 2017 22:32:54 +0100
Subject: ZGC and disabled biased locking
Message-ID: 

Hi,

Have you found a compelling reason to disable Biased Locking by default with ZGC?

  52   if (FLAG_IS_DEFAULT(UseBiasedLocking)) {
  53     FLAG_SET_DEFAULT(UseBiasedLocking, false);
  54   }

In Shenandoah, we went back and forth on this: first we disabled it [1]
on the hunch that biased-locking safepoints account for significant pause
time, and then reverted back [2] because some of our adopters complained
that Shenandoah was much slower than expected -- they had workloads that
benefit greatly from biased locking throughput-wise. In some cases it was
demonstrated that the cost of disabling biased locking completely
outweighed any additional GC barrier overhead.

So I wonder if ZGC disables it on the same hunch, or is it a
design/implementation quirk at this point?

Thanks,
-Aleksey

[1] http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-September/003491.html
[2] http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-November/004333.html

From rednaxelafx at gmail.com Sat Dec 16 00:47:15 2017
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Fri, 15 Dec 2017 16:47:15 -0800
Subject: RFR: Enable C2 loop strip mining by default
In-Reply-To: <4c911404-b50e-1c5f-7947-f66802e0b614@redhat.com>
References: <4c911404-b50e-1c5f-7947-f66802e0b614@redhat.com>
Message-ID: 

(Not a Reviewer) but Aleksey's version for Shenandoah makes more sense
to me.

Thanks,
Kris

On Fri, Dec 15, 2017 at 1:24 PM, Aleksey Shipilev wrote:

> On 12/15/2017 01:38 PM, Per Liden wrote:
>> Patch to enable loop strip mining by default when using ZGC. I also
>> noticed that the file had an incorrect header, so I fixed that too.
>>
>> http://cr.openjdk.java.net/~pliden/zgc/c2_loop_strip_mining_by_default/webrev.0/
>
> Yup. It worked very well for Shenandoah.
>
> But, the relevant code block from Shenandoah code is:
>
> #ifdef COMPILER2
>   // Shenandoah cares more about pause times, rather than raw throughput.
>   if (FLAG_IS_DEFAULT(UseCountedLoopSafepoints)) {
>     FLAG_SET_DEFAULT(UseCountedLoopSafepoints, true);
>   }
>   if (UseCountedLoopSafepoints && FLAG_IS_DEFAULT(LoopStripMiningIter)) {
>     FLAG_SET_DEFAULT(LoopStripMiningIter, 1000);
>   }
> #ifdef ASSERT
>
> ...which is slightly different from what you are suggesting for ZGC.
> Don't you want to enable LoopStripMiningIter when the user explicitly
> sets -XX:+UseCountedLoopSafepoints (which, I guess, is what most users
> concerned with TTSP-related latency do)?
>
> Thanks,
> -Aleksey
>
>
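Worth noting about the FLAG_IS_DEFAULT guard quoted in Aleksey's
biased-locking mail above: it only changes the default, so an explicit
-XX:+UseBiasedLocking on the command line still wins. A standalone toy
model of that ergonomics pattern (plain C++, not HotSpot code; names
reused purely for illustration):

#include <cstdio>

// Toy model of HotSpot flag ergonomics -- not HotSpot code.
struct JVMFlag {
  bool value;
  bool is_default;  // true until the user sets it on the command line
};

static void set_from_cmdline(JVMFlag& f, bool v) {
  f.value = v;
  f.is_default = false;
}

// Mirrors the guarded default from the snippet above: the flag is only
// flipped when the user did not set it explicitly.
static void zgc_ergonomics(JVMFlag& use_biased_locking) {
  if (use_biased_locking.is_default) {
    use_biased_locking.value = false;
  }
}

int main() {
  JVMFlag flag = {true, true};       // VM default: biased locking on
  zgc_ergonomics(flag);
  printf("default run: UseBiasedLocking=%d\n", flag.value);    // prints 0

  flag = {true, true};
  set_from_cmdline(flag, true);      // user passed -XX:+UseBiasedLocking
  zgc_ergonomics(flag);
  printf("explicit +UseBiasedLocking: %d\n", flag.value);      // prints 1
  return 0;
}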
From per.liden at oracle.com Mon Dec 18 14:19:18 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 18 Dec 2017 15:19:18 +0100
Subject: ZGC and disabled biased locking
In-Reply-To: 
References: 
Message-ID: <8f210694-7968-8266-c9d7-d905f462338e@oracle.com>

On 2017-12-15 22:32, Aleksey Shipilev wrote:
> Hi,
>
> Have you found a compelling reason to disable Biased Locking by default with ZGC?
>
>   52   if (FLAG_IS_DEFAULT(UseBiasedLocking)) {
>   53     FLAG_SET_DEFAULT(UseBiasedLocking, false);
>   54   }
>
> In Shenandoah, we went back and forth on this: first we disabled it [1]
> on the hunch that biased-locking safepoints account for significant pause
> time, and then reverted back [2] because some of our adopters complained
> that Shenandoah was much slower than expected -- they had workloads that
> benefit greatly from biased locking throughput-wise. In some cases it was
> demonstrated that the cost of disabling biased locking completely
> outweighed any additional GC barrier overhead.
>
> So I wonder if ZGC disables it on the same hunch, or is it a
> design/implementation quirk at this point?

It's basically the same reason. On modern hardware biased locking doesn't
seem to be as useful as it once was. There has even been talk about
removing it completely from hotspot. By disabling it we also avoid a few
safepoints. If we're proven wrong here, we'll change that default.

cheers,
Per

>
> Thanks,
> -Aleksey
>
> [1] http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-September/003491.html
> [2] http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-November/004333.html
>

From per.liden at oracle.com Mon Dec 18 14:39:09 2017
From: per.liden at oracle.com (Per Liden)
Date: Mon, 18 Dec 2017 15:39:09 +0100
Subject: RFR: Enable C2 loop strip mining by default
In-Reply-To: 
References: <4c911404-b50e-1c5f-7947-f66802e0b614@redhat.com>
Message-ID: <227c7b8d-580c-2084-3364-dfa3c49240cc@oracle.com>

Hi,

In ZGC we're following what G1 is doing here. G1 used to do what
Shenandoah does, but Roland changed[1] that. As far as I understand, the
motivation was that explicitly setting -XX:+UseCountedLoopSafepoints
should actually disable strip mining, and instead provide the same
behavior as we had before strip mining existed.

[1] http://hg.openjdk.java.net/jdk/hs/rev/4d28288c9f9e

cheers,
Per

On 2017-12-16 01:47, Krystal Mok wrote:
> (Not a Reviewer) but Aleksey's version for Shenandoah makes more sense
> to me.
>
> Thanks,
> Kris
>
> On Fri, Dec 15, 2017 at 1:24 PM, Aleksey Shipilev wrote:
>
>> On 12/15/2017 01:38 PM, Per Liden wrote:
>>> Patch to enable loop strip mining by default when using ZGC. I also
>>> noticed that the file had an incorrect header, so I fixed that too.
>>>
>>> http://cr.openjdk.java.net/~pliden/zgc/c2_loop_strip_mining_by_default/webrev.0/
>>
>> Yup. It worked very well for Shenandoah.
>>
>> But, the relevant code block from Shenandoah code is:
>>
>> #ifdef COMPILER2
>>   // Shenandoah cares more about pause times, rather than raw throughput.
>>   if (FLAG_IS_DEFAULT(UseCountedLoopSafepoints)) {
>>     FLAG_SET_DEFAULT(UseCountedLoopSafepoints, true);
>>   }
>>   if (UseCountedLoopSafepoints && FLAG_IS_DEFAULT(LoopStripMiningIter)) {
>>     FLAG_SET_DEFAULT(LoopStripMiningIter, 1000);
>>   }
>> #ifdef ASSERT
>>
>> ...which is slightly different from what you are suggesting for ZGC.
>> Don't you want to enable LoopStripMiningIter when the user explicitly
>> sets -XX:+UseCountedLoopSafepoints (which, I guess, is what most users
>> concerned with TTSP-related latency do)?
>>
>> Thanks,
>> -Aleksey

From yasuenag at gmail.com Thu Dec 21 08:40:42 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 21 Dec 2017 17:40:42 +0900
Subject: SA for ZGC
Message-ID: 

Hi all,

I checked the SA implementation for ZGC, but it is not available yet.
For example, we see a WrongTypeException when the CLHSDB `universe`
command is executed.

As the first step, I propose implementing ZCollectedHeap and related
classes for SA, as in this webrev:

  http://cr.openjdk.java.net/~ysuenaga/z/sa-universe/

After applying this patch, we can use the `universe` command in CLHSDB.
(I followed the `VM.info` jcmd output format.)

Of course, this is not everything; more work is needed.

Thanks,

Yasumasa
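Some context on what such SA support involves on the HotSpot side: the
serviceability agent reads VM data structures from outside the process,
so any C++ field that a Java-side ZCollectedHeap wrapper mirrors must be
exported through VMStructs. A hypothetical sketch in HotSpot's usual
style -- the entries below are illustrative assumptions, not the contents
of the webrev:

// Hypothetical VMStructs entries for ZGC; the actual field list may
// differ. Each nonstatic_field entry exports a field's offset so SA can
// read it out-of-process.
#define VM_STRUCTS_ZGC(nonstatic_field, volatile_nonstatic_field) \
  nonstatic_field(ZCollectedHeap, _heap, ZHeap)                   \
  nonstatic_field(ZHeap,          _used, size_t)

// Types are declared as well, so SA can model the class hierarchy
// (ZCollectedHeap extending CollectedHeap, ZHeap standalone).
#define VM_TYPES_ZGC(declare_type, declare_toplevel_type) \
  declare_type(ZCollectedHeap, CollectedHeap)             \
  declare_toplevel_type(ZHeap)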
From per.liden at oracle.com Thu Dec 21 19:33:43 2017
From: per.liden at oracle.com (Per Liden)
Date: Thu, 21 Dec 2017 20:33:43 +0100
Subject: SA for ZGC
In-Reply-To: 
References: 
Message-ID: 

Hi Yasumasa,

On 2017-12-21 09:40, Yasumasa Suenaga wrote:
> Hi all,
>
> I checked the SA implementation for ZGC, but it is not available yet.
> For example, we see a WrongTypeException when the CLHSDB `universe`
> command is executed.
>
> As the first step, I propose implementing ZCollectedHeap and related
> classes for SA, as in this webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/z/sa-universe/
>
> After applying this patch, we can use the `universe` command in CLHSDB.
> (I followed the `VM.info` jcmd output format.)

Thanks! Most of us working on ZGC are currently on vacation, but we'll
have a closer look at the patch after the holidays.

cheers,
Per

>
> Of course, this is not everything; more work is needed.
>
>
> Thanks,
>
> Yasumasa
>

From yasuenag at gmail.com Thu Dec 21 22:38:26 2017
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Fri, 22 Dec 2017 07:38:26 +0900
Subject: SA for ZGC
In-Reply-To: 
References: 
Message-ID: <2c3eefd3-d1da-377c-f370-1b1b4be23d6c@gmail.com>

Thanks Per!

I'm waiting for comments and sponsorship.


Yasumasa

On 2017/12/22 4:33, Per Liden wrote:
> Hi Yasumasa,
>
> On 2017-12-21 09:40, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> I checked the SA implementation for ZGC, but it is not available yet.
>> For example, we see a WrongTypeException when the CLHSDB `universe`
>> command is executed.
>>
>> As the first step, I propose implementing ZCollectedHeap and related
>> classes for SA, as in this webrev:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/z/sa-universe/
>>
>> After applying this patch, we can use the `universe` command in CLHSDB.
>> (I followed the `VM.info` jcmd output format.)
>
> Thanks! Most of us working on ZGC are currently on vacation, but we'll
> have a closer look at the patch after the holidays.
>
> cheers,
> Per
>
>>
>> Of course, this is not everything; more work is needed.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>