RFR: 8260267: vmTestbase/gc/gctests/FinalizeLock/FinalizeLock.java fails with fatal error: Mark stack space exhausted
王超
github.com+25214855+casparcwang at openjdk.java.net
Wed Mar 31 02:31:17 UTC 2021
On Tue, 30 Mar 2021 23:35:00 GMT, Hui Shi <hshi at openjdk.org> wrote:
>> ZGC mark stack will be exhausted when the live object graph is very large, which due to the pseudo-BFS style visiting of the object graph (ZGC gc worker will push children to other stripes which behave like BFS, so the maximum mark stack size is O(num_objects).). But other gc like g1 or Shenandoah use DFS to visiting the object graph, and the mark stack usage merely exceeds 32M, which may due to the maximum mark stack size is related to the depth of the object graph O(depth_of_object_graph).
>>
>> The following is the test case used to reproduce the crash,
>> import java.util.HashMap;
>> import java.util.HashSet;
>>
>> class Test2 {
>> public static int NODE_COUNT = 25000000;
>> public static int NODE_COUNT_UB = 1 << 25;
>> public static int STRING_GEN_INDEX = 1;
>> public static int WATCHER_COUNT = 100;
>> public static int MAP_COUNT = 8;
>> public static int BIN_COUNT = 16;
>> public static Object[] watchTable = new Object[MAP_COUNT];
>>
>> public static class Watcher {
>> public static long counter = 0;
>> public long index;
>> Watcher() {
>> index = counter;
>> counter++;
>> }
>> }
>>
>> public static int gen_name() {
>> int cur = STRING_GEN_INDEX++;
>> int reminder = cur % BIN_COUNT;
>> cur = cur - reminder + NODE_COUNT_UB * reminder;
>> return cur;
>> }
>>
>> public static HashMap<Integer, HashSet<Watcher>> get_map(int index) {
>> return (HashMap<Integer, HashSet<Watcher>>)watchTable[index];
>> }
>>
>> public static void gen_watcher() {
>> int name = gen_name();
>> HashSet<Watcher> set = new HashSet<Watcher>();
>> for (int i = 0; i < WATCHER_COUNT; i++) {
>> set.add(new Watcher());
>> }
>> for (int i = 0; i < MAP_COUNT; i++) {
>> get_map(i).put(name, set);
>> }
>> }
>>
>> public static void setup() {
>> for (int i = 0; i < MAP_COUNT; i++) {
>> watchTable[i] = new HashMap<Integer, HashSet<Watcher>>();
>> }
>> for (int i = 0; i < NODE_COUNT; i++) {
>> gen_watcher();
>> }
>> }
>>
>> public static void main(String[] args) {
>> setup();
>> for (int i = 0; i < 10; i++) {
>> System.gc();
>> }
>> System.out.println("pass");
>> }
>> }
>>
>> The command to run the testcase is,
>> java -XX:+UseZGC -Xmx280g -XX:ZCollectionInterval=10 -Xlog:gc*=debug:file=gc.log:time,level,tags:filesize=5g Test2
>
> src/hotspot/share/gc/z/zMark.inline.hpp line 39:
>
>> 37: assert(ZAddress::is_marked(addr), "Should be marked");
>> 38: ZMarkThreadLocalStacks* const stacks = ZThreadLocalData::stacks(Thread::current());
>> 39: ZMarkStripe* const stripe = push_local_stripe() && ZThread::is_worker() ?
>
> This introduces a volatile read in every object mark, do you have concurrent mark time mesaurement with this check?
The volatile read has impact on the mark time, the average mark time increased by 4.7%
Jdk17 zgc origin:
RUN RESULT: hbIR (max attempted) = 124943, hbIR (settled) = 121646, max-jOPS = 111199, critical-jOPS = 91446
RUN RESULT: hbIR (max attempted) = 122581, hbIR (settled) = 118652, max-jOPS = 111549, critical-jOPS = 92803
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 119162, max-jOPS = 111055, critical-jOPS = 95202
[2021-03-26T16:49:41.474+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1142.885 / 1689.873 800.652 / 3473.744 800.652 / 3473.744 ms
[2021-03-26T19:03:55.173+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1204.003 / 1975.038 843.825 / 3006.138 843.825 / 3006.138 ms
[2021-03-26T21:14:13.816+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1147.566 / 2477.271 827.487 / 2982.508 827.487 / 2982.508 ms
Jdk17 zgc patch:
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 116479, max-jOPS = 111055, critical-jOPS = 94165
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 116752, max-jOPS = 109749, critical-jOPS = 97445
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 118685, max-jOPS = 111055, critical-jOPS = 93540
[2021-03-30T20:53:35.593+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1198.332 / 3098.680 860.451 / 3198.799 860.451 / 3198.799 ms
[2021-03-30T23:07:36.532+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1155.819 / 1550.257 842.028 / 3189.933 842.028 / 3189.933 ms
[2021-03-31T01:27:26.091+0800][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 1211.280 / 1964.112 883.528 / 3053.926 883.528 / 3053.926 ms
-------------
PR: https://git.openjdk.java.net/jdk/pull/3262
More information about the hotspot-gc-dev
mailing list