RFR: 8260267: vmTestbase/gc/gctests/FinalizeLock/FinalizeLock.java fails with fatal error: Mark stack space exhausted

王超 github.com+25214855+casparcwang at openjdk.java.net
Wed Mar 31 02:31:17 UTC 2021


On Tue, 30 Mar 2021 23:35:00 GMT, Hui Shi <hshi at openjdk.org> wrote:

>> The ZGC mark stack can be exhausted when the live object graph is very large, because the object graph is visited in a pseudo-BFS fashion (a ZGC GC worker pushes children to other stripes, which behaves like BFS, so the maximum mark stack size is O(num_objects)). Other collectors such as G1 and Shenandoah visit the object graph in DFS order, and their mark stack usage rarely exceeds 32M, presumably because the maximum mark stack size is related to the depth of the object graph, O(depth_of_object_graph). A simplified traversal sketch illustrating the difference follows the reproducer and the command below.
>> 
>> The following test case reproduces the crash:
>> import java.util.HashMap;
>> import java.util.HashSet;
>> 
>> class Test2 {
>>   public static int NODE_COUNT = 25000000;
>>   public static int NODE_COUNT_UB = 1 << 25;
>>   public static int STRING_GEN_INDEX = 1;
>>   public static int WATCHER_COUNT = 100;
>>   public static int MAP_COUNT = 8;
>>   public static int BIN_COUNT = 16;
>>   public static Object[] watchTable = new Object[MAP_COUNT];
>> 
>>   public static class Watcher {
>>     public static long counter = 0;
>>     public long index;
>>     Watcher() {
>>       index = counter;
>>       counter++;
>>     }
>>   }
>> 
>>   public static int gen_name() {
>>     int cur = STRING_GEN_INDEX++;
>>     int remainder = cur % BIN_COUNT;
>>     cur = cur - remainder + NODE_COUNT_UB * remainder;
>>     return cur;
>>   }
>> 
>>   public static HashMap<Integer, HashSet<Watcher>> get_map(int index) {
>>     return (HashMap<Integer, HashSet<Watcher>>)watchTable[index];
>>   }
>> 
>>   public static void gen_watcher() {
>>     int name = gen_name();
>>     HashSet<Watcher> set = new HashSet<Watcher>();
>>     for (int i = 0; i < WATCHER_COUNT; i++) {
>>        set.add(new Watcher());
>>     }
>>     for (int i = 0; i < MAP_COUNT; i++) {
>>       get_map(i).put(name, set);
>>     }
>>   }
>> 
>>   public static void setup() {
>>     for (int i = 0; i < MAP_COUNT; i++) {
>>       watchTable[i] = new HashMap<Integer, HashSet<Watcher>>();
>>     }
>>     for (int i = 0; i < NODE_COUNT; i++) {
>>       gen_watcher();
>>     }
>>   }
>> 
>>   public static void main(String[] args) {
>>     setup();
>>     for (int i = 0; i < 10; i++) {
>>       System.gc();
>>     }
>>     System.out.println("pass");
>>   }
>> }
>> 
>> The command used to run the test case is:
>> java -XX:+UseZGC -Xmx280g -XX:ZCollectionInterval=10  -Xlog:gc*=debug:file=gc.log:time,level,tags:filesize=5g Test2
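>> 
>> For illustration, a minimal sketch of the difference, independent of ZGC's striped mark stacks: it traverses a plain tree with one explicit deque, used either in LIFO (DFS-like) or FIFO (BFS-like) order, and records the peak number of pending references. The class and method names are only illustrative.
>> 
>> import java.util.ArrayDeque;
>> import java.util.ArrayList;
>> import java.util.List;
>> 
>> class TraversalSketch {
>>   static class Node { final List<Node> children = new ArrayList<>(); }
>> 
>>   // Build a complete tree with the given branching factor and depth.
>>   static Node buildTree(int branching, int depth) {
>>     Node n = new Node();
>>     if (depth > 0) {
>>       for (int i = 0; i < branching; i++) {
>>         n.children.add(buildTree(branching, depth - 1));
>>       }
>>     }
>>     return n;
>>   }
>> 
>>   // Traverse with one explicit deque; children are always pushed to the front.
>>   // lifo=true pops the newest entry (DFS-like order), lifo=false pops the
>>   // oldest entry (BFS-like order). Returns the peak deque size.
>>   static int traverse(Node root, boolean lifo) {
>>     ArrayDeque<Node> pending = new ArrayDeque<>();
>>     pending.push(root);
>>     int peak = 1;
>>     while (!pending.isEmpty()) {
>>       Node n = lifo ? pending.pollFirst() : pending.pollLast();
>>       for (Node c : n.children) {
>>         pending.push(c);
>>       }
>>       peak = Math.max(peak, pending.size());
>>     }
>>     return peak;
>>   }
>> 
>>   public static void main(String[] args) {
>>     Node root = buildTree(10, 6);                              // ~1.1M nodes
>>     System.out.println("DFS peak: " + traverse(root, true));   // roughly branching * depth
>>     System.out.println("BFS peak: " + traverse(root, false));  // roughly the widest level, O(num_objects)
>>   }
>> }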
>
> src/hotspot/share/gc/z/zMark.inline.hpp line 39:
> 
>> 37:   assert(ZAddress::is_marked(addr), "Should be marked");
>> 38:   ZMarkThreadLocalStacks* const stacks = ZThreadLocalData::stacks(Thread::current());
>> 39:   ZMarkStripe* const stripe = push_local_stripe() && ZThread::is_worker() ?
> 
> This introduces a volatile read on every object mark; do you have concurrent mark time measurements with this check?

The volatile read has an impact on the mark time; the average mark time increased by 4.7%.

JDK 17 ZGC, baseline:
RUN RESULT: hbIR (max attempted) = 124943, hbIR (settled) = 121646, max-jOPS = 111199, critical-jOPS = 91446
RUN RESULT: hbIR (max attempted) = 122581, hbIR (settled) = 118652, max-jOPS = 111549, critical-jOPS = 92803
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 119162, max-jOPS = 111055, critical-jOPS = 95202
[2021-03-26T16:49:41.474+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1142.885 / 1689.873    800.652 / 3473.744    800.652 / 3473.744    ms
[2021-03-26T19:03:55.173+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1204.003 / 1975.038    843.825 / 3006.138    843.825 / 3006.138    ms
[2021-03-26T21:14:13.816+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1147.566 / 2477.271    827.487 / 2982.508    827.487 / 2982.508    ms


JDK 17 ZGC, patched:
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 116479, max-jOPS = 111055, critical-jOPS = 94165
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 116752, max-jOPS = 109749, critical-jOPS = 97445
RUN RESULT: hbIR (max attempted) = 130653, hbIR (settled) = 118685, max-jOPS = 111055, critical-jOPS = 93540
[2021-03-30T20:53:35.593+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1198.332 / 3098.680    860.451 / 3198.799    860.451 / 3198.799    ms
[2021-03-30T23:07:36.532+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1155.819 / 1550.257    842.028 / 3189.933    842.028 / 3189.933    ms
[2021-03-31T01:27:26.091+0800][info ][gc,stats       ]       Phase: Concurrent Mark                               0.000 / 0.000      1211.280 / 1964.112    883.528 / 3053.926    883.528 / 3053.926    ms
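
For reference, assuming the percentage is derived from the total-average column of the Concurrent Mark phase lines above: (800.652 + 843.825 + 827.487) / 3 ≈ 824.0 ms for the baseline versus (860.451 + 842.028 + 883.528) / 3 ≈ 862.0 ms with the patch, an increase of roughly 4.6%, in line with the reported figure.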

-------------

PR: https://git.openjdk.java.net/jdk/pull/3262
