RFR: 8247928: Refactor G1ConcurrentMarkThread for mark abort
Thomas Schatzl
tschatzl at openjdk.java.net
Mon Sep 7 09:05:55 UTC 2020
Hi all,
this is a continuation of the review of the same CR at
https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030794.html; the current change is "Version 3"
above, the relevant email is https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-September/030812.html copied
verbatim here again:
Hi Kim,
On 03.09.20 19:17, Kim Barrett wrote:
>> On Sep 3, 2020, at 6:07 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> On 03.09.20 09:24, Kim Barrett wrote:
>>> It seems backward that the phase functions return true to indicate aborted,
>>> overflow, or whatever, and return false for successful completion. And
>>> there's no mention of what the return values are in the description in the
>>> header file.
>>
>> I think the code is easier to read without the need to always negate the result. Ie. in the following example
>>
>> if (!phase1) return;
>> if (!phase2) return;
>>
>> the "!" detracts imho from reading the steps.
>>
>> Another option I can see could be a macro hiding all this, e.g.
>>
>> EXEC(phase1);
>> EXEC(phase2);
>>
>> but I kind of dislike that even more.
>>
>> Added documentation.
>
> I agree a macro is not the way to go here. I'm less bothered by the need for
> lots of preceeding "!" than you are, but I certainly see your point. So I'm
> okay with the code as is, now that there's a comment to set expectations.
>
>> http://cr.openjdk.java.net/~tschatzl/8247928/webrev.1_to_2 (diff)
>> http://cr.openjdk.java.net/~tschatzl/8247928/webrev.2 (full)
>
> I like the overall approach a lot. Thanks for doing this cleanup.
>
> Although I have a number of comments, there's nothing major here.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp
> 190 uint iter = 1;
> 191 while (true) {
> ...
> 209 log_info(gc, marking)("Concurrent Mark Restart for Mark Stack Overflow (iteration #%u)",
> 210 iter++);
>
> I think that's the only use of iter, and that it's not used after the loop.
> How about instead
>
> for (uint iter = 1; true; ++iter) {
>
> and remove the increment in the logging statement.
Done.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp
> 191 while (true) {
> ...
> 211 }
> 212
> 213 Ticks mark_end = Ticks::now();
> 214 log_info(gc, marking)("Concurrent Mark (%.3fs, %.3fs) %.3fms",
> 215 mark_start.seconds(), mark_end.seconds(),
> 216 (mark_end - mark_start).seconds() * 1000.0);
>
> This is a change to the logging. Some consequences I like, others I'm not so
> sure about. You talked about some of this in the RFR description, but I
> want to go through it carefully here.
>
> - This now includes the remark pause in the time, where previously that
> seemed to be intentionally excluded. I don't know the rationale for that
> exclusion, but it was pretty clearly intentional since it intentionally (per
> comment) contorted the code somewhat. So it's hard to decide whether this is
> okay. The remark time could be excluded though by capturing the mark_end
> time just before calling subphase_remark().
Since within the "Concurrent Mark" time span there can be any number of
other (gc) stw pauses, I am sceptical how subtracting only the Remark
pause this could be anything but misleading.
It could be certainly interesting to provide some concurrent duration
for the whole marking, but I think this out of scope here.
I do not know what the old "Concurrent Mark" end message actually
measured: if you look, the first message will not contain remark pause,
the next ones after a mark restart do contain them (so all time
excluding the last Remark pause). It is neither just the concurrent time
nor the full marking time. Not sure how useful that is.
This code simplifies the message to something like: from start of
marking related activity until all marking activity fully completed.
Actually there are still a few inconsistencies in the
(gc=info,gc+marking=info) log output. Let me give an example:
[23.160s][info][gc ] GC(225) Concurrent Cycle
[23.160s][info][gc,marking] GC(225) Concurrent Clear Claimed Marks
[23.160s][info][gc,marking] GC(225) Concurrent Clear Claimed Marks 0.009ms
[23.160s][info][gc,marking] GC(225) Concurrent Scan Root Regions
[23.209s][info][gc,marking] GC(225) Concurrent Scan Root Regions 48.811ms
[23.209s][info][gc,marking] GC(225) Concurrent Mark (23.209s)
^-- only the "Concurrent Mark" messages has start time - I think it is
superfluous, just repeating the output of UL.
[23.209s][info][gc,marking] GC(225) Concurrent Mark From Roots
[...]
[24.792s][info][gc,marking] GC(225) Concurrent Mark From Roots 1583.035ms
[24.792s][info][gc,marking] GC(225) Concurrent Preclean
[24.793s][info][gc,marking] GC(225) Concurrent Preclean 0.379ms
[24.886s][info][gc ] GC(225) Pause Remark 552M->225M(760M) 93.108ms
[24.886s][info][gc,marking] GC(225) Concurrent Mark (23.209s, 24.886s)
1676.831ms
^-- same here with start/end times.
[24.886s][info][gc,marking] GC(225) Concurrent Rebuild Remembered Sets
[...]
[31.592s][info][gc,marking] GC(225) Concurrent Rebuild Remembered Sets
6706.292ms
[31.594s][info][gc ] GC(225) Pause Cleanup 451M->451M(1024M) 0.774ms
[31.594s][info][gc,marking] GC(225) Concurrent Cleanup for Next Mark
[31.600s][info][gc,marking] GC(225) Concurrent Cleanup for Next Mark 6.298ms
[31.600s][info][gc ] GC(225) Concurrent Cycle 8439.504ms
I fixed those two in the next patch next week I will publish using a
github PR...
Another inconsistency is that the Concurrent Mark Abort message does not
contain any duration information (but UL captures the "end" time stamp).
But I would like to defer that (existing) issue.
>
> - It only does this logging once, rather than on each iteration. That seems
> good; as you pointed out in the RFR, multiples are pretty likely handled
> wrong by some uses.
>
> - This logging no longer happens if marking is aborted by remark. That's
> now consistent with other abort cases, so probably an improvement.
Agree to both.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp
> 137 while (!should_terminate()) {
> 138 if (wait_for_next_cycle()) {
> 139 break;
> 140 }
>
> This could instead just be
>
> while (!wait_for_next_cycle()) { ...
>
> But I already thought the sense of the wait_for_next_cycle result was
> backward, and this makes an even stronger case for that. I think better
> would be to reverse the sense of the result and use
>
> while (wait_for_next_cycle()) { ...
+1. Done.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.hpp
> 65 bool phase_concurrent_cycle_start();
>
> Unused and undefined.
>
Done.
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.hpp
> 63 // ConcurrentGCBreakpoints must not be placed before the the root
> 64 // region scan phase too for this reason.
>
> This comment seems misplaced here. I think it belongs in the implementation
> of full_concurrent_cycle_do, in conjunction with the comment about not
> returning before the scan root regions phase.
>
> I think the rest of the comment about STS &etc also seems misplaced here and
> should be similarly moved (more or less where it used to be).
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.hpp
> 66 bool phase_clear_cld_claimed_marks();
>
> The result of this function is unused, because of the restriction against
> returning before the scan root regions phase. I think I'd prefer it return
> void, to make it clear there's nothing to be done with it. Even though that
> makes its signature different from all the phase functions.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.hpp
> 81 // Perform a full concurrent cycle.
>
> This comment seems kind of vacuous. The function also in some sense doesn't
> even do that, since it doesn't include the start/end parts.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMarkThread.hpp
> 70 bool mark_loop_needs_restart() const;
>
> This one seems out of order or otherwise misplaced here. Seems like it
> should be after subphase_remark. Or maybe it shouldn't be in the phase
> function section at all.
>
> ------------------------------------------------------------------------------
>
All fixed.
Testing: tier1 currently running.
Thanks,
Thomas
-------------
Commit messages:
- Version 3
- Version 2
- Version 1
- Version 0
Changes: https://git.openjdk.java.net/jdk/pull/44/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=44&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8247928
Stats: 343 lines in 3 files changed: 186 ins; 136 del; 21 mod
Patch: https://git.openjdk.java.net/jdk/pull/44.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/44/head:pull/44
PR: https://git.openjdk.java.net/jdk/pull/44
More information about the hotspot-gc-dev
mailing list