RFR: 8260497: Shenandoah: Improve SATB flushing

Wed Jan 27 15:47:55 UTC 2021

On Wed, 27 Jan 2021 10:31:47 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Currently, we periodically force flushing of SATB queues. This works by activating a flag every 100ms in every thread, which causes that thread to enqueue its SATB buffer the next time it overflows, even if it doesn't meet its threshold after filtering. This is somewhat problematic when a thread does not actually overflow its SATB queue in time. The whole point of the exercise is to try and avoid having too much left-over work when we reach final-mark.
>> 
>> We can do better than that: when concurrent mark is done, we can handshake all threads, let them flush their respective SATB queues, and re-enter the concurrent mark loop, repeating until flushing yields no more work. Experiments show that it usually takes 1-3 flushes to clean out leftover work properly.
>> 
>> I ran benchmarks: 3 high-pressure preset runs of SPECjbb2015, 10 minutes each:
>> 
>> baseline:
>> Finish Mark                  =    0,251 s (a =      688 us) (n =   364) (lvls, us =      125,      486,      621,      824,     4156)
>> Finish Mark                  =    0,338 s (a =      922 us) (n =   366) (lvls, us =      131,      494,      652,      852,    72948)
>> Finish Mark                  =    0,257 s (a =      699 us) (n =   368) (lvls, us =      111,      492,      645,      826,     4447)
>> 
>> patched:
>> Finish Mark                  =    0,112 s (a =      301 us) (n =   370) (lvls, us =      115,      207,      250,      281,     3709)
>> Finish Mark                  =    0,107 s (a =      292 us) (n =   368) (lvls, us =      107,      209,      248,      287,     3329)
>> Finish Mark                  =    0,114 s (a =      310 us) (n =   367) (lvls, us =      115,      211,      254,      285,     3819)
>> 
>> It reliably lowers all timings for finish-mark. It also doesn't cause any regressions in throughput.
>> 
>> Testing:
>>  - [x] hotspot_gc_shenandoah
>>  - [x] benchmarks
>
> src/hotspot/share/gc/shared/satbMarkQueue.hpp line 118:
> 
>> 116:   // Return true if the queue's buffer should be enqueued, even if not full.
>> 117:   // The default method uses the buffer enqueue threshold.
>> 118:   bool should_enqueue_buffer(SATBMarkQueue& queue);
> 
> Why drop `virtual` here? Is it because Shenandoah was the only virtual override of it, and now we can do the non-virtual call?

Yes. IIRC we introduced that when we upstreamed Shenandoah, and can drop it again, thus restoring the original non-virtual version.

> src/hotspot/share/gc/shenandoah/shenandoahSATBMarkQueueSet.cpp line 59:
> 
>> 57: void ShenandoahSATBMarkQueueSet::enqueue_completed_buffer(BufferNode* node) {
>> 58:   SATBMarkQueueSet::enqueue_completed_buffer(node);
>> 59:   Atomic::inc(&_enqueued_count);
> 
> I believe `SATBMarkQueueSet` already tracks this, and we could instead use `SATBMarkQueueSet::completed_buffers_num`?

Ohh nice! Will give it a try!

> src/hotspot/share/gc/shenandoah/shenandoahSATBMarkQueueSet.hpp line 35:
> 
>> 33: class ShenandoahSATBMarkQueueSet : public SATBMarkQueueSet {
>> 34: private:
>> 35:   volatile int _enqueued_count;
> 
> I have a suspicion that `int` would overflow at some point in the long-running application. `size_t` would fit better, but then see the other comment that `SATBMQ` already tracks it itself.

Right. (I only ever compare `before != after`, so overflow would be ok, but it doesn't matter because I'll remove it.)
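
For readers following along, here is a rough, standalone sketch of the flush-and-retry idea from the PR description, including the `before != after` comparison mentioned above. It is not the patch itself: `MutatorThread`, `QueueSet`, `flush_all`, `drain` and `mark_until_quiescent` are all made-up names for illustration, the handshake is modeled as a plain loop over threads, and no real concurrency is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are HotSpot types.
struct MutatorThread {
  std::vector<int> satb_buffer;  // partially filled thread-local SATB buffer
};

struct QueueSet {
  std::deque<std::vector<int>> completed;  // buffers waiting for the marker
  std::size_t enqueued_total = 0;          // only ever compared before/after

  void enqueue(std::vector<int>&& buf) {
    completed.push_back(std::move(buf));
    ++enqueued_total;  // unsigned, so wrap-around is well defined
  }
};

// Models the handshake: every thread hands over its buffer, full or not.
void flush_all(std::vector<MutatorThread>& threads, QueueSet& qs) {
  for (MutatorThread& t : threads) {
    if (!t.satb_buffer.empty()) {
      qs.enqueue(std::move(t.satb_buffer));
      t.satb_buffer.clear();  // leave the moved-from buffer in a known state
    }
  }
}

// Models one pass of the concurrent mark loop over completed buffers.
// Returns the number of entries processed.
std::size_t drain(QueueSet& qs) {
  std::size_t processed = 0;
  while (!qs.completed.empty()) {
    processed += qs.completed.front().size();
    qs.completed.pop_front();
  }
  return processed;
}

// Repeats mark + flush until a flush enqueues nothing new.
// Returns how many flushes were needed.
int mark_until_quiescent(std::vector<MutatorThread>& threads, QueueSet& qs) {
  int flushes = 0;
  for (;;) {
    drain(qs);
    const std::size_t before = qs.enqueued_total;
    flush_all(threads, qs);
    ++flushes;
    if (qs.enqueued_total == before) {
      return flushes;  // flushing yielded no more work
    }
  }
}
```

In this toy model the mutators do not produce new entries while the marker drains, so leftover work is cleaned out after at most two flushes; in the real collector, threads keep running between handshakes, which is why more rounds can be needed.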

-------------

PR: https://git.openjdk.java.net/jdk/pull/2254