RFR: 8327452: G1: Improve scalability of Merge Log Buffers [v5]

Tue Mar 12 11:51:16 UTC 2024

On Mon, 11 Mar 2024 14:19:21 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Hi all,
>> 
>> Please review this change to reduce contention on the DCQS. The parallel work done by the worker threads is very small, therefore contention on the DCQS dominates the Merge Log Buffers phase. In this change, we add a sequential phase to distribute the Log Buffers to the worker threads. This removes the DCQS bottleneck in the highly contended case, at a small cost to the cases with low contention.
>> 
>> Testing Tier 1-3
>> 
>> The graphs below are using Bigramtester at 20gb, "Distribute and Avg Log Buffers" combines the duration of the sequential distribute phase and the avg duration of the parallel `Merge Log Buffers` phase.
>> | Time (ms)  | # Cards |
>> | ------------- | ------------- |
>> | ![128_concurrent_start](https://github.com/openjdk/jdk/assets/69453999/c6663ed9-6b81-4e32-bb1e-17627591406f) | ![128_concurrent_start_cards](https://github.com/openjdk/jdk/assets/69453999/1909564b-2482-4767-a334-9f5612d5cb8c)|
>> | ![128_mixed](https://github.com/openjdk/jdk/assets/69453999/d8dd8933-3aae-4795-b426-b4022bcf1ff2)| ![128_mixed_cards](https://github.com/openjdk/jdk/assets/69453999/ed5b8279-8d70-4dc6-b53d-59bf04b6e12e)|
>> | ![64_concurrent_start](https://github.com/openjdk/jdk/assets/69453999/5fb03cc5-2b86-4b9d-83ff-caa0eb482859) | ![64_concurrent_start_cards](https://github.com/openjdk/jdk/assets/69453999/c62fb05a-dca1-4b3f-895c-62837f7af2f8)|
>> | ![64_mixed](https://github.com/openjdk/jdk/assets/69453999/896c4138-a26b-457d-a5b6-d4b4285c4b8f) |  ![64_mixed_cards](https://github.com/openjdk/jdk/assets/69453999/2a6a234e-f180-4f02-9ad6-3867c5fd1881) |
>> | ![32_mixed](https://github.com/openjdk/jdk/assets/69453999/25a45fa6-f1dd-48e9-85af-702be67f7b9b) | ![32_mixed_cards](https://github.com/openjdk/jdk/assets/69453999/f5d80538-efde-47e4-860f-9b15409085a7) |
>> |  ![8_mixed](https://github.com/openjdk/jdk/assets/69453999/112b23c5-f048-4678-8d04-2c682696b357) | ![8_mixed_cards](https://github.com/openjdk/jdk/assets/69453999/0fdb9b6b-242c-424c-83ef-3d05703f5c42) |
>
> Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Kim Review

Marked as reviewed by tschatzl (Reviewer).

src/hotspot/share/gc/g1/g1RemSet.cpp line 1313:

> 1311:           count += tail->size();
> 1312:           cur = tail->next();
> 1313:         }

Still good. Looking at this balancing code again, maybe a comment about that part ("// Balance dirty card buffers for better work distribution.") or factoring this out into a helper method for documentation purposes would be nice.
It just feels quite a bit of code in a constructor, but just a comment what this is doing is just fine. Or just ignore if you feel it is obvious.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18134#pullrequestreview-1930829966
PR Review Comment: https://git.openjdk.org/jdk/pull/18134#discussion_r1521322464