[master] RFR: 8347711: [Lilliput] Parallel GC support for compact identity hashcode [v4]
Zhengyu Gu
zgu at openjdk.org
Mon Apr 7 13:51:18 UTC 2025
On Mon, 7 Apr 2025 11:12:40 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> Until now, the Parallel GC has not supported Lilliput 2. The big problem has been that the Parallel Full GC is too rigid with respect to object sizes, so we could not make it work with compact identity hashcode, which requires that objects can grow during GC.
>>
>> The PR implements an alternative full GC for Parallel GC, which is more flexible. The algorithm mostly follows G1 and Shenandoah, with the difference that it creates temporary 'regions' (because Parallel GC does not use heap regions), with boundaries such that no object crosses region boundaries, and then after GC fills any gaps at the end of regions with dummy objects.
>>
>> The implementation has a special 'serial' mode, which sets up only 4 regions which exactly match the 4 heap spaces (old, eden, from, to), and performs the forwarding and compaction phases serially to achieve perfect compaction at the expense of performance. (The marking and adjust-refs phases will still be done with parallel workers).
>>
>> I've run the systemgc micro benchmarks. There seem to be only minor differences, which look mostly like an offset of a few milliseconds in the new implementation:
>>
>> Baseline Full GC:
>>
>> AllDead.gc ss 25 31.120 ± 0.447 ms/op
>> AllLive.gc ss 25 83.655 ± 2.238 ms/op
>> DifferentObjectSizesArray.gc ss 25 179.725 ± 1.171 ms/op
>> DifferentObjectSizesHashMap.gc ss 25 186.011 ± 1.409 ms/op
>> DifferentObjectSizesTreeMap.gc ss 25 65.668 ± 3.333 ms/op
>> HalfDeadFirstPart.gc ss 25 64.862 ± 0.696 ms/op
>> HalfDeadInterleaved.gc ss 25 67.764 ± 3.139 ms/op
>> HalfDeadInterleavedChunks.gc ss 25 59.160 ± 1.667 ms/op
>> HalfDeadSecondPart.gc ss 25 66.210 ± 1.167 ms/op
>> HalfHashedHalfDead.gc ss 25 69.584 ± 2.276 ms/op
>> NoObjects.gc ss 25 18.462 ± 0.270 ms/op
>> OneBigObject.gc ss 25 587.425 ± 27.493 ms/op
>>
>>
>> New Parallel Full GC:
>>
>>
>> AllDead.gc ss 25 39.891 ± 0.461 ms/op
>> AllLive.gc ss 25 87.898 ± 1.940 ms/op
>> DifferentObjectSizesArray.gc ss 25 184.109 ± 0.795 ms/op
>> DifferentObjectSizesHashMap.gc ss 25 189.620 ± 2.236 ms/op
>> DifferentObjectSizesTreeMap.gc ss 25 69.915 ± 3.308 ms/op
>> HalfDeadFirstPart.gc ss 25 70.664 ± 0.804 ms/op
>> HalfDeadInterleaved.gc ss 25 71.31...
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Avoid racy update in Klass::expand_for_hash()
> > How does this algorithm deal with objects larger than region size?
>
> Currently not particularly well: it doesn't move them at all (because they don't fit), and it _also_ doesn't move other objects past them. There's a TODO to make it possible to move objects around large objects (they don't have to be > region-sized; something like 1/2-region-sized objects may also be difficult to move). I'll address that in a follow-up PR.
Is this limitation specific to the Parallel GC? Thanks.
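
To make the large-object question concrete, here is a toy Python model of the temporary regions described in the quoted PR text. It is only a sketch, not the code in the PR: the names `Obj`, `layout`, `build_regions` and `filler_size` are hypothetical, sizes are in abstract words, and the rule used to pick boundaries (cut at the first object that would overflow the target size) is an assumption. It shows that boundaries placed only at object starts keep every object inside one region, and that an object larger than the target size ends up alone in an oversized region.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    start: int  # toy address, in words
    size: int   # toy size, in words


def layout(sizes):
    """Lay out objects back to back starting at address 0 (hypothetical helper)."""
    objs, addr = [], 0
    for size in sizes:
        objs.append(Obj(addr, size))
        addr += size
    return objs


def build_regions(objs, target_size):
    """Split a contiguous run of objects into regions of roughly target_size words.

    Boundaries are only placed at object starts, so no object crosses one.
    An object larger than target_size ends up alone in an oversized region.
    """
    regions = []
    if not objs:
        return regions
    region_start = objs[0].start
    for obj in objs:
        end = obj.start + obj.size
        # Close the current region if it is non-empty and this object would overflow it.
        if obj.start > region_start and end - region_start > target_size:
            regions.append((region_start, obj.start))
            region_start = obj.start
    last = objs[-1]
    regions.append((region_start, last.start + last.size))
    return regions


def filler_size(region, compaction_top):
    """Words of dummy-object filler between the last compacted object and the region end."""
    start, end = region
    assert start <= compaction_top <= end
    return end - compaction_top


if __name__ == "__main__":
    objs = layout([3, 3, 3, 10, 2])
    print(build_regions(objs, target_size=8))
```

In this model the 10-word object gets a region of its own, which matches the quoted remark that such objects do not fit elsewhere and are left in place.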
-------------
PR Comment: https://git.openjdk.org/lilliput/pull/195#issuecomment-2783405252