RFR: JDK-8274249: ZGC: Bulk free empty relocated pages
Per Liden
pliden at openjdk.java.net
Tue Sep 28 13:10:57 UTC 2021
On Fri, 24 Sep 2021 03:41:27 GMT, 王超 <github.com+25214855+casparcwang at openjdk.org> wrote:
> Similar to JDK-8255237, bulk freeing empty relocated pages can amortize the cost of freeing a page and speed up the relocation stage.
>
> The following are the specjbb2015 results after applying the patch (the tests turn off the option `UseDynamicNumberOfGCThreads`): the average relocation time improves by 14%, and the max relocation time improves by 18%.
>
> patch:
> [2021-09-18T13:11:51.736+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 373.180 / 569.855 275.312 / 569.855 275.312 / 569.855 ms
> [2021-09-18T15:30:07.168+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 381.266 / 577.812 277.272 / 577.812 277.272 / 577.812 ms
> [2021-09-18T17:37:56.305+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 345.037 / 494.135 259.497 / 506.815 259.497 / 506.815 ms
>
>
> origin:
> [2021-09-18T01:01:32.897+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 429.099 / 662.120 327.213 / 759.723 327.213 / 759.723 ms
> [2021-09-18T03:11:10.433+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 413.014 / 613.035 307.625 / 613.035 307.625 / 613.035 ms
> [2021-09-18T05:21:12.743+0800][info][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 411.745 / 642.242 308.986 / 642.242 308.986 / 642.242 ms
1) Instead of checking for the allocator type in the general code, this whole thing could be moved into the ZRelocateSmallAllocator, like this: https://github.com/openjdk/jdk/compare/master...pliden:8274249_zgc_bulk_free_empty_pages
2) How did you arrive at the bulk limit of 32? Did you try other numbers and this worked the best?
3) Freeing in bulk feels like a reasonable thing to do, and I'm sure it will cause less contention on the page cache lock. So this will probably help in the normal case. However, in the case where memory is low, GC workers hogging free memory could cause allocation stalls to be longer than needed and cause in-place relocation to happen more often than needed. This hogging also gets worse the more GC workers we have. So, I'm a bit hesitant to bring this in without some more thought. For example, instead of having a fixed bulk free limit (like 32) we might instead want to look at other ways of limiting the number of times `free_page()` is called. I suspect the main problem with calling `free_page()` comes at the beginning of the relocation phase, where we might be relocating a lot of very sparse pages, i.e. the time between calls to `free_page()` becomes very short. Later in the relocation phase, as pages get less and less sparse and we spend more and more time copying objects, the calls to `free_page()` become less frequent. So, perhaps we could instead track the number of relocated bytes, and call `free_pages()` once we pass some limit (say 2M). Then we get a fairly uniform time between the calls to `free_page()` and avoid hogging memory for too long. This is just a thought, there might be other/better strategies that would work too. One would have to implement and benchmark to figure out which works best.
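To make the relocated-bytes idea concrete, here is a rough standalone sketch. It does not use the real ZGC classes: `Page`, `PageAllocator`, `DeferredFree` and `page_relocated()` are made-up names for illustration, and the 2M limit is just the example value from above, not a tuned number.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a ZGC page; only its size is modelled.
struct Page {
  std::size_t size;
};

// Hypothetical stand-in for the page allocator. Each call to free_pages()
// represents one acquisition of the page cache lock.
struct PageAllocator {
  std::size_t bulk_calls = 0;
  std::size_t pages_freed = 0;

  void free_pages(const std::vector<Page>& pages) {
    bulk_calls++;
    pages_freed += pages.size();
  }
};

// Per-worker helper (assumed design): buffers empty relocated pages and
// frees them in bulk once enough bytes have been relocated since the
// last flush.
class DeferredFree {
private:
  PageAllocator&    _allocator;
  const std::size_t _limit;            // relocated-bytes threshold
  std::size_t       _relocated_bytes;  // bytes copied since last flush
  std::vector<Page> _pending;          // pages waiting to be freed

public:
  DeferredFree(PageAllocator& allocator, std::size_t limit)
    : _allocator(allocator), _limit(limit), _relocated_bytes(0) {}

  // Called when a page has been fully relocated. live_bytes is the
  // number of bytes that were copied out of the page.
  void page_relocated(Page page, std::size_t live_bytes) {
    _pending.push_back(page);
    _relocated_bytes += live_bytes;
    if (_relocated_bytes >= _limit) {
      flush();
    }
  }

  // Frees all buffered pages in one bulk call. Must also be called when
  // the worker finishes, so no pages are left behind.
  void flush() {
    if (!_pending.empty()) {
      _allocator.free_pages(_pending);
      _pending.clear();
    }
    _relocated_bytes = 0;
  }

  std::size_t pending() const { return _pending.size(); }
};
```

With something like this, many sparse pages get batched into one bulk call early in the phase, while dense pages trigger a flush after only a few pages, so the amount of memory a worker holds back is bounded by the work done rather than by a fixed page count.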
-------------
PR: https://git.openjdk.java.net/jdk/pull/5670