RFR: 8253600: G1: Fully support pinned regions for full gc [v4]
Thomas Schatzl
tschatzl at openjdk.java.net
Tue Nov 10 09:58:03 UTC 2020
On Mon, 9 Nov 2020 18:07:48 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:
>> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>>
>> sjohanss review 2
>
> src/hotspot/share/gc/g1/g1FullGCHeapRegionAttr.hpp line 58:
>
>> 56: bool is_pinned_or_closed(HeapWord* obj) const {
>> 57: assert(!is_invalid(obj), "not initialized yet");
>> 58: return get_by_address(obj) >= Pinned;
>
> Better have a static assert that `ClosedArchive > Pinned`.
Okay, will do.
> src/hotspot/share/gc/g1/g1FullGCHeapRegionAttr.hpp line 37:
>
>> 35: static const uint8_t ClosedArchive = 2;
>> 36:
>> 37: static const uint8_t Invalid = 255;
>
> Why use 255 as the default value? Prior to this PR, the default value is 0. I think it's best to keep it intact. (Probably, the compiler can optimize `clear()` better if each element is reset to 0.)
I do not understand what "Prior to this PR the default value is 0" means, can you clarify? The default value for the biased arrays must always be defined by the child classes overriding default_value() so there does not seem to be a pre-defined "prior" value.
Other than that the reason for using 255 is easier discrimination from the valid values in the debugger, just like hotspot uses something like 0xdeadbeef for various kinds of invalid memory.
There is no difference for the compiler for having 0 or 255 except for maybe the part about initializing the default value (on probably older x86 processors).
However compared to the overhead of setting up an efficient byte-wise `clear()` method is much larger than in this case a difference between `xor reg, reg` vs. `movsx reg, -1` (e.g. for x86, the former is/having been a common trick to quickly clear registers). That pales in comparison to the actual work.
Some architectures (ppc, sparc, iirc likely some more) have some special "clear cache line" instruction which may be used to clear memory "quickly" to zero, but it is not used since they often come with large caveats.
Otherwise the clearing of the this small table (one byte per region) is a one-time cost at startup of a full gc operations that may take billions of cpu cycles.
It's not worth thinking about overhead of the clearing in this case imho.
> src/hotspot/share/gc/g1/g1FullCollector.hpp line 76:
>
>> 74: ReferenceProcessorSubjectToDiscoveryMutator _is_subject_mutator;
>> 75:
>> 76: G1FullGCHeapRegionAttr _region_attr_table;
>
> I don't really see the point of this auxiliary data structure. Why can't we just query the underlying region for its type, pinned, open/close archive?
The only reason is performance:
Consider memory access with that auxiliary data structure: it is a single dereference/memory load on a very dense data structure.
Directly querying the HeapRegion* is two dependent dereferences (first getting the HeapRegion*, then to the HeapRegion, then accessing the member) on a fairly large data structure (HeapRegion).
Since these loads are done for every reference (millions of times) the second is much slower, at best filling the cache with useless data as (roughly) a HeapRegion is around a cache line in size (iirc).
Further, with the current encoding of the values in region attribute table, the closed-or-pinned check can be done with a single check instead of some disjunction (ie. region->is_pinned() || region->is_closed(), although it is true that closed regions are always pinned), so saving even more code (and branch locations).
-------------
PR: https://git.openjdk.java.net/jdk/pull/824
More information about the hotspot-dev
mailing list