[master] RFR: Implement self-forwarding of objects that preserves header bits
Roman Kennke
rkennke at openjdk.java.net
Wed Jul 7 20:10:05 UTC 2021
On Wed, 7 Jul 2021 17:29:50 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> In a few places in GCs we self-forward objects to indicate promotion failures. This is problematic with Lilliput because it irreversably overrides header bits.
>>
>> I propose to address this problem by using the recently-free'd biased-locking bit to indicate self-forwarding, without actually writing the ptr-to-self in the header.
>>
>> A few notes:
>> - The code in g1FullGCCompactionPoint.cpp keeps degenerating into ugly mess. This certainly warrants some rewriting.
>> - We have some naked header-decodings, which get tidied-up. This could also be brought upstream.
>> - cas_forward_to() kinda duplicates forward_to_atomic(), and has been replaced by the new forward_to_self_atomic(). It could be unduplicated upstream, too.
>>
>> An alternative *may be* to preserve the header of self-forwarded objects in a side-table (like PreservedMarksStack) instead. This may be possible but hairy: we could not access the compressed-klass* in the upper bits until the header gets restored. (This also affects calls to size(), etc).
>>
>> The ex-biased-locking-bit may still be used in regular operation. It only acts as self-forwarding-indicator when the lower two bits are also set. It requires the usual marks-preservation if we do this.
>>
>> We might want to have a discussion which project would need header bits, and how to realistically allocate them. #4522 mentions Valhalla as possible taker of the BL header bit. We may be able to free one or two bits when we compress the klass* even more. For example, we currently use 25 bits for i-hash, and 32 bits for nklass*. We usually want 32bits for i-hash instead. This would leave 25bits for nklass, which can address 268MB of Klass*-space (usual compression scheme), or 32million classes (table-lookup), or something in-between if we use fixed-size Klass (seems unrealistic though). Taking away another bit mean halving the addressable space.
>>
>> (It would be kinda-nice to have the BL-bit for Shenandoah, too, and for a similar purpose: indicate evacuation failure. But we do have a working solution, however that is ugly and affects performance a little.)
>
> src/hotspot/share/gc/g1/g1FullGCPrepareTask.cpp line 171:
>
>> 169: // We only re-prepare objects forwarded within the current region, so
>> 170: // skip objects that are already forwarded to another region.
>> 171: if (obj->is_forwarded()) {
>
> Yeah, this seems like a nice micro-optimization for upstream. Do you want to take care of it?
Naa, this doesn't really do much, unless we extract the header once, and then check is_forwarded() on it, and avoid decoding if it's not forwarded - but even then this seems to be too micro ;-)
Here it is not an optimization but a necessary change, because we cannot blindly decode the forwardee anymore.
> src/hotspot/share/oops/markWord.hpp line 135:
>
>> 133: static const uintptr_t marked_value = 3;
>> 134:
>> 135: static const uintptr_t self_forwarded_value = 1 << self_forwarded_shift;
>
> This actually looks like `self_forwarded_in_place`? Because `self_forwarded_value` is just `1`.
It's actually self_forwarded_mask_in_place, I removed the _value and simplified the set_self_forwarded() method.
> src/hotspot/share/oops/oop.inline.hpp line 314:
>
>> 312: return forwardee(old_mark);
>> 313: } else {
>> 314: compare = old_mark;
>
> Is it even possible to get to this branch? I.e. when CAS fails, aren't we guaranteed to see `old_mark.is_marked()`?
I don't think so. I thought a little too far here, Shenandoah-style territory. I removed this branch, and the whole loop.
-------------
PR: https://git.openjdk.java.net/lilliput/pull/10
More information about the lilliput-dev
mailing list