[master] RFR: Implement self-forwarding of objects that preserves header bits

Aleksey Shipilev shade at openjdk.java.net
Wed Jul 7 17:43:38 UTC 2021

On Mon, 28 Jun 2021 13:49:09 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> In a few places in GCs we self-forward objects to indicate promotion failures. This is problematic with Lilliput because it irreversably overrides header bits.
> I propose to address this problem by using the recently-free'd biased-locking bit to indicate self-forwarding, without actually writing the ptr-to-self in the header.
> A few notes:
>  - The code in g1FullGCCompactionPoint.cpp keeps degenerating into ugly mess. This certainly warrants some rewriting.
>  - We have some naked header-decodings, which get tidied-up. This could also be brought upstream.
>  - cas_forward_to() kinda duplicates forward_to_atomic(), and has been replaced by the new forward_to_self_atomic(). It could be unduplicated upstream, too.
> An alternative *may be* to preserve the header of self-forwarded objects in a side-table (like PreservedMarksStack) instead. This may be possible but hairy: we could not access the compressed-klass* in the upper bits until the header gets restored. (This also affects calls to size(), etc).
> The ex-biased-locking-bit may still be used in regular operation. It only acts as self-forwarding-indicator when the lower two bits are also set. It requires the usual marks-preservation if we do this.
> We might want to have a discussion which project would need header bits, and how to realistically allocate them. #4522 mentions Valhalla as possible taker of the BL header bit. We may be able to free one or two bits when we compress the klass* even more. For example, we currently use 25 bits for i-hash, and 32 bits for nklass*. We usually want 32bits for i-hash instead. This would leave 25bits for nklass, which can address 268MB of Klass*-space (usual compression scheme), or 32million classes (table-lookup), or something in-between if we use fixed-size Klass (seems unrealistic though). Taking away another bit mean halving the addressable space.
> (It would be kinda-nice to have the BL-bit for Shenandoah, too, and for a similar purpose: indicate evacuation failure. But we do have a working solution, however that is ugly and affects performance a little.)

This looks generally fine for the experimental code (i.e. no obvious bugs), I just have a few questions.

src/hotspot/share/gc/g1/g1FullGCPrepareTask.cpp line 171:

> 169:   // We only re-prepare objects forwarded within the current region, so
> 170:   // skip objects that are already forwarded to another region.
> 171:   if (obj->is_forwarded()) {

Yeah, this seems like a nice micro-optimization for upstream. Do you want to take care of it?

src/hotspot/share/oops/markWord.hpp line 135:

> 133:   static const uintptr_t marked_value             = 3;
> 134: 
> 135:   static const uintptr_t self_forwarded_value     = 1 << self_forwarded_shift;

This actually looks like `self_forwarded_in_place`? Because `self_forwarded_value` is just `1`.

src/hotspot/share/oops/oop.hpp line 250:

> 248: 
> 249:   inline void forward_to(oop p);
> 250:   inline void forward_to_self();

Yes, this rename makes total sense for upstream. Want to do it?

src/hotspot/share/oops/oop.inline.hpp line 314:

> 312:       return forwardee(old_mark);
> 313:     } else {
> 314:       compare = old_mark;

Is it even possible to get to this branch? I.e. when CAS fails, aren't we guaranteed to see `old_mark.is_marked()`?


Marked as reviewed by shade (Committer).

PR: https://git.openjdk.java.net/lilliput/pull/10

More information about the lilliput-dev mailing list