Far classes

Wed Jun 26 08:54:22 UTC 2024

On 18 Jun 2024, at 5:23, Thomas Stüfe wrote:

> We dedicate one bit in the nKlass for "is-far-class". For far classes, we
> store the Klass* at the end of the object. Then we encode the offset of the
> Klass* slot in the remaining nKlass bits.

You could also use a joint encoding [1] on more than one bit,
so as to encode more near classes in the same number of
bits.  What’s the trade-off?  The bits other than the
joint encoding would encode the offset, so the offsets
would be shorter.  In fact, you don’t need long offsets
at all; there’s no sense in tying the max offset to
the max number of near classes, which is what the naive
selector bit does: 2^15 near classes AND 2^15 max offset,
if you have 16 bits and burn one bit for the far class
indicator.

Instead, use (say) 6 joint bits out of 16 total, and
then you get 2^16-2^10 near classes, and a maximum far
class offset of 2^10.
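
To make the arithmetic concrete, here is a small sketch of
the two splits.  The constant names and is_far() are made up
for illustration, and it assumes the all-zero selector
pattern is the one that marks far classes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameters: 16 nKlass bits, 6 of them joint selector bits.
constexpr unsigned kNKlassBits   = 16;
constexpr unsigned kSelectorBits = 6;
constexpr unsigned kOffsetBits   = kNKlassBits - kSelectorBits;   // 10

// Codes whose top 6 bits are all zero are far-class codes (assumed pattern).
constexpr uint32_t kFarCodes  = 1u << kOffsetBits;                // 2^10
// Every other 16-bit code names a near class.
constexpr uint32_t kNearCodes = (1u << kNKlassBits) - kFarCodes;  // 2^16 - 2^10

// For comparison: the naive single selector bit halves the code space.
constexpr uint32_t kNaiveNearCodes = 1u << (kNKlassBits - 1);     // 2^15

// True if the code selects far class mode.
constexpr bool is_far(uint16_t nKlass) {
  return (nKlass >> kOffsetBits) == 0;
}

static_assert(kNearCodes == 64512, "2^16 - 2^10 near classes");
static_assert(kNaiveNearCodes == 32768, "2^15 near classes with one selector bit");
```

So the joint encoding nearly doubles the near class count,
at the price of capping the far class offset at 2^10.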

[1] https://cr.openjdk.org/~jrose/jvm/joint-bit-encodings.html

> That depends on max. object size. How large does an object get? I found no
> limit in specs. However, the size of an object depends on its members, and
> we have an utf-8 CP-entry per member, and the number of CP entries is
> limited to 2^16. So, an object cannot have more than 65535 members (a bit
> less, actually). Therefore, I think it cannot be larger than 64k heap words.

Objects can get pathologically large because there is no limit
to the depth of the superclass chain, and each superclass can
contribute tens of thousands of fields.

But this should not be understood as a constraint on the
size of the nKlass field, or the number of near classes.

> …
> We could even get down to 16 bits for the MW-stored nKlass, if we agree on
> aligning the Klass* slot trailing the object to 16 bytes. In that case, we
> can encode the Klass* slot offset with 15 bits and have the "is-far-class"
> as the 16th bit. Then, we could extract the nKlass from the MW with a
> 16-bit move. This would cost us: On average, another four bytes of overhead
> per far-class object, and a halved value range for near class IDs.

You are getting closer here to a better design:  The key move
is to constrain where the far-class Klass* can occur in the
object layout.  As long as the header bits left over after
the far-class selector bit or bits can distinguish all the
possible locations of the far class (Klass*) field in the
object layout, you are good.

So the problem boils down to what is the best way to constrain
the location of the Klass* field.  Obviously it is aligned
word-wise, so it’s not just any byte offset.  More importantly,
we can simply demand that it is less than some fixed constant,
such as 2^10 words (taking the above example again, the one
with 16 nKlass bits and 6 joint encoded far-class selector
bits).

Can we meet this demand?  Yes.  The key is to allocate
a far class pointer in any class whose layout is large
enough to overflow the offset limit. This is done even
if the class itself does not need a far class slot.
The slot is wasted in that case, but it is just one
word out of 2^10, so the max waste is about 0.1%.

Jumbo classes are super-rare, anyway.

That way, if a subclass of the jumbo class ever needs
a far class word, there’s a spot prepared for it,
within the maximum offset.

If the class is jumbo and final, there is no need
to allocate a far class slot for subclasses.  But
if it is jumbo and non-final then it will require
a far class slot EVEN IF it is lucky enough to
acquire a near class ID.  The far class slot is
for the subclasses that are not so lucky and
cannot get a near class ID.  They will need that
far class slot, and they won’t be able to allocate
it for themselves.

BTW, if the class is abstract there is no need to allocate
a near class ID: Only concrete classes need near class
IDs.  But abstract jumbo classes WILL need far class
slots, again for their subclasses that are unlucky,
and cannot get a near class ID.
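
The allocation rule from the last few paragraphs could be
sketched roughly like this.  It is only an illustration:
ClassInfo, its fields, needs_far_slot() and
kMaxFarOffsetWords are hypothetical names, not HotSpot code,
and it assumes layout_words counts the header and all
inherited fields.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical limit: the far class slot must sit at a word offset below this.
constexpr size_t kMaxFarOffsetWords = 1u << 10;

// Hypothetical summary of a class at layout time.
struct ClassInfo {
  size_t layout_words;        // instance size in words, without a new far slot
  bool   is_final;
  bool   is_abstract;
  bool   has_near_id;         // meaningful only for concrete classes
  bool   super_has_far_slot;  // some superclass already reserved the slot
};

// True if layout should reserve a far class slot in this class.
bool needs_far_slot(const ClassInfo& c) {
  if (c.super_has_far_slot) return false;  // inherited, nothing to do

  // A concrete class without a near class ID needs the slot for itself.
  bool far_itself = !c.is_abstract && !c.has_near_id;
  if (far_itself) return true;

  // A jumbo class that can be subclassed reserves the slot for unlucky
  // subclasses, even if it holds a near class ID or is abstract.
  bool jumbo = c.layout_words >= kMaxFarOffsetWords;
  return jumbo && !c.is_final;
}
```

The placement itself (putting the slot below the limit) is
left out; the sketch only decides whether to reserve one.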

For testing, make the max offset of the far class word
very small, like 10.  That way many classes will be
burdened with the extra field, and you will get a
stress test of the mechanism.  Don’t just assume
that there are enough jumbo classes in the world
to test this contraption without a stress mode.
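
A toy illustration of why the stress setting matters, using
made-up layout sizes (in words) and a hypothetical helper
count_burdened():

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count the layouts that reach the far-slot offset limit, i.e. the
// classes that would be burdened with a preallocated far class slot.
size_t count_burdened(const std::vector<size_t>& layout_words,
                      size_t max_far_offset_words) {
  size_t n = 0;
  for (size_t w : layout_words) {
    if (w >= max_far_offset_words) ++n;
  }
  return n;
}
```

With sizes {3, 4, 12, 40, 2000}, a limit of 1024 burdens one
class, while a limit of 10 burdens three of the five.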

The trick of preallocating a far class slot even
before you need it allows you to constrain the
offset of the far class slot.

The other independent trick of using a joint
encoding (of the far class selector pattern)
allows you to have very small far class offsets,
and therefore use almost all of the encoding
power of the nKlass in the header to represent
near classes, which is as it should be.

Continuing the above concrete example, if the 16-bit
nKlass has all zero bits in the top 6 bits, that
selects the far class mode, while one or more
non-zero bits in the top 6 would select the near
class, and all 16 bits would encode the ID of that
near class.

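// Meant as a member function of the object type, so `this` is the object base.
Klass* get_klass(uint16_t nKlass) {
  if ((nKlass >> 10) == 0) {
    // Top 6 bits are zero: far class; nKlass is the word offset of the Klass* slot.
    return ((Klass**)this)[nKlass];
  } else {
    // Near class: codes 2^10 and up index the near class table from 0.
    return NEAR_CLASSES[nKlass - (1<<10)];
  }
}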
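
For experimenting outside the VM, a self-contained variant
of the decoder above might look like the following.  The
Klass struct and the table are stand-ins, and the object is
passed explicitly as obj (an assumption of this sketch)
rather than taken from `this`.

```cpp
#include <cassert>
#include <cstdint>

struct Klass { int id; };                    // stand-in for the VM's Klass

constexpr unsigned kOffsetBits = 10;         // 16 nKlass bits minus 6 selector bits
constexpr uint32_t kFarCodes   = 1u << kOffsetBits;
constexpr uint32_t kNearCodes  = (1u << 16) - kFarCodes;

static Klass* NEAR_CLASSES[kNearCodes];      // hypothetical near class table

// Decode a 16-bit nKlass for the object starting at obj.
Klass* get_klass(const void* obj, uint16_t nKlass) {
  if ((nKlass >> kOffsetBits) == 0) {
    // Far class: nKlass is the word offset of the Klass* slot in the object.
    return static_cast<Klass* const*>(obj)[nKlass];
  }
  // Near class: shift the code down so the table starts at index 0.
  return NEAR_CLASSES[nKlass - kFarCodes];
}
```

A fake object can be an array of pointer-sized words with a
Klass* stored at the chosen word offset.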