Reducing class pointer size useful?
Roman Kennke
rkennke at redhat.com
Mon Sep 13 14:19:06 UTC 2021
Hi Thomas,
Yes, indeed, this would be very helpful!
The current state of the prototype is that I'm putting the compressed
Klass* in the upper 32 bits of the header. (The original Klass* is
currently still present in the 2nd word, but unused except for
verification purposes.) The layout is basically: 32 bits for the Klass*,
26 bits for the hashcode, and 6 bits for the rest (locking and GC). Here
it would be nice to have 32-bit hashcodes instead, and 26 bits for the
Klass*.
I'm also working on moving the hashcode out of the header, requiring
only 2 bits for managing the hashcode state, which makes it very
reasonable to consider a header size of 32 bits: 24 bits for the Klass*
and 8 bits for GC+locking+hashcode.
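To make the current 64-bit layout concrete, here is a minimal sketch of packing and unpacking such a header word. The field widths come from the description above; the exact bit positions (locking/GC bits lowest, hashcode above them) and the names pack_header and unpack_header are assumptions for illustration only, not the prototype's actual code.

```python
# Field widths as described in the mail; bit positions are an assumption.
KLASS_BITS, HASH_BITS, REST_BITS = 32, 26, 6
assert KLASS_BITS + HASH_BITS + REST_BITS == 64

HASH_SHIFT = REST_BITS                # hashcode sits above the locking/GC bits
KLASS_SHIFT = REST_BITS + HASH_BITS   # = 32, i.e. the upper 32 bits


def pack_header(narrow_klass, hashcode, rest):
    """Combine the three fields into one 64-bit header word."""
    if not 0 <= narrow_klass < (1 << KLASS_BITS):
        raise ValueError("narrow Klass* does not fit")
    if not 0 <= hashcode < (1 << HASH_BITS):
        raise ValueError("hashcode does not fit")
    if not 0 <= rest < (1 << REST_BITS):
        raise ValueError("locking/GC bits do not fit")
    return (narrow_klass << KLASS_SHIFT) | (hashcode << HASH_SHIFT) | rest


def unpack_header(header):
    """Split a 64-bit header word back into (narrow_klass, hashcode, rest)."""
    rest = header & ((1 << REST_BITS) - 1)
    hashcode = (header >> HASH_SHIFT) & ((1 << HASH_BITS) - 1)
    narrow_klass = header >> KLASS_SHIFT
    return narrow_klass, hashcode, rest
```

Changing the three width constants (for example to 26/32/6) is enough to try out the alternative splits mentioned above.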
So yes, any mechanism to reduce the Klass* to 24 bits (maybe with some
flexibility in case we need more bits, e.g. for Loom or Valhalla) would
be very welcome. My thinking went in a very similar direction to what
you indicated (larger alignments for the Klass objects), and John Rose
sketched some more ideas in his reply.
Are you planning to work on this?
Thanks,
Roman
> Hi,
>
> Would it be of use for Lilliput to shrink the class pointer size below 32
> bits? I did not follow the discussions closely, so I am not sure where the
> current thinking goes.
>
> If yes, maybe we could reduce the pointer size not only by reducing the
> encoding range but by using larger alignments.
>
> We encode with add-and-shift, as we do with compressed oops. Traditionally
> the shift was 3, since sizeof(void*) is the alignment requirement for
> metaspace allocations. This shift was used to enlarge the coverage of class
> pointer encoding from 4GB to 32GB (KlassEncodingMetaspaceMax). But we never
> used this to my knowledge since we limit class space size to 3GB at most.
> And nobody needs 32GB class space anyway. So there was never a reason to
> cover more than (3GB + <cds size>). Unless I missed something, the shift
> has been useless. In fact, we recently removed the shift if CDS is on
> (JDK-8265705) to solve an unrelated aarch64 issue, and nothing bad happened.
>
> But we could use the shift, not to enlarge the encoding range but to reduce
> the class pointer size. And we could use a larger shift value. For example,
> let's say we shift 8 bits. Then cut off those bits and reduce the class
> pointer to 24 bits.
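
As a rough illustration of the add-and-shift encoding with a shift of 8 and a 24-bit result, here is a minimal sketch. The function names and the idea of passing the encoding base as a parameter are assumptions for illustration; this is not HotSpot's implementation.

```python
SHIFT = 8                  # 8-bit shift, i.e. 256-byte alignment
NARROW_BITS = 24           # width of the encoded class pointer
ALIGNMENT = 1 << SHIFT


def encode_klass(addr, base):
    """Encode an aligned Klass address as a 24-bit value relative to base."""
    offset = addr - base
    if offset < 0 or offset % ALIGNMENT != 0:
        raise ValueError("address is below base or not 256-byte aligned")
    narrow = offset >> SHIFT
    if narrow >= (1 << NARROW_BITS):
        raise ValueError("address is outside the encodable range")
    return narrow


def decode_klass(narrow, base):
    """Recover the full address from the 24-bit encoded value."""
    return base + (narrow << SHIFT)


# Size of the address range that 24 bits with an 8-bit shift can cover.
ENCODING_RANGE = (1 << NARROW_BITS) * ALIGNMENT
```

The low 8 bits of every aligned offset are zero, so dropping them loses no information; the range covered stays at 4GB.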
>
> The resulting alignment would be 256 bytes. Applied to all metaspace
> allocations such an alignment would be prohibitively expensive, since most
> allocations are very small. But if we apply this larger alignment to the
> class space only, leave the rest of the metaspace alone, it is not so bad.
> Before JEP 387, using different alignments would have been difficult to
> implement, but metaspace coding is much more modular now, and using
> different alignments for the different regions can be done.
>
> So we apply the larger alignment only to Klass structures. Klass structures
> are large, and the relative loss due to alignment would matter less. They
> are variable-sized, but sizes are clustered between ~512 bytes and ~1K. They
> can get much larger than that, but that is rare. Alignment loss would be
> between 0 and 255 bytes, let's say 127 on average. For a typical larger app
> of 10000 classes, this would waste ~1.2MB. Whether that is acceptable
> depends on what positive effect the smaller compressed class pointer has on
> project Lilliput.
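
The waste estimate can be reproduced with a few lines. This sketch assumes the padding per Klass is uniformly distributed over 0..alignment-1 bytes, which is a simplification; align_up and expected_waste are hypothetical helper names.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


def padding(size, alignment):
    """Bytes lost when a structure of the given size is padded to alignment."""
    return align_up(size, alignment) - size


def expected_waste(num_classes, alignment):
    """Total expected padding, assuming uniformly distributed padding."""
    return num_classes * (alignment - 1) / 2


waste_bytes = expected_waste(10_000, 256)
print(f"{waste_bytes / (1024 * 1024):.2f} MiB")
```

Real Klass sizes are not uniformly distributed, so measuring padding() over the actual size histogram of an application would give a better number.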
>
> ---
>
> One could argue that using an 8-bit shifted class pointer means it stops
> being a pointer and becomes an index into a table of 256-byte slots,
> populated with variable-sized Klass structures. With Klass sizes clustered
> between 512 bytes and 1K, each Klass would occupy 2..4 slots on average. The
> 24-bit pointer is enough to address 16 million slots, hence on average 4..8
> million Klass structures, still covering a 4GB total range.
>
> We could further slim down the class pointer if we agree on a lower maximum
> number of classes. E.g. with 22 bits, we could address 4 million slots and
> house about 1..2 million classes (at 2..4 slots per Klass), still allowing
> for a maximum encoding range of 1GB.
>
> We could play around with these variables. E.g. with a larger shift of 10
> bits (1KB alignment), most Klass structures would occupy just one slot. We
> would have to live with a somewhat higher alignment waste of 0...1023 bytes,
> but could now reduce the encoded class pointer to 20 bits, still being able
> to address 1 million slots, or close to 1 million classes, with the total
> encoding range still covering 1GB.
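
To play with these variables, a small sketch like the following derives slot count, encoding range, and a rough class capacity from a pointer width and a shift. It assumes every Klass is between 512 bytes and 1K and ignores the rare larger ones; capacity is a hypothetical helper name.

```python
def capacity(narrow_bits, shift, klass_sizes=(512, 1024)):
    """Return (slots, range_bytes, min_classes, max_classes).

    narrow_bits: width of the encoded class pointer
    shift:       encoding shift, so each slot is 2**shift bytes
    klass_sizes: assumed smallest and largest typical Klass size in bytes
    """
    slot_bytes = 1 << shift
    slots = 1 << narrow_bits
    range_bytes = slots * slot_bytes
    # Slots needed per Klass, rounded up to whole slots.
    slots_per_klass = [-(-size // slot_bytes) for size in klass_sizes]
    min_classes = slots // max(slots_per_klass)
    max_classes = slots // min(slots_per_klass)
    return slots, range_bytes, min_classes, max_classes


for bits, shift in [(24, 8), (22, 8), (20, 10)]:
    slots, range_bytes, lo, hi = capacity(bits, shift)
    print(f"{bits} bits, shift {shift}: {slots} slots, "
          f"{range_bytes >> 30} GiB range, {lo}..{hi} classes")
```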
>
> ---
>
> I think this approach is a variant of the
> Klass-structures-in-a-table-and-store-the-index approach, but it allows for
> those rare Klass structures to be larger than a single table slot and it
> has a much larger max. cap on the number of classes than if we were just to
> limit the encoding range. To me this matters somewhat because I have seen
> production installations where the number of classes was in the low 100,000s.
> I don't think the 8192 limit cited in the Lilliput Wiki is practical.
>
> If I am right this approach should not require a lot of changes:
> - we would need to modify metaspace to use a separate alignment for the
> class space
> - we may have to fix class pointer encoding for the various platforms if
> they don't work with larger shifts out of the box, or are inefficient. E.g.
> on x64, we use LEAQ to encode pointers, and LEAQ allows for a max. shift of
> 3, so for shift=8 we may need to use separate add and shift.
> - CDS may need some work too, since the Klass structures in the CDS region
> need to be aligned to the larger alignment as well.
>
> Hope I did not make some gross miscalculation somewhere, but that's my
> idea. What do you think?
>
> Thanks, Thomas
>