[master] RFR: JDK-8325104: Lilliput: Shrink Classpointers [v3]

Tue Mar 26 14:01:38 UTC 2024

On Mon, 25 Mar 2024 14:51:14 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Hi,
>> 
>> I wanted to get input on the following improvement for Lilliput. Testing is still ongoing, but things look really good, so this patch is hopefully near its final form (barring any objections from reviewers, of course).
>> 
>> Note: I have a companion patch prepared for upstream, minus the markword changes. I will attempt to get that one upstream quickly in order to not have a large delta between upstream and lilliput, especially in Metaspace.
>> 
>> ## High-Level Overview
>> 
>> (for a short sequence of slides, please see https://github.com/tstuefe/fosdem24/blob/master/classpointers-and-liliput.pdf - these accompanied a talk we held at FOSDEM 24).
>> 
>> We want to reduce the bit size of narrow Klass to free up bits in the MarkWord. 
>> 
>> We cannot just reduce the Klass encoding range size (well, we could, and maybe we will later, but for now we decided not to). We instead increase the alignment Klass is stored at, and use that alignment shadow to store other information.
>> 
>> In other words, this patch changes the narrow Klass Pointer to a Klass ID, since now (almost) every value in its value range points to a different class. Therefore, we use the value range of nKlass much more efficiently.
>> 
>> We then use the newly freed bits in the MarkWord to restore the iHash to 31 bits: 
>> 
>> 
>> [ 22-bit nKlass | 31-bit iHash | 4 free bits | age | fwd | lck ]
>> 
>> nKlass gets reduced to 22 bits. Identity hash gets re-inflated to 31 bits. Preceding iHash are now 4 unused bits. Rest is unchanged.
>> 
>> (Note: I originally wanted to swap iHash and nKlass such that either of them could be loaded with a 32-bit load, but I found that tricky since C2 seems to rely on the nKlass offset in the Markword being > 0.)
>> 
>> ## nKlass reduction:
>> 
>> The reduction in nKlass size is made by only storing them at 10-bit aligned addresses. That alignment (1KB) works well in practice since Klass - although var sized - typically is between 512 bytes and 1KB in size. Outliers are possible, but the size distribution is bell-curvish [1], so far-away outliers are very rare. 
>> 
>> To not lose memory to alignment waste, metaspace is reshaped to handle arbitrarily aligned allocations efficiently. Basically, we allow the non-Klass arena of a class loader to steal the alignment waste storage from the class arena. So, alignment waste blocks are filled with non-Klass metadata. That works very well in practice since non-Klass metadata is numerous and fine-granular compared to the big Klass bloc...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits:
> 
>  - Roman feedback - small stuff
>  - Merge branch 'master' into Smaller-ClassPointers
>  - revert COH archive generation
>  - Remove files that accidentally slipped in
>  - Merge branch 'master' into Smaller-ClassPointers
>  - Merge
>  - Merge commit 'c1281e6b45ed167df69d29a6039d81854c145ae6~1' into Smaller-ClassPointers
>  - Fix Typo
>  - Better CDS arch generation
>  - Fix error in COH archive generation
>  - ... and 10 more: https://git.openjdk.org/lilliput/compare/b2fcfb73...1260f2d6

Hi Thomas, I've started to read through the patch but am far from done. I'm sending a few early comments, questions, and suggestions.

src/hotspot/share/oops/compressedKlass.cpp line 49:

> 47: address CompressedKlassPointers::_base = (address)-1;
> 48: int CompressedKlassPointers::_shift = -1;
> 49: size_t CompressedKlassPointers::_range = (size_t)-1;

Doesn't this put 0x00000000FFFFFFFF in this size_t? I guess this works with the code using it, but it is not obvious that this is what we intended to put into the _range variable as the (uninitialized) value.

Related to this: how important are these initialization checks? They seem to add casts and somewhat odd code. Once this is stable, wouldn't it be nice to clean this up? That would also let _tiny_cp be a bool instead of a tri-bool.
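
To make the question about the sentinel concrete, here is a minimal standalone sketch of what the cast evaluates to. It is not HotSpot code: `uninitialized_range` and `sentinel_is_all_ones` are hypothetical names used only for illustration. Under the standard integer conversion rules, the cast applies to the int value -1, so the result should be the largest value of size_t (0xFFFFFFFFFFFFFFFF where size_t is 64 bits wide) rather than 0x00000000FFFFFFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the sentinel in the patch; not the HotSpot declaration.
// The cast converts the int value -1 to size_t, which wraps modulo 2^N.
static const size_t uninitialized_range = (size_t)-1;

// Returns true if the sentinel has every bit set, i.e. equals the maximum size_t.
bool sentinel_is_all_ones() {
  return uninitialized_range == std::numeric_limits<size_t>::max();
}
```

If that reading is right, the sentinel is still distinguishable from any plausible range, but a named constant might make the intent clearer than the bare cast.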

src/hotspot/share/oops/compressedKlass.hpp line 42:

> 40: 
> 41:   // Tiny-class-pointer mode
> 42:   static int _tiny_cp; // -1, 0=true, 1=false

This comment seems wrong. I think it should be `0=false, 1=true`.

src/hotspot/share/oops/compressedKlass.hpp line 50:

> 48:   // Narrow klass pointer bits for an unshifted narrow Klass pointer.
> 49:   static constexpr int narrow_klass_pointer_bits_legacy = 32;
> 50:   static constexpr int narrow_klass_pointer_bits_tinycp = 22;

It might be nice to have a consistent naming between `*_tinycp` and `_tiny_cp`.

src/hotspot/share/oops/compressedKlass.hpp line 128:

> 126: 
> 127:   // The maximum possible shift; the actual shift employed later can be smaller (see initialize())
> 128:   static int max_shift()                 { check_init(_max_shift); return _max_shift; }

FWIW, you could get rid of the variable name duplication by changing this to something like `{ return read_check_init(_max_shift); }` instead.
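
A rough sketch of what such a helper could look like follows. It rests on assumptions: `read_check_init` does not exist in the patch, the struct `SketchCompressedKlassPointers` is a hypothetical stand-in for the real class, and I am assuming that the existing `check_init` asserts that the value differs from the -1 sentinel.

```cpp
#include <cassert>

// Hypothetical helper (not part of the patch): checks that the variable no
// longer holds the assumed "uninitialized" sentinel (T)-1, then returns it.
template <typename T>
static T read_check_init(const T& var) {
  assert(var != (T)-1 && "read before initialization");
  return var;
}

// Hypothetical stand-in for CompressedKlassPointers, reduced to one field.
struct SketchCompressedKlassPointers {
  static int _max_shift;
  // The variable name now appears only once in the accessor.
  static int max_shift() { return read_check_init(_max_shift); }
};

int SketchCompressedKlassPointers::_max_shift = -1;
```

With this shape each accessor names its variable once, and the check cannot be attached to the wrong variable by accident.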

-------------

PR Review: https://git.openjdk.org/lilliput/pull/128#pullrequestreview-1960456794
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1539272989
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1539275135
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1539277333
PR Review Comment: https://git.openjdk.org/lilliput/pull/128#discussion_r1539280234