Call for Discussion: New Project: Lilliput

Fri Mar 12 20:01:10 UTC 2021

The HotSpot group will gladly participate in sponsoring this project.

Regards,
Vladimir Kozlov

On 3/9/21 6:39 AM, Roman Kennke wrote:
> We would like to propose a new project called Lilliput, with the goal of exploring ways to shrink the object header.
> 
> Goal:
> 1. Reduce the object header to 64 bits. It may be possible to shrink it down to 32 bits as a secondary goal.
> 2. Make the header layout more flexible, i.e. allow some build-time (possibly even run-time) configuration of how we use 
> the bits.
> 
> Motivation:
> In 64-bit HotSpot, Java objects have an object header of 128 bits: a 64-bit multi-purpose header (‘mark’ or ‘lock’) word 
> and a 64-bit class pointer. With typical average object sizes of 5-6 words, this is quite significant: 2 of those words 
> are always taken by the header. If it were possible to reduce the size of the header, we could significantly reduce 
> memory pressure, which directly translates to one or more of (depending on what you care about or what your workload does):
> 
> - Reduced heap usage
> - Higher object allocation rate
> - Reduced GC activity
> - Tighter packing of objects -> better cache locality
> 
> In other words, we could reduce the overall CPU and/or memory usage of all Java workloads, whether it’s a large 
> in-memory database or a small containerized application.
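To put a rough number on the motivation above, here is a minimal sketch of the arithmetic. It assumes the object size stays a whole number of 64-bit words and ignores alignment padding and array headers; the class and method names are illustrative only.

```java
final class HeaderSavings {
    /**
     * Fraction of an object's footprint saved when its header shrinks.
     * All sizes are in 64-bit words; padding effects are ignored (assumption).
     */
    static double savedFraction(int objectWords, int oldHeaderWords, int newHeaderWords) {
        return (double) (oldHeaderWords - newHeaderWords) / objectWords;
    }

    public static void main(String[] args) {
        // From the text: typical objects are 5-6 words, 2 of which are header.
        for (int words = 5; words <= 6; words++) {
            double pct = 100.0 * savedFraction(words, 2, 1);
            System.out.println(words + "-word object, 128-bit -> 64-bit header: "
                    + String.format("%.1f", pct) + "% smaller");
        }
    }
}
```

For the 5-6 word objects mentioned above, dropping one header word removes one fifth to one sixth of each object, roughly 17-20%.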
> 
> 
> The object header is used (and overloaded) for the following purposes:
> 
> - Locking: the lower 3 bits are used to indicate the locking state of an object and the higher bits *may* be used to 
> encode a pointer to a stack-allocated monitor or inflated lock object
> - GC: 4 bits are used for tracking the age of each object (in generational collectors). The whole header *may* be used 
> to store forwarding information, depending on the collector
> - Identity hash-code: Up to 32 bits are used to store the identity hash-code
> - Type information: 64 bits are used to point to the Klass that describes the object type
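The field widths listed above can be sketched as a single packed 64-bit word. This is only an illustration: the widths (3, 4 and 32 bits) come from the text, but the bit positions, the class name and the helper names are assumptions and do not reproduce the actual HotSpot layout.

```java
final class MarkWordSketch {
    // Field widths taken from the text.
    static final int LOCK_BITS = 3;   // locking state
    static final int AGE_BITS  = 4;   // GC age
    static final int HASH_BITS = 32;  // identity hash-code (upper bound)

    // Assumed placement: fields packed back to back, starting at bit 0.
    static final int LOCK_SHIFT = 0;
    static final int AGE_SHIFT  = LOCK_SHIFT + LOCK_BITS;
    static final int HASH_SHIFT = AGE_SHIFT + AGE_BITS;

    static final long LOCK_MASK = (1L << LOCK_BITS) - 1;
    static final long AGE_MASK  = (1L << AGE_BITS) - 1;
    static final long HASH_MASK = (1L << HASH_BITS) - 1;

    /** Packs the three fields into one word; out-of-range bits are dropped. */
    static long pack(int lock, int age, int hash) {
        return ((lock & LOCK_MASK) << LOCK_SHIFT)
             | ((age  & AGE_MASK)  << AGE_SHIFT)
             | ((hash & HASH_MASK) << HASH_SHIFT);
    }

    static int lock(long mark) { return (int) ((mark >>> LOCK_SHIFT) & LOCK_MASK); }
    static int age(long mark)  { return (int) ((mark >>> AGE_SHIFT)  & AGE_MASK); }
    static int hash(long mark) { return (int) ((mark >>> HASH_SHIFT) & HASH_MASK); }

    /** Bumps the age field, saturating at the largest value AGE_BITS can hold. */
    static long incrementAge(long mark) {
        int a = age(mark);
        if (a == AGE_MASK) {
            return mark;
        }
        return (mark & ~(AGE_MASK << AGE_SHIFT)) | ((long) (a + 1) << AGE_SHIFT);
    }

    public static void main(String[] args) {
        long mark = pack(1, 0, 0xCAFEBABE);
        mark = incrementAge(mark);
        System.out.println("lock=" + lock(mark) + " age=" + age(mark)
                + " hash=0x" + Integer.toHexString(hash(mark)));
        System.out.println("bits used: " + (LOCK_BITS + AGE_BITS + HASH_BITS) + " of 64");
    }
}
```

Under these assumptions the lock, age and hash fields occupy 39 bits, so what stands between today's header and a 64-bit one is chiefly the 64-bit class pointer, plus the cases where the whole word is overloaded with a lock or forwarding pointer.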
> 
> 
> We have a wide variety of techniques to explore for allocating and down-sizing header fields:
> 
> - Pointers can be compressed, e.g. if we expect a maximum of, say, 8192 classes, we could, with some careful alignment 
> of Klass objects, compress the class pointer down to 13 bits: 2^13=8192 addressable Klasses. Similar considerations 
> apply to stack pointers and monitors.
> - Instead of using pointers, we could use class IDs that index a lookup table
> - We could backfill fields which are known at compile-time (e.g. alignment gap or hidden fields)
> - We could use backfill fields appended to an object after the GC moved it (e.g. for hashcode)
> - We could use side-tables
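The class-pointer compression idea from the list above can be sketched as follows. This is a hedged illustration, not HotSpot code: it assumes all Klass structures live in one contiguous region starting at `base`, each in a slot aligned to 1 KiB (both assumptions), so an address is fully described by a 13-bit slot index. All names are hypothetical.

```java
final class NarrowKlassSketch {
    static final int  NARROW_BITS = 13;                // from the text: 2^13 = 8192 Klasses
    static final int  MAX_KLASSES = 1 << NARROW_BITS;
    static final int  SLOT_SHIFT  = 10;                // assumption: 1 KiB slot alignment
    static final long SLOT_SIZE   = 1L << SLOT_SHIFT;

    private final long base;  // assumed start address of the Klass region

    NarrowKlassSketch(long base) {
        if ((base & (SLOT_SIZE - 1)) != 0) {
            throw new IllegalArgumentException("base must be slot-aligned");
        }
        this.base = base;
    }

    /** Encodes a slot-aligned Klass address as a 13-bit index. */
    int encode(long klassAddress) {
        long offset = klassAddress - base;
        if (offset < 0 || (offset & (SLOT_SIZE - 1)) != 0) {
            throw new IllegalArgumentException("not a slot-aligned address in the Klass region");
        }
        long index = offset >>> SLOT_SHIFT;
        if (index >= MAX_KLASSES) {
            // The overflow case the constraints mention: a real design needs a fallback here.
            throw new IllegalStateException("more than " + MAX_KLASSES + " Klasses");
        }
        return (int) index;
    }

    /** Decodes a 13-bit index back into the full 64-bit address. */
    long decode(int narrow) {
        return base + ((long) (narrow & (MAX_KLASSES - 1)) << SLOT_SHIFT);
    }

    public static void main(String[] args) {
        NarrowKlassSketch codec = new NarrowKlassSketch(0x7f00_0000_0000L);
        long address = 0x7f00_0000_0000L + 5 * SLOT_SIZE;
        int narrow = codec.encode(address);
        System.out.println("narrow=" + narrow + " roundTrip=" + (codec.decode(narrow) == address));
    }
}
```

The class-ID variant from the same list would replace the shift-and-add in `decode` with a lookup in a table indexed by the ID, trading the alignment requirement for an extra memory load.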
> 
> 
> We also have a bewildering number of constraints. To name a few:
> - Performance
> - If we limit, e.g., the number of classes/monitors/etc. that we can encode, we need a way to deal with overflow
> - Requires changes in assembly across all supported platforms (also consider 32 bits)
> - Interaction with other projects like Panama, Loom, maybe Leyden, etc
> 
> And a couple of opportunities for further work (possibly outside of this project):
> - If we leave arraylength in its own 64-bit field, perhaps we should consider 64-bit addressable arrays?
> - Improvements to hashcode. Maybe salt it to avoid repetition of nursery objects, maybe expand it to 64 or even 128 bits.
> 
> 
> I would propose myself as the project lead for Lilliput. :-)
> For initial committers I think we need all the expertise in runtime and GC that we can get. Off the top of my head I’m 
> thinking of John Rose, Dave Dice, Andrew Dinn, Andrew Haley, Erik Österlund, Aleksey Shipilev, Coleen Phillimore, Stefan 
> Karlsson, Per Liden. Please suggest anybody else who you think should be involved. (Also let me know if you would like 
> to join yourself, or if you are listed here but have no interest.)
> 
> 
> My initial work plan is to:
> 
> - Brainstorm, collect ideas and propose techniques in the Wiki
> - Come up with a proof of concept as quickly as possible
>    - Use ZGC: no header usage
>    - Use existing class-pointer compression
>    - Shrink hashcode
> - Work from there, decide-as-we-go with insights from previous steps
> 
> 
> Please let me know what you think!
> 
> Thanks,
> Roman
>