Call for Discussion: New Project: Lilliput

Fri Mar 12 22:17:37 UTC 2021

On 3/9/21 9:39 AM, Roman Kennke wrote:
> We would like to propose a new project called Lilliput, with the goal 
> of exploring ways to shrink the object header.
>
> Goal:
> 1. Reduce the object header to 64 bits. It may be possible to shrink 
> it down to 32 bits as a secondary goal.
> 2. Make the header layout more flexible, i.e. allow some build-time 
> (possibly even run-time) configuration of how we use the bits.
>
> Motivation:
> In 64-bit HotSpot, Java objects have an object header of 128 bits: a 
> 64-bit multi-purpose header (‘mark’ or ‘lock’) word and a 64-bit class 
> pointer. With typical average object sizes of 5-6 words, this is quite 
> significant: 2 of those words are always taken by the header. If it 
> were possible to reduce the size of the header, we could significantly 
> reduce memory pressure, which directly translates to one or more of 
> (depending on what you care about or what your workload does):
>
> - Reduced heap usage
> - Higher object allocation rate
> - Reduced GC activity
> - Tighter packing of objects -> better cache locality
>
> In other words, we could reduce the overall CPU and/or memory usage of 
> all Java workloads, whether it’s a large in-memory database or a small 
> containerized application.
>
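
To put rough numbers on the motivation above, here is a back-of-the-envelope sketch. It assumes 64-bit words and ignores object alignment padding, so real savings will differ; the class and method names are made up for illustration.

```java
// Illustrative arithmetic only: fraction of average object size saved
// by shrinking the header. Alignment padding is deliberately ignored.
final class HeaderSavingsSketch {
    static final int WORD_BITS = 64; // assumed word size on 64-bit HotSpot

    // avgObjectWords: average object size in words, header included
    // oldHeaderBits / newHeaderBits: header size before and after shrinking
    static double savedFraction(double avgObjectWords, int oldHeaderBits, int newHeaderBits) {
        double savedWords = (oldHeaderBits - newHeaderBits) / (double) WORD_BITS;
        return savedWords / avgObjectWords;
    }

    public static void main(String[] args) {
        // 128 -> 64 bits on 5- and 6-word objects
        System.out.println(savedFraction(5, 128, 64));
        System.out.println(savedFraction(6, 128, 64));
        // 128 -> 32 bits on 5- and 6-word objects
        System.out.println(savedFraction(5, 128, 32));
        System.out.println(savedFraction(6, 128, 32));
    }
}
```

Under these assumptions a 64-bit header saves about 17-20% of the average object, and a 32-bit header about 25-30%.
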
>
> The object header is used (and overloaded) for the following purposes:
>
> - Locking: the lower 3 bits are used to indicate the locking state of 
> an object and the higher bits *may* be used to encode a pointer to a 
> stack-allocated monitor or inflated lock object
> - GC: 4 bits are used for tracking the age of each object (in 
> generational collectors). The whole header *may* be used to store 
> forwarding information, depending on the collector
> - Identity hash-code: Up to 32 bits are used to store the identity 
> hash-code
> - Type information: 64 bits are used to point to the Klass that 
> describes the object type
>
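
For readers who want to see the overloading concretely, here is a minimal sketch of packing lock, age and hash bits into one 64-bit word. The field widths for lock and age follow the list above; the exact bit positions and the 31-bit hash width are assumptions for illustration, not a statement of the actual HotSpot layout.

```java
// Illustrative only. Assumed layout, low to high bits:
// 3 lock bits | 4 age bits | 1 unused bit | 31 hash bits | rest unused.
final class MarkWordSketch {
    static final int LOCK_BITS = 3;
    static final int AGE_BITS = 4;
    static final int HASH_BITS = 31;      // assumption: fits within "up to 32 bits"

    static final int AGE_SHIFT = LOCK_BITS;
    static final int HASH_SHIFT = 8;      // assumption: hash starts at bit 8

    static final long LOCK_MASK = (1L << LOCK_BITS) - 1;
    static final long AGE_MASK = (1L << AGE_BITS) - 1;
    static final long HASH_MASK = (1L << HASH_BITS) - 1;

    // Combine the three fields into one word; out-of-range bits are masked off.
    static long pack(int lock, int age, int hash) {
        return ((hash & HASH_MASK) << HASH_SHIFT)
             | ((age & AGE_MASK) << AGE_SHIFT)
             | (lock & LOCK_MASK);
    }

    static int lock(long mark) { return (int) (mark & LOCK_MASK); }
    static int age(long mark)  { return (int) ((mark >>> AGE_SHIFT) & AGE_MASK); }
    static int hash(long mark) { return (int) ((mark >>> HASH_SHIFT) & HASH_MASK); }
}
```

The sketch leaves out the cases where the whole word is replaced by a lock pointer or forwarding information; those are exactly what makes shrinking the header tricky.
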
>
> We have a wide variety of techniques to explore for allocating and 
> down-sizing header fields:
>
> - Pointers can be compressed, e.g. if we expect a maximum of, say, 
> 8192 classes, we could, with some careful alignment of Klass objects, 
> compress the class pointer down to 13 bits: 2^13=8192 addressable 
> Klasses. Similar considerations apply to stack pointers and monitors.
> - Instead of using pointers, we could use class IDs that index a 
> lookup table
> - We could backfill fields which are known at compile-time (e.g. 
> alignment gap or hidden fields)
> - We could use backfill fields appended to an object after the GC 
> moved it (e.g. for hashcode)
> - We could use side-tables
>
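
The class-ID idea in the list above could be sketched roughly as follows. This is a toy model in plain Java, using strings in place of Klass objects; the class name, the 13-bit width taken from the 8192-class example, and the OVERFLOW sentinel are all assumptions for illustration, not a proposed design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: a header stores a small class ID instead of a 64-bit pointer,
// and the ID indexes a lookup table.
final class ClassIdTableSketch {
    static final int ID_BITS = 13;                 // assumption from the 8192-class example
    static final int MAX_CLASSES = 1 << ID_BITS;   // 2^13 = 8192
    static final int OVERFLOW = -1;                // hypothetical "table is full" signal

    private final List<String> idToName = new ArrayList<>();
    private final Map<String, Integer> nameToId = new HashMap<>();

    // Returns the existing ID, assigns the next free one, or reports overflow.
    int idFor(String className) {
        Integer existing = nameToId.get(className);
        if (existing != null) {
            return existing;
        }
        if (idToName.size() >= MAX_CLASSES) {
            return OVERFLOW; // caller needs a fallback, e.g. a wider encoding
        }
        int id = idToName.size();
        idToName.add(className);
        nameToId.put(className, id);
        return id;
    }

    // Resolves an ID back to the class it stands for.
    String nameFor(int id) {
        return idToName.get(id);
    }
}
```

The extra table load on every class lookup is the cost to measure, and the overflow branch is where a real implementation would need a fallback path.
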
>
> We also have a bewildering number of constraints. To name a few:
> - Performance
> - If we limit, e.g., the number of classes/monitors/etc. that we can 
> encode, we need a way to deal with overflow
> - Requires changes in assembly across all supported platforms (also 
> consider 32-bit platforms)
> - Interaction with other projects like Panama, Loom, maybe Leyden, etc
>
> And a couple of opportunities for further work (possibly outside of 
> this project):
> - If we leave arraylength in its own 64-bit field, perhaps we should 
> consider 64-bit addressable arrays?
> - Improvements to hashcode. Maybe salt it to avoid repetition of 
> nursery objects, maybe expand it to 64 or even 128 bit.
>
>
> I would propose myself as the project lead for Lilliput. :-)
> For initial committers I think we need all expertise in runtime and GC 
> that we can get. Off the top of my head, I’m thinking of John Rose, 
> Dave Dice, Andrew Dinn, Andrew Haley, Erik Österlund, Aleksey 
> Shipilev, Coleen Phillimore, Stefan Karlsson, Per Liden. Please 
> suggest anybody who you think should be involved in this too. (Or 
> yourself if you want to be in, or tell me if you have no interest in it.)

Yes, please. I'd like to be a committer. I did some experiments with a 
klass pointer indirection, which could effectively be a klass index. The 
performance numbers for throughput weren't that bad, but we didn't have 
the bandwidth to investigate further at the time. Also, I kept running 
into Solaris _sbrk malloc at the time.

In JDK 18, we're going to be removing BiasedLocking, so that removes one 
use of the markWord.

Thanks,
Coleen

>
>
> My initial work plan is to:
>
> - Brainstorm, collect ideas and propose techniques in the Wiki
> - Come up with a proof of concept as quickly as possible
>   - Use ZGC: no header usage
>   - Use existing class-pointer compression
>   - Shrink hashcode
> - Work from there, decide-as-we-go with insights from previous steps
>
>
> Please let me know what you think!
>
> Thanks,
> Roman
>