Call for Discussion: New Project: Lilliput

Tue Mar 9 14:39:43 UTC 2021

We would like to propose a new project called Lilliput, with the goal of 
exploring ways to shrink the object header.

Goals:
1. Reduce the object header to 64 bits. It may be possible to shrink it 
down to 32 bits as a secondary goal.
2. Make the header layout more flexible, i.e. allow some build-time 
(possibly even run-time) configuration of how we use the bits.

Motivation:
In 64-bit HotSpot, Java objects have an object header of 128 bits: a 
64-bit multi-purpose header (‘mark’ or ‘lock’) word and a 64-bit class 
pointer. With typical average object sizes of 5-6 words, this is quite 
significant: 2 of those words are always taken by the header. If it were 
possible to reduce the size of the header, we could significantly reduce 
memory pressure, which directly translates to one or more of the 
following (depending on what you care about or what your workload does):

- Reduced heap usage
- Higher object allocation rate
- Reduced GC activity
- Tighter packing of objects -> better cache locality

In other words, we could reduce the overall CPU and/or memory usage of 
all Java workloads, whether it’s a large in-memory database or a small 
containerized application.
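
To put rough numbers on this, here is a back-of-the-envelope sketch. It 
assumes the 5-6 word average object size quoted above and ignores 
alignment padding, so real savings depend on field layout. The class and 
method names are made up for illustration.

```java
// Back-of-the-envelope estimate only: object sizes are the 5-6 word
// averages quoted in the text, not measurements, and alignment is ignored.
final class HeaderSavings {
    /** Fraction of an object's footprint saved by shrinking its header. */
    static double savedFraction(int objectWords, double oldHeaderWords, double newHeaderWords) {
        return (oldHeaderWords - newHeaderWords) / objectWords;
    }

    public static void main(String[] args) {
        for (int words = 5; words <= 6; words++) {
            System.out.printf(
                "%d-word object: 64-bit header saves %.1f%%, 32-bit header saves %.1f%%%n",
                words,
                100 * savedFraction(words, 2, 1),     // 128 -> 64 bits
                100 * savedFraction(words, 2, 0.5));  // 128 -> 32 bits
        }
    }
}
```

Under these assumptions, going from a 2-word to a 1-word header saves 
one word out of every 5-6, i.e. somewhere between a sixth and a fifth of 
the average object.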

The object header is used (and overloaded) for the following purposes:

- Locking: the lower 3 bits are used to indicate the locking state of an 
object and the higher bits *may* be used to encode a pointer to a 
stack-allocated monitor or inflated lock object
- GC: 4 bits are used for tracking the age of each object (in 
generational collectors). The whole header *may* be used to store 
forwarding information, depending on the collector
- Identity hash-code: Up to 32 bits are used to store the identity hash-code
- Type information: 64 bits are used to point to the Klass that 
describes the object type
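
As an illustration of how these fields share one 64-bit word, here is a 
minimal Java sketch of packing and unpacking them with shifts and masks. 
The bit positions (3 lock bits at the bottom, 4 age bits above them, a 
31-bit hash starting at bit 8) are assumptions chosen to match the 
description above; they are not copied from the HotSpot sources, and 
all names are hypothetical.

```java
// Sketch of a multi-purpose 64-bit header word. All shifts and widths
// below are illustrative assumptions, not the authoritative layout.
final class MarkWordSketch {
    static final int LOCK_SHIFT = 0, LOCK_BITS = 3;   // locking state
    static final int AGE_SHIFT  = 3, AGE_BITS  = 4;   // GC age, 0..15
    static final int HASH_SHIFT = 8, HASH_BITS = 31;  // identity hash-code

    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    /** Extracts the field of the given width at the given shift. */
    static long get(long mark, int shift, int bits) {
        return (mark >>> shift) & mask(bits);
    }

    /** Returns a copy of mark with the given field replaced by value. */
    static long set(long mark, int shift, int bits, long value) {
        if ((value & ~mask(bits)) != 0) {
            throw new IllegalArgumentException("value does not fit in " + bits + " bits");
        }
        return (mark & ~(mask(bits) << shift)) | (value << shift);
    }

    static int lockState(long mark)    { return (int) get(mark, LOCK_SHIFT, LOCK_BITS); }
    static int age(long mark)          { return (int) get(mark, AGE_SHIFT, AGE_BITS); }
    static int identityHash(long mark) { return (int) get(mark, HASH_SHIFT, HASH_BITS); }

    public static void main(String[] args) {
        long mark = 0b001L;                                   // some lock state
        mark = set(mark, AGE_SHIFT, AGE_BITS, 5);             // survived 5 collections
        mark = set(mark, HASH_SHIFT, HASH_BITS, 0x1234567L);  // install a hash
        System.out.printf("lock=%d age=%d hash=0x%x%n",
                lockState(mark), age(mark), identityHash(mark));
    }
}
```

The overloading mentioned above (a lock pointer or forwarding pointer 
taking over the upper bits) is exactly what such a fixed layout cannot 
express on its own, which is part of what makes shrinking it tricky.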

We have a wide variety of techniques to explore for allocating and 
down-sizing header fields:

- Pointers can be compressed, e.g. if we expect a maximum of, say, 8192 
classes, we could, with some careful alignment of Klass objects, 
compress the class pointer down to 13 bits: 2^13=8192 addressable 
Klasses. Similar considerations apply to stack pointers and monitors.
- Instead of using pointers, we could use class IDs that index a lookup 
table
- We could backfill fields which are known at compile-time (e.g. 
alignment gap or hidden fields)
- We could use backfill fields appended to an object after the GC moved 
it (e.g. for hashcode)
- We could use side-tables

We also have a bewildering number of constraints. To name a few:
- Performance
- If we limit e.g. the number of classes/monitors/etc. that we can 
encode, we need a way to deal with overflow
- Requires changes in assembly across all supported platforms (also 
consider 32 bits)
- Interaction with other projects like Panama, Loom, maybe Leyden, etc
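
One possible way to deal with the overflow constraint, sketched here 
purely as an assumption and not as a proposed design, is to reserve one 
value of the narrow field as a marker meaning "look in a side table". 
The 13-bit width, the address-keyed map and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a narrow header field whose largest value is reserved to mean
// "the real value lives in a side table". Illustration only.
final class OverflowingField {
    static final int FIELD_BITS = 13;
    static final int OVERFLOW = (1 << FIELD_BITS) - 1;  // reserved marker value

    // Side table keyed by a stand-in for the object's address.
    private final Map<Long, Integer> sideTable = new HashMap<>();

    /** Returns the bits to put in the header for the given value. */
    int store(long objectAddress, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value");
        }
        if (value < OVERFLOW) {
            return value;                      // fits inline
        }
        sideTable.put(objectAddress, value);   // slow path
        return OVERFLOW;
    }

    /** Recovers the value from the header bits, consulting the side table if needed. */
    int load(long objectAddress, int field) {
        if (field != OVERFLOW) {
            return field;
        }
        Integer value = sideTable.get(objectAddress);
        if (value == null) {
            throw new IllegalStateException("overflow marker without side-table entry");
        }
        return value;
    }
}
```

Even this toy version shows where the constraints bite: the slow path 
costs a table lookup, and a moving GC would have to keep the table's 
keys up to date.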

And a couple of opportunities for further work (possibly outside of this 
project):
- If we leave the array length in its own 64-bit field, perhaps we 
should consider 64-bit addressable arrays?
- Improvements to hashcode. Maybe salt it to avoid repetition of nursery 
objects, maybe expand it to 64 or even 128 bits.

I would propose myself as the project lead for Lilliput. :-)
For initial committers I think we need all the expertise in runtime and 
GC that we can get. Off the top of my head I’m thinking of John Rose, 
Dave Dice, Andrew Dinn, Andrew Haley, Erik Österlund, Aleksey Shipilev, 
Coleen Phillimore, Stefan Karlsson, Per Liden. Please suggest anybody 
else who you think should be involved. (Also let me know if you would 
like to be in yourself, or if you are listed above and have no interest.)

My initial work plan is to:

- Brainstorm, collect ideas and propose techniques in the Wiki
- Come up with a proof of concept as quickly as possible
   - Use ZGC: no header usage
   - Use existing class-pointer compression
   - Shrink hashcode
- Work from there, decide-as-we-go with insights from previous steps

Please let me know what you think!

Thanks,
Roman