Sharing the markword (aka Valhalla's markword use)

Fri Mar 8 14:59:14 UTC 2024

(Response in line)

From: Kennke, Roman <rkennke at amazon.de>
Date: Wednesday, March 6, 2024 at 1:43 PM
To: Dan Heidinga <dan.heidinga at oracle.com>
Cc: Thomas Stüfe <thomas.stuefe at gmail.com>, valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>, lilliput-dev at openjdk.org <lilliput-dev at openjdk.org>
Subject: [External] : Re: Re: Sharing the markword (aka Valhalla's markword use)
Hi Dan,

> 4 free bits – just what we need! =)

Yes, but my questions about constraints remain.
Apologies.  I thought I had responded in the other thread.  Three of the bits can be rediscovered / determined from the Klass metadata and I’m not aware of any requirements on the specific bit position.  The 4th bit – the is_larval bit – needs to be preserved and we envision uses of it by future (de)duplication processes that may run as part of the gc.  Particular bit position doesn’t matter as long as we can find the bit.
Does that answer your questions?

>  One of the challenges for J9 over the last few years has been finding header bits.  J9 went to a “single word header”, uses the has_been_hashed | has_been_moved trick for identity hash, and has a more complicated scheme for which classes get lock words (or not) and where they’re put per class (layout gaps are a pretty common choice).  This has limited the ability of new features to get header bits without decreasing gc age bits or stealing a bit from existing use cases.
>  Small headers are good but how small is small enough?  This isn’t a Valhalla concern per se, but a general observation that if we are bumping the limit on bits now, we are limiting all future projects’ ability to use bits.  There’s a trade-off here that my experience fighting for bits on J9 says we should be making cautiously.
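
For readers unfamiliar with the has_been_hashed | has_been_moved trick mentioned above, here is a minimal sketch of the general technique. It is not J9's actual implementation: the flag values, the `Obj` model, the `address_hash` function and the `saved_hash` slot are all assumptions made for illustration. The idea is that the identity hash is derived from the object's address until the collector moves the object, at which point the old hash is preserved in an extra slot.

```python
# Two header bits are enough to track identity-hash state (values are arbitrary).
HAS_BEEN_HASHED = 1 << 0
HAS_BEEN_MOVED = 1 << 1


class Obj:
    """Toy object model: an address, two flag bits, and an optional extra slot."""

    def __init__(self, address):
        self.address = address
        self.flags = 0
        self.saved_hash = None  # models the slot appended when a hashed object moves


def address_hash(address):
    # Hypothetical hash function: drop alignment bits, keep 31 bits.
    return (address >> 3) & 0x7FFFFFFF


def identity_hash(obj):
    if obj.flags & HAS_BEEN_MOVED:
        # The object was hashed and later moved: use the preserved value.
        return obj.saved_hash
    # Otherwise the hash is derived from the current address; remember that
    # someone has observed it so the collector must preserve it on a move.
    obj.flags |= HAS_BEEN_HASHED
    return address_hash(obj.address)


def gc_move(obj, new_address):
    if obj.flags & HAS_BEEN_HASHED and not obj.flags & HAS_BEEN_MOVED:
        # First move after hashing: save the old address-based hash.
        obj.saved_hash = address_hash(obj.address)
        obj.flags |= HAS_BEEN_MOVED
    obj.address = new_address
```

The cost is that a hashed object grows when it is first moved, which is the price paid for not reserving hash bits in every header.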

I understand. My point of view is that it doesn’t make a lot of sense to not do a tremendously useful optimization like 8- or 4-byte headers, just because we may need something in the future. But I agree that we should try to make it flexible enough to accommodate future needs, or at least make it relatively easy.
I’m offering some caution on the value of shrinking the header as small as possible today as it severely restricts future options.  I know you and Thomas and others are taking a principled look at this space to find the best encodings and I want to be a voice for keeping additional slack in the encoding for future projects (Java’s not done after all =)

Let’s first lay out a rough plan about what we’re planning to do with class-pointers.

Currently, compressed class-pointers are 32 bits wide, which, from what I can tell, is more than enough to address as many classes as anybody would want.

For 8-byte headers, our plan is to use 22 bits and still address enough classes (see Thomas’ part of the FOSDEM presentation). This leaves enough bits for a 31-bit hashcode, 4 Valhalla bits, 4 age bits, 1 self-forwarding bit and 2 lock bits.
That’s great.  If we start by planning on 4 Valhalla bits then we can always look at clever encodings as follow-up tasks when the need for more bits surfaces.
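
As a quick arithmetic check of the 8-byte budget above, the field widths can be summed and given bit offsets. This is only a sketch: the widths come from the message, but the field names, their order within the word, and the `layout`/`pack`/`unpack` helpers are assumptions for illustration, not the layout Lilliput actually uses.

```python
# Field widths (in bits) as quoted for the 8-byte header.
# The ordering, lowest bits first, is an assumption made for illustration.
FIELDS_64 = [
    ("lock", 2),
    ("self_forwarding", 1),
    ("age", 4),
    ("valhalla", 4),
    ("hash", 31),
    ("klass", 22),
]


def layout(fields):
    """Return ({name: (shift, mask)}, total_bits) for fields packed from bit 0."""
    table, shift = {}, 0
    for name, width in fields:
        table[name] = (shift, (1 << width) - 1)
        shift += width
    return table, shift


LAYOUT_64, TOTAL_BITS_64 = layout(FIELDS_64)


def pack(values):
    """Pack a {name: value} dict into one header word; unset fields are zero."""
    word = 0
    for name, value in values.items():
        shift, mask = LAYOUT_64[name]
        if value & ~mask:
            raise ValueError(f"{name} does not fit in its field")
        word |= value << shift
    return word


def unpack(word, name):
    """Extract one field from a header word."""
    shift, mask = LAYOUT_64[name]
    return (word >> shift) & mask
```

With these widths `TOTAL_BITS_64` comes out at exactly 64, so the 8-byte layout as described has no spare bits beyond the four set aside for Valhalla.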
We have a plan for how we can make the class-pointer much smaller. Based on the observation that most workloads only require a few tens of thousands of classes, let’s say 15 bits is ‘enough’ and not wasteful. Workloads that require more often generate those classes, which often don’t have many instances (e.g. only 1). In order to address arbitrarily many classes, we would require one more bit that indicates that the class-pointer should be loaded from a dedicated field - this could be compressed 32 bits, or uncompressed 64 bits, or somehow be combined with the 15 class-bits from the header; we can determine the details later. We can also determine the exact number of bits later. The point is that we can make the class-pointer small enough to accommodate, say, 99% of all workloads and flexible enough for everything else. Using those 16 class-bits (15 plus the indicator bit) in a 4-byte header, and assuming 2(hash)+4(Valhalla)+4(age)+3(locking), we’d still have a reserve of 3 bits. We could use them for more classes, or reserve them for future use. It does not seem unreasonable to make the class-bits even smaller, in case we ever need more bits.
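
The 4-byte budget and the overflow idea can be sketched as follows. The numbers are the ones given above; everything else (the names, placing the indicator bit directly above the 15 class bits, and modelling the dedicated field as a separate value) is an assumption for illustration, since the message explicitly leaves those details open.

```python
HEADER_BITS = 32
# Bit budget from the message: 16 class bits = 15 ID bits + 1 indicator bit.
BUDGET = {"class": 16, "hash": 2, "valhalla": 4, "age": 4, "locking": 3}
RESERVE_BITS = HEADER_BITS - sum(BUDGET.values())  # 32 - 29

CLASS_ID_BITS = 15
OVERFLOW_BIT = 1 << CLASS_ID_BITS      # assumed position of the indicator bit
CLASS_ID_MASK = OVERFLOW_BIT - 1


def encode_class(class_id):
    """Return (class_field, side_field); side_field is None if the ID fits."""
    if class_id < 0:
        raise ValueError("class_id must be non-negative")
    if class_id <= CLASS_ID_MASK:
        return class_id, None
    # Too many classes for 15 bits: flag it and keep the ID in a dedicated field.
    return OVERFLOW_BIT, class_id


def decode_class(class_field, side_field=None):
    """Recover the class ID from the header field, consulting the side field if flagged."""
    if class_field & OVERFLOW_BIT:
        if side_field is None:
            raise ValueError("indicator bit set but no dedicated field present")
        return side_field
    return class_field & CLASS_ID_MASK
```

Only objects whose class ID overflows pay for the extra field; the common case decodes with a single mask.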

Also, if we ever need a bit only for arrays (wasn’t one of the Valhalla bits only for arrays?), we’d have one bit that we can use from the array-length field.
Two of the bits are for arrays and in Valhalla we’ve also floated the idea of clever encodings like using the array-length.  The challenge is that such encodings float through the rest of the system and have surprising knock-on effects.  As an example, we looked at overloading the high bit of array length in J9 and, although I don’t recall the details, there was an issue with some array bounds checks getting slower as they required additional masks.
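
To illustrate the kind of extra mask mentioned here, the sketch below models a 32-bit array-length field whose high bit is reused as a flag. It is not a reconstruction of what J9 tried (those details are not in this thread); the constant and function names are hypothetical. The point is only that every consumer of the length field, including the bounds check, must now strip the flag first.

```python
LENGTH_BITS = 32
FLAG_BIT = 1 << (LENGTH_BITS - 1)   # high bit of the length field, reused as a flag
LENGTH_MASK = FLAG_BIT - 1          # remaining 31 bits hold the real length


def make_length_field(length, flag):
    """Combine a real array length with the stolen flag bit."""
    if not 0 <= length <= LENGTH_MASK:
        raise ValueError("length does not fit in 31 bits")
    return length | (FLAG_BIT if flag else 0)


def in_bounds_plain(index, length_field):
    # Conventional check: treats the whole field as the length.
    return 0 <= index < length_field


def in_bounds_masked(index, length_field):
    # With a stolen bit, the length must be masked on every check.
    return 0 <= index < (length_field & LENGTH_MASK)
```

For a flagged array of length 4, `in_bounds_plain(10, field)` wrongly accepts index 10, while `in_bounds_masked(10, field)` rejects it; the additional AND on a hot path is the sort of knock-on cost described above.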
It sounds like we have a path forward with bits available for Valhalla without affecting Lilliput’s plans.  Do you see any concerns not listed here?
--Dan

Let me know what you think.

Roman

>  --Dan
>  From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Date: Tuesday, March 5, 2024 at 11:29 AM
> To: Dan Heidinga <dan.heidinga at oracle.com>
> Cc: valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>, lilliput-dev at openjdk.org <lilliput-dev at openjdk.org>, Kennke, Roman <rkennke at amazon.de>
> Subject: [External] : Re: Sharing the markword (aka Valhalla's markword use)
> Hi Dan,
>  In addition to Roman's answer, we plan to reduce the Klasspointer to 22 bits [1]. For 64-bit headers, this would give us 31-bit i-hash back and still leave us with 4 unused bits.
>  Unfortunately, outside of our heads and the FOSDEM talk [2] we gave this year I think this is not documented anywhere yet. I feel guilty but have been swamped since returning from FOSDEM.
>  [1] https://github.com/openjdk/lilliput/pull/128
> [2] https://fosdem.org/2024/schedule/event/fosdem-2024-3015-project-lilliput-compact-object-headers/
>  Cheers, Thomas
>  On Tue, Mar 5, 2024 at 4:06 PM Dan Heidinga <dan.heidinga at oracle.com> wrote:
> (Cross-posting to both valhalla-dev and lilliput-dev)
>  Valhalla’s markword usage and Lilliput’s desire to shrink the object header require some careful collaboration to find a design that lets both projects move forward.  I’d like to lay out the current Valhalla markword use so that we can look at how it fits with Lilliput’s plans and ensure we can make the right trade-offs together.  There may be clever encodings (reusing the locking bits?) but it makes sense to do that together – hence the cross-post.
>  Valhalla uses 4 markword bits [0], two for instances and two for arrays.  The bits are:
>  * is_larval: This bit is dynamic and indicates the state change from when a value instance can be updated (during construction) to when it becomes immutable.  We need this bit to ensure correctness of off-label construction and debugging APIs as well as to ensure values being constructed are never aliased with fully constructed values.
>  * is_value_type: This bit is static and is used to identify value instances.  This bit speeds up acmp and other identity-sensitive operations so that non-value code doesn’t experience a regression.  Before values, acmp could use pointer comparison to test if two instances were the same.  With values, a “substitutability” test is required.
>  For value instances, neither the hash code nor their locking bits are required.  Value hash codes are computed similarly to the substitutability test and values cannot be locked or synchronized on.
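
A rough sketch of why a header bit keeps acmp cheap for non-value code: the pointer comparison stays the first step, and the slower substitutability test runs only when both operands carry the is_value_type bit. The `Obj` model and the field-wise `substitutable` function are simplified assumptions for illustration, not HotSpot's implementation (for example, primitive fields are compared with `==` here).

```python
class Obj:
    """Toy object: a header flag, a class name, and a tuple of field values."""

    def __init__(self, is_value_type, klass, fields=()):
        self.is_value_type = is_value_type  # models the static header bit
        self.klass = klass
        self.fields = tuple(fields)


def field_equal(x, y):
    # Reference fields compare via acmp; anything else is treated as a primitive.
    if isinstance(x, Obj) and isinstance(y, Obj):
        return acmp(x, y)
    if isinstance(x, Obj) or isinstance(y, Obj):
        return False
    return x == y


def substitutable(a, b):
    """Simplified test: same class and pairwise-equal fields."""
    if a.klass != b.klass or len(a.fields) != len(b.fields):
        return False
    return all(field_equal(x, y) for x, y in zip(a.fields, b.fields))


def acmp(a, b):
    if a is b:
        return True                     # same pointer: done, as before values
    if a is None or b is None:
        return False
    if not (a.is_value_type and b.is_value_type):
        return False                    # an identity object is only equal to itself
    return substitutable(a, b)          # slow path, value instances only
```

Two distinct identity objects fail after a couple of flag checks, so only comparisons between value instances pay for the field walk.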
>  Arrays of values are identity objects and, like other reference array types, are compatible with Object[] or interface arrays (assuming the values implement the interface).
>  We use two bits to identify the special cases of arrays:
>  * is_flat_array: Indicates that the array elements have been flattened and that data must be copied in/out of the array when accessing the elements.
>  * is_null_free_array: indicates that the array rejects null elements and will throw an exception when code attempts to store null into the array.
>  Arrays – being identity objects – need both their hash codes and locking bits.
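
The two array bits described above can be pictured with a small model. This is a sketch under loud assumptions: the bit positions, the `Array` class, the use of dicts to stand in for value payloads, and `ValueError` as a stand-in for whatever exception the VM throws are all made up for illustration.

```python
IS_FLAT_ARRAY = 1 << 0        # bit positions are arbitrary in this sketch
IS_NULL_FREE_ARRAY = 1 << 1


class Array:
    """Toy array whose header flags change how stores and loads behave."""

    def __init__(self, length, flags=0, fill=None):
        self.flags = flags
        self.slots = [fill] * length

    def store(self, index, value):
        if value is None and self.flags & IS_NULL_FREE_ARRAY:
            # Stand-in for the exception thrown on a null store.
            raise ValueError("null stored into null-free array")
        if self.flags & IS_FLAT_ARRAY and value is not None:
            value = dict(value)       # flattened: copy the payload into the array
        self.slots[index] = value

    def load(self, index):
        value = self.slots[index]
        if self.flags & IS_FLAT_ARRAY and value is not None:
            return dict(value)        # flattened: copy the payload back out
        return value
```

A plain reference array hands back the same object it was given, whereas the flat variant copies data on both paths, which is why code accessing an `Object[]` needs a cheap way to tell the cases apart.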
>  This is what Valhalla is using in the current prototypes.  Early performance experiments led us to this design and we’re working on reconfirming those results.
>  How does this approach fit with the current Lilliput plans?
>  --Dan
>  [0] https://github.com/openjdk/valhalla/blob/1f410430df6ef023b82d971a10ee4f0f8dfa2d6b/src/hotspot/share/oops/markWord.hpp#L69
