RFR: 8139457: Array bases are aligned at HeapWord granularity [v50]

Albert Mingkun Yang ayang at openjdk.org
Tue Aug 15 16:21:30 UTC 2023


On Tue, 15 Aug 2023 15:38:09 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> src/hotspot/share/oops/arrayOop.hpp line 94:
>> 
>>> 92:   // Returns the offset of the first element.
>>> 93:   static int base_offset_in_bytes(BasicType type) {
>>> 94:     size_t hs = header_size_in_bytes();
>> 
>> Previously, the memory representation of an array is `[header] + [payload]`. The "header" part includes the necessary alignment gap if any. (IOW, the end of the header and the beginning of the payload is adjacent.) The public API has `header_size(...)` and `base_offset_in_bytes(...)`, which corresponds to the header and payload, respectively.
>> 
>> With the new change, the header is *not* word-alignment any more. Could the API (and its impl) be adjusted to maintain the original model `array = [header] + [payload]`. Sth like, `header_size_in_bytes(...)` includes the alignment gap and replaces `header_size` in the public API.
>
> I am not sure what you are asking here. We *do* have header_size_in_bytes(), but its meaning is just the header, without gap (if any). The gap is included in base_offset_in_bytes(), which is the offset at which the payload starts, which seems consistent to other similar methods. We could make header_size_in_bytes() so that it includes the gap, but then 1. it would be the same as base_offset_in_bytes() and 2. we still need 'just the header' in places, so we would have to have another method for that. Also, I am not sure where we would need header_size_in_bytes() with gap, because that's usually the places where base_offset_in_bytes() seems semantically better suited.
> 
> (Also, note that we still have header_size(BasicType) which returns an word-up-aligned header size which is used by a number of places that require this like minimum and maximum size computations; it's kinda ugly, but having to have to align-up the byte-size in all those places has been ugly as well, I have attempted that in an earlier version of this PR)

On master, `header_size` and `base_offset_in_bytes` are two public API; they refer to the header and payload of an array and have the invariant, header-size-in-bytes == base-offset-in-bytes, because `base_offset_in_bytes(...)` is impled as `header_size(...) * HeapWordSize`.

With the new change, `header_size` and `base_offset_in_bytes` are still two public API, but the invariant is broken, `header_size(...) * HeapWordSize != base_offset_in_bytes(...)`.  I am asking if we can expose `header_size_in_bytes` as the public API so that invariant is kept.

> 1. it would be the same as base_offset_in_bytes()

True; the impl on master is just a word-to-byte translation. (They being the same is an advantage, reinforcing the invariant, IMO.)

> 2. we still need 'just the header' in places

Is that a public API? I think the concept of "header" contains everything from array-start to payload-start, from the perspective of users (of this class).

> it's kinda ugly, but having to have to align-up the byte-size in all those places has been ugly as well

I am more concerned about the fact that the public API `header_size` covers more than the header. Having the header-size aligned up to word-size (or anything) should be sth the caller decides.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1294837287


More information about the hotspot-dev mailing list