RFR: 8139457: Array bases are aligned at HeapWord granularity [v50]
Albert Mingkun Yang
ayang at openjdk.org
Tue Aug 15 16:21:30 UTC 2023
On Tue, 15 Aug 2023 15:38:09 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> src/hotspot/share/oops/arrayOop.hpp line 94:
>>
>>> 92: // Returns the offset of the first element.
>>> 93: static int base_offset_in_bytes(BasicType type) {
>>> 94: size_t hs = header_size_in_bytes();
>>
>> Previously, the memory representation of an array is `[header] + [payload]`. The "header" part includes the necessary alignment gap if any. (IOW, the end of the header and the beginning of the payload is adjacent.) The public API has `header_size(...)` and `base_offset_in_bytes(...)`, which corresponds to the header and payload, respectively.
>>
>> With the new change, the header is *not* word-alignment any more. Could the API (and its impl) be adjusted to maintain the original model `array = [header] + [payload]`. Sth like, `header_size_in_bytes(...)` includes the alignment gap and replaces `header_size` in the public API.
>
> I am not sure what you are asking here. We *do* have header_size_in_bytes(), but its meaning is just the header, without gap (if any). The gap is included in base_offset_in_bytes(), which is the offset at which the payload starts, which seems consistent to other similar methods. We could make header_size_in_bytes() so that it includes the gap, but then 1. it would be the same as base_offset_in_bytes() and 2. we still need 'just the header' in places, so we would have to have another method for that. Also, I am not sure where we would need header_size_in_bytes() with gap, because that's usually the places where base_offset_in_bytes() seems semantically better suited.
>
> (Also, note that we still have header_size(BasicType) which returns an word-up-aligned header size which is used by a number of places that require this like minimum and maximum size computations; it's kinda ugly, but having to have to align-up the byte-size in all those places has been ugly as well, I have attempted that in an earlier version of this PR)
On master, `header_size` and `base_offset_in_bytes` are two public API; they refer to the header and payload of an array and have the invariant, header-size-in-bytes == base-offset-in-bytes, because `base_offset_in_bytes(...)` is impled as `header_size(...) * HeapWordSize`.
With the new change, `header_size` and `base_offset_in_bytes` are still two public API, but the invariant is broken, `header_size(...) * HeapWordSize != base_offset_in_bytes(...)`. I am asking if we can expose `header_size_in_bytes` as the public API so that invariant is kept.
> 1. it would be the same as base_offset_in_bytes()
True; the impl on master is just a word-to-byte translation. (They being the same is an advantage, reinforcing the invariant, IMO.)
> 2. we still need 'just the header' in places
Is that a public API? I think the concept of "header" contains everything from array-start to payload-start, from the perspective of users (of this class).
> it's kinda ugly, but having to have to align-up the byte-size in all those places has been ugly as well
I am more concerned about the fact that the public API `header_size` covers more than the header. Having the header-size aligned up to word-size (or anything) should be sth the caller decides.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1294837287
More information about the hotspot-dev
mailing list