Java value layout constants
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Nov 30 13:05:36 UTC 2021
Hi all,
there has been a twist in this story. When looking at the PR [1], Paul
and I were reminded of a failure mode which occurred last year [2],
where accessing double elements copied into a segment backed by a byte[]
could sometimes fail (e.g. on x86 platform) because of misaligned access.
The moral of the story (more details below) is that enforcing alignment
on heap segments is hit-and-miss in the current implementation, and can
reveal sharp edges (e.g. some operations might reveal alignment decisions
which are JVM implementation dependent).
Because of this, I'd like to revise our plan, and leave Java layout
constants as they are now (i.e. unaligned) in Java 18, while we fix
handling of alignment and heap segments under the hood. Given the
timeframe, this seems the most sensible choice.
If you are interested in more details, please continue reading below.
The issue in [2] revealed that, while on x64 platforms we can rely on the
first element of an array T[] being 64-bit aligned for any T, that is
not the case on x86. The issue has to do with how array object headers
are laid out. The layout of a Java array is typically defined as follows
(see [3]):
1. 4-byte mark
2. 4/8-byte class pointer
3. 4-byte length
4. optional padding
5. elements
Now, on x64 platforms, the class pointer in (2) is typically 64 bits.
This means that the header part of an array is 16 bytes in total, which
in turn means that the first element of the array is always 8-byte
aligned (because all heap objects are at least 8-byte aligned,
regardless of platform, see [4]).
What about x86? Well, on x86 the class pointer is only 4 bytes, which
means the header is 12 bytes. This gives a 32-bit VM more options: if
the array is an int[], a VM might just store the first element at offset
12, as that offset is 4-byte aligned. But if the array is a long[], the VM
needs to insert some padding, so that the first element of the array
will be at least 8-byte aligned (otherwise atomic operations will fail).
This logic is reflected in [5], where the VM makes sure that for long[]
and double[], elements are always (regardless of 32 vs. 64 bits) stored
at offsets that are 64-bit aligned.
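
To make the arithmetic above concrete, here is a minimal sketch of the
header layout as described in this mail. It is a toy model, not a query
against a real JVM: the class name, the method names and the rule that
only 8-byte elements get padding are assumptions made for illustration.

```java
// Illustrative model only; sizes follow the list above, not a real JVM query.
final class ArrayHeaderModel {
    static final int MARK_BYTES = 4;    // item 1 in the list above
    static final int LENGTH_BYTES = 4;  // item 3 in the list above

    /** Rounds value up to the next multiple of alignment (a power of two). */
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & -alignment;
    }

    /**
     * Offset of element 0 from the start of the array object.
     * Padding is modelled only for 8-byte elements (long[], double[]).
     */
    static long firstElementOffset(int classPointerBytes, int elementBytes) {
        long header = MARK_BYTES + classPointerBytes + LENGTH_BYTES;
        return elementBytes == 8 ? alignUp(header, 8) : header;
    }

    public static void main(String[] args) {
        System.out.println("x64 byte[]:   " + firstElementOffset(8, 1)); // 16
        System.out.println("x86 int[]:    " + firstElementOffset(4, 4)); // 12
        System.out.println("x86 double[]: " + firstElementOffset(4, 8)); // 16
        System.out.println("x86 byte[]:   " + firstElementOffset(4, 1)); // 12
    }
}
```

In this model an x86 byte[] starts its elements at offset 12, which is
4-byte aligned but not 8-byte aligned, while a double[] on the same VM
starts at offset 16.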
This obviously creates an asymmetry: we could create a memory segment
backed by a double[], copy its elements into a segment backed by a
byte[], and then try to retrieve 64-bit aligned double values from the
second memory segment. This operation will succeed on 64-bit platforms
(as byte[] and double[] have the same alignment constraints there), but
will fail spuriously on x86 platforms. But this is not just about 32-bit
vs. 64-bit: other VM implementations might have a different opinion on
what the alignment of array elements should be, and enhancements such as
those proposed by Project Lilliput [6] can have profound implications in
this area.
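
The asymmetry can be sketched with a toy address-based check. This is
not the memory segment API: the class, the methods, the object address
and the element base offsets (16 for x64, 12 for x86, taken from the
header sizes discussed above) are all assumptions for illustration.

```java
// Toy model of an address-based alignment check on a byte[]-backed segment.
final class NaiveHeapAlignmentCheck {
    /** True if address is a multiple of alignment (a power of two). */
    static boolean isAligned(long address, long alignment) {
        return (address & (alignment - 1)) == 0;
    }

    /**
     * Would an 8-byte-aligned double read at byteOffset pass the check?
     * objectAddress: hypothetical 8-byte-aligned address of the array object.
     * elementBase:   assumed offset of element 0 (16 on x64, 12 on x86).
     */
    static boolean canReadAlignedDouble(long objectAddress, long elementBase,
                                        long byteOffset) {
        return isAligned(objectAddress + elementBase + byteOffset, Double.BYTES);
    }

    public static void main(String[] args) {
        long obj = 0x1000; // hypothetical object address, 8-byte aligned
        System.out.println(canReadAlignedDouble(obj, 16, 0)); // true  (x64 model)
        System.out.println(canReadAlignedDouble(obj, 12, 0)); // false (x86 model)
    }
}
```

The same read at the same segment offset passes under one set of layout
assumptions and fails under the other, which is exactly the kind of
platform-dependent outcome described above.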
Where does this leave us? Checking for alignment is definitely useful
to prevent bugs - but the simple check carried out by the memory segment
API ends up *leaking* implementation decisions as to how array elements
are laid out. Ideally we'd like to have an API whose failures are
predictable, so the status quo isn't great. Note that the real issue
here is not whether layout constants should be aligned or not - or what
their alignment (if any) should be. The real issue is that the memory
segment API does not enforce alignment in all situations, especially
around memory copy. It is in fact possible to copy elements from a
segment backed by an array that has _more_ alignment constraints into
a segment backed by an array that has _fewer_ alignment constraints, w/o
errors, which is a potential source of (alignment) bugs.
We believe (thanks John!) we have a story to generalize the alignment
checks to heap segments, in a way that no implementation-dependent
information is leaked - the basic idea is to observe that native
segments and heap segments are different beasts: when working with a
native segment we can always know the alignment properties of any
address inside that segment (the alignment is a property of the bit
pattern of that address - e.g. how many zeros appear at the end of the
address). But heap segment addresses are *virtualized* - so there is
nothing for the API to check (e.g. heap segments do not have a base
address, so to speak). In order to have reliable alignment checks which
work on both native and heap segments, our API should assume that memory
addresses produced by a heap segment can never be more aligned than
the element size of the Java array backing that heap segment. This means
that if we have, for instance, a segment backed by a short[], the
*maximum alignment* constraint supported by this segment is 2. If we try
to store an aligned int inside this segment, an error should occur
(whether the store occurs as a result of dereference, or bulk-copy), as
there is no guarantee that this operation is well-defined across all
platforms.
Conversely, a native segment has _no_ maximum alignment.
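
As a rough illustration of the rule above, here is a minimal sketch of
what such a check could look like. It is not the actual implementation:
the class and method names are hypothetical, and the heap check assumes
the offset is measured in bytes from the first array element.

```java
// Sketch of the proposed rule; names and signatures are hypothetical.
final class AlignmentCheckSketch {
    /** Native segment: alignment is a property of the address bit pattern. */
    static boolean isNativeAligned(long address, long alignment) {
        return (address & (alignment - 1)) == 0;
    }

    /**
     * Heap segment: the requested alignment may never exceed the element
     * size of the backing array, whatever the VM's actual array layout is.
     * offset is in bytes from the first element of the backing array.
     */
    static boolean isHeapAligned(long offset, long alignment, long elementSize) {
        return alignment <= elementSize && (offset & (alignment - 1)) == 0;
    }

    public static void main(String[] args) {
        // short[]-backed segment: maximum alignment is 2.
        System.out.println(isHeapAligned(0, 4, Short.BYTES)); // false: aligned int
        System.out.println(isHeapAligned(2, 2, Short.BYTES)); // true: aligned short
        System.out.println(isHeapAligned(3, 1, Short.BYTES)); // true: unaligned access
        // Native segment: count the trailing zeros of the address.
        System.out.println(Long.numberOfTrailingZeros(0x1004L)); // 2
        System.out.println(isNativeAligned(0x1004L, 8));         // false
    }
}
```

Note that the heap check never looks at a real address, so its outcome
is the same on every platform and for every VM configuration.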
This strategy allows the API to implement alignment checks on heap
segments in a predictable fashion, so that the outcome of an alignment
check does not depend on the assumptions of a particular architecture,
or on the set of enabled VM features. When this underlying issue is
fixed, we can then have a discussion as to whether layout constants in
ValueLayout should be aligned-by-default or not. Having aligned layout
constants might be useful to prevent bugs, but will limit the
flexibility of the API. But that's a decision for another day.
[1] - https://git.openjdk.java.net/jdk/pull/6589
[2] - https://bugs.openjdk.java.net/browse/JDK-8255343
[3] -
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/arrayOop.hpp#L35
[4] -
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/globals.hpp#L132
[5] -
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/arrayOop.hpp#L70
[6] - https://openjdk.java.net/projects/lilliput/
Maurizio
On 25/11/2021 15:52, Maurizio Cimadamore wrote:
> Hi,
> This is a follow-up to the discussion that started in [1]. In the new
> changes slated for Java 18, the set of Java value layout constants are
> all byte-aligned (e.g. alignment constraints are not set). The
> motivation for this is mostly historical (but there's also a
> performance twist, see below): the dereference primitives in
> MemoryAccess used to set up var handles based on non-aligned layouts.
> So, to preserve compatibility with what we had before, we opted to
> "relax" alignment constraints on the JAVA_XYZ layout constants in
> ValueLayout. During the development of the new dereference API, some
> issues arose around alignment checks and memory copy [2], which also
> helped consolidate the feeling that Java layout constants
> should be unaligned.
>
> Now, while it's always possible, for clients, to go back to the
> desired alignment constraints (e.g. by defining custom layout
> constants), from the discussion it emerged that it can be somewhat
> confusing/surprising having a layout constant called JAVA_INT, whose
> alignment is not the VM alignment for a Java int value.
>
> For this reason, I'd like to propose a small tweak, which would
> essentially revert alignment constraints for Java layout constants to
> what they were in 17. In other words, let's keep the "good" JAVA_XYZ
> names for the _true_ Java layouts (including alignment as seen by VM).
> If clients want to create unaligned constants they can do so, as they
> can also create big-endian constants where needed. In the majority of
> cases, since access will be aligned (for performance reasons), this
> will not really change much for clients. But some of those clients
> that need to pack data structures more (Lucene?) will need to define
> their own packed/unaligned layout constants.
>
> Does that seem like an acceptable compromise?
>
> A patch for these changes is available here:
>
> https://github.com/mcimadamore/jdk/tree/value_layout_align
>
> While testing it, I was reminded (once more) that access with
> alignment constraints is currently slower than access w/o alignment
> constraints - which has to do with C2 not hoisting alignment checks in
> cases like this:
>
> ((segmentBaseAddress + accessedOffset) & alignmentMask) == 0
>
> Here, segmentBaseAddress is a loop invariant, and the accessedOffset
> depends on the loop variable. So, it is in principle possible for the
> VM to hoist the check for segmentBaseAddress and to eliminate the alignment
> check for the offset (which would come from BCE analysis). But this is
> not how things work today. The patch works around this, by using
> different var handles for when the accessed offset is provably aligned
> (e.g. when using the getAtIndex/setAtIndex APIs). Even with those
> workarounds, calling getAtIndex/setAtIndex on a MemoryAddress is still
> slower than on a MemorySegment, because of the way in which we try to
> work around the long loop optimization problem. Luckily a fix for that
> problem [3] has been integrated in JDK 18, which means we will remove
> these implementation workarounds, which will help make performance
> more stable across the board.
>
> If the changes in this patch seem good, I'm happy to try and integrate
> this into 18.
>
> Cheers
> Maurizio
>
> [1] -
> https://mail.openjdk.java.net/pipermail/panama-dev/2021-November/015805.html
> [2] -
> https://github.com/openjdk/panama-foreign/pull/555#issuecomment-865115787
> [3] - https://github.com/openjdk/jdk/pull/2045
>
>
>