RFR: 8331865: Consolidate size and alignment checks in LayoutPath
Maurizio Cimadamore
mcimadamore at openjdk.org
Thu May 16 11:03:30 UTC 2024
When creating a nested memory access var handle, we ensure that the segment is accessed at the correct alignment for the root layout being accessed. But we do not ensure that the segment has at least the size of the accessed root layout. Example:
MemoryLayout LAYOUT = sequenceLayout(2, structLayout(JAVA_INT.withName("x"), JAVA_INT.withName("y")));
VarHandle X_HANDLE = LAYOUT.varHandle(PathElement.sequenceElement(), PathElement.groupElement("x"));
If I have a memory segment whose size is 12, I can successfully call X_HANDLE on it with index 1, like so:
MemorySegment segment = Arena.ofAuto().allocate(12);
int x = (int)X_HANDLE.get(segment, 0, 1);
This seems incorrect: after all, the provided segment doesn't have enough space to fit _two_ elements of the nested structs.
This PR adds checks to detect this kind of scenario earlier, thus improving usability. To achieve this we could, in principle, just tweak `LayoutPath::checkEnclosingLayout` and add the required size check.
But doing so will add yet another redundant check on top of the other already added by [pull/19124](https://git.openjdk.org/jdk/pull/19124). Instead, this PR rethinks how memory segment var handles are created, in order to eliminate redundant checks.
The main observation is that size and alignment checks depend on an _enclosing_ layout. This layout *might* be the very same value layout being accessed (this is the case when e.g. using `JAVA_INT.varHandle()`). But, in the general case, the accessed value layout and the enclosing layout might differ (e.g. think about accessing a struct field).
Furthermore, the enclosing layout check depends on the _base_ offset at which the segment is accessed, _prior_ to any index computation that occurs if the accessed layout path has any open elements. It is important to notice that this base offset is only available when looking at the var handle that is returned to the user. For instance, an indexed var handle with coordinates:
(MemorySegment, long /* base */, long /* index 1 */, long /* index 2 */, long /* index 3 */)
Is obtained through adaptation, by taking a more basic var handle of the form:
(MemorySegment, long /* offset */)
And then injecting the result of the index multiplication into `offset`. As such, we can't add an enclosing layout check inside the var handle returned by `VarHandles::memorySegmentViewHandle`, as doing so will end up seeing the *wrong* offset (e.g. an offset in which the index part has already been mixed in).
The only possibility then, is to remove size and alignment checks from the *raw* var handles returned by `VarHandles::memorySegmentViewHandle`, and perform such checks outside (e.g. in `LayoutPath::dereferenceHandle`). The only checks left in the raw var handles are:
* an alignment check to make sure that the access mode selected by the client is really available - this alignment check is against the alignment of the value layout being selected, and not the enclosing layout alignment (which might be stricter)
* a read-only check, to make sure that write access modes are blocked if the accessed segment is read-only
* liveness/confinement checks, as mandated by ScopedMemoryAccess
Since these check depends on the particular access mode selected by the client, we can't move these checks away from the raw var handle.
These changes come with some consequences:
* Now it is always necessary to adapt a raw var handle, and to insert appropriate size and alignment checks (otherwise OOB access might be possible). As such, `ValueLayouts` also need to call the path layout variant of `MemoryLayout::varHandle`, to make sure that the raw var handle is adapted accordingly, before it is cached in its stable field.
* The var handle cache has been moved from `Utils` to `ValueLayouts::varHandle`. The cache is used (a) to reduce the number of var handle instances created and (b) to make sure that the cached var handle in the `ValueLayout` has stable identity. With respect to (a), while it makes sense to cache "toplevel" var handles (e.g. `JAVA_INT.varHandle()`) the cost/benefit ratio for caching nested var handles seem more unfavourable, as the same var handle can be reused with different enclosing layouts, leading to different checks. Ultimately, all nested var handles will require some kind of adaptation, so it doesn't seem too crucial to have a deeper level of caching here.
* The order in which exceptions are thrown might be slightly different. This is because the size/alignment checks now take place _before_ the raw var handle is even called. This caused a bunch of small updates in code and tests.
* It used to be possible to create a sequence layout with maximal size, like `MemoryLayout.sequenceLayout(Long.MAX_VALUE, JAVA_LONG)`, and derive an indexed var handle from that layout. Since there used to be no enclosing layout size check, access to the sequence element was allowed, as long as the index computation did not result in an offset outside the boundary of the accessed memory segment. This is now no longer the case: when selecting an element from the above layout, the implementation will make sure that the accessed segment indeed has the size of that sequence layout (which will probably lead to a `IndexOutOfBoundException`). To do indexed access on an unbounded sequence, the `MemoryLayout::arrayElementVarHandle` should be used instead (but this is the norm anyway for such cases).
-------------
Commit messages:
- Revert commented lines in test
- Simplify var handle code
- Revert test changes
- Cleanup spurious changes
- Only use nested var handles
- Add tests for nested layout size check
- Drop spurious changes
- cleanup javadoc
- Fix benchmark
- Cache both nested and toplevel var handles
- ... and 7 more: https://git.openjdk.org/jdk/compare/61aff6db...236007c3
Changes: https://git.openjdk.org/jdk/pull/19251/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19251&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8331865
Stats: 199 lines in 12 files changed: 73 ins; 37 del; 89 mod
Patch: https://git.openjdk.org/jdk/pull/19251.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19251/head:pull/19251
PR: https://git.openjdk.org/jdk/pull/19251
More information about the core-libs-dev
mailing list