RFR: 8338677: Improve startup of memory access var handles by simplifying combinator chains

Maurizio Cimadamore mcimadamore at openjdk.org
Tue Aug 20 18:16:13 UTC 2024


This PR reduces the number of lambda forms (LFs) created when generating var handles for simple struct field accessors. These excess LFs contribute to the startup regression seen in [JDK-8337505](https://bugs.openjdk.org/browse/JDK-8337505).

There are essentially three sources of excessive var handle adaptation:

1. `LayoutPath::dereferenceHandle` has to do some very complex adaptation (including a permute) in order to inject alignment and size checks (against the enclosing layout) on the generated var handle.
2. Even in simple cases (e.g. when there's no dynamic coordinate), the offset of the accessed field is added to the var handle via an expensive collect adapter.
3. When we adapt a `long` var handle to work on `MemorySegment` using an `AddressLayout`, we do not distinguish between address layouts that have a target layout and those that do not. In the latter case (common!) we can adapt more simply.

The meat of this PR is to address (1) by changing the shape of the generated helpers in the `X-VarHandleSegmentView.java.template` class. That is, the method for doing a plain get will now have the following shape:


    T get(MemorySegment segment, MemoryLayout enclosing, long base, long offset)


Where:
* `segment` is the segment being accessed
* `enclosing` is the enclosing layout (the root of the selected layout path) against which to check size and alignment
* `base` is the public-facing offset passed by the user when calling `get` on the var handle
* `offset` is the offset at which the selected layout element can be found from the root (this can be replaced with an expression that takes several dynamic indices and turns them into a single offset)
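To make the last bullet concrete: when the layout path contains dynamic indices, `offset` is no longer a constant but a function of those indices. The following is a minimal sketch of such an expression, assuming each index contributes `index * stride` on top of a fixed offset. The class and method names are hypothetical and are not the ones used in the JDK.

```java
public class OffsetSketch {
    // Folds dynamic indices into a single byte offset:
    // fixedOffset + sum(indices[i] * strides[i]), with overflow checks.
    public static long offset(long fixedOffset, long[] strides, long... indices) {
        if (strides.length != indices.length) {
            throw new IllegalArgumentException("expected one index per stride");
        }
        long result = fixedOffset;
        for (int i = 0; i < indices.length; i++) {
            result = Math.addExact(result, Math.multiplyExact(indices[i], strides[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical layout: a sequence of 16-byte structs, selected field at byte 4.
        // Element 3 therefore lives at 4 + 3 * 16.
        System.out.println(offset(4, new long[] { 16 }, 3));
    }
}
```

With no dynamic indices the expression degenerates to the fixed offset, which is the simple struct field case this PR targets.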

With this organization, it is easy to see how, in order to create a memory access var handle for a struct field `S.f` we only need to:
* inject the enclosing layout `S` into the var handle (into the `enclosing` coordinate)
* inject the offset of `S.f` into the var handle (into the `offset` coordinate)

This way, we get our plain old memory access var handle featuring only two coordinates: a segment and an offset. Note how, to get there, we only needed very simple adaptations (e.g. `MethodHandles::insertCoordinates`).
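The binding step can be illustrated with a small model that runs without the FFM API. This is only a sketch under loud assumptions: `ByteBuffer` stands in for `MemorySegment`, the hypothetical `Enclosing` class stands in for `MemoryLayout`, and `MethodHandles::insertArguments` (the method handle counterpart of `insertCoordinates`) stands in for the var handle adaptation. None of these names come from the actual template.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.nio.ByteBuffer;

public class BindSketch {
    // Hypothetical stand-in for MemoryLayout: just a size and an alignment, in bytes.
    public static final class Enclosing {
        final long byteSize;
        final long byteAlignment;

        public Enclosing(long byteSize, long byteAlignment) {
            this.byteSize = byteSize;
            this.byteAlignment = byteAlignment;
        }
    }

    // Same argument order as the helper shown earlier: (segment, enclosing, base, offset).
    public static int get(ByteBuffer segment, Enclosing enclosing, long base, long offset) {
        // The checks are made against the whole enclosing layout, not just the field.
        if (base < 0 || base > segment.capacity() - enclosing.byteSize) {
            throw new IndexOutOfBoundsException("enclosing layout does not fit at base " + base);
        }
        if (base % enclosing.byteAlignment != 0) {
            throw new IllegalArgumentException("misaligned base offset: " + base);
        }
        return segment.getInt(Math.toIntExact(base + offset));
    }

    // Binds 'enclosing' and 'offset', leaving a handle of type (ByteBuffer, long) -> int.
    public static MethodHandle fieldGetter(Enclosing enclosing, long fieldOffset)
            throws ReflectiveOperationException {
        MethodHandle helper = MethodHandles.lookup().findStatic(BindSketch.class, "get",
                MethodType.methodType(int.class, ByteBuffer.class, Enclosing.class,
                        long.class, long.class));
        // Bind the trailing offset first so that position 1 still names 'enclosing'.
        helper = MethodHandles.insertArguments(helper, 3, fieldOffset);
        return MethodHandles.insertArguments(helper, 1, enclosing);
    }

    public static void main(String[] args) throws Throwable {
        // Hypothetical struct S { int pad; int f; }: 8 bytes, 4-byte aligned, f at offset 4.
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(12, 42); // field f of the struct that starts at base 8
        MethodHandle getF = fieldGetter(new Enclosing(8, 4), 4);
        int value = (int) getF.invokeExact(buf, 8L);
        System.out.println(value);
    }
}
```

Both bindings are plain argument insertions: no permutation or collect step is involved, which is the property that keeps the number of generated LFs low.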

#### Evaluation

I did some tests using the benchmark in [JDK-8337505](https://bugs.openjdk.org/browse/JDK-8337505) to assess the impact of this change on startup. To evaluate startup, I ran the benchmark 50 times and then took some stats. Here's what the numbers look like before this change (AVG = average, MED = median):


    AVG        0.196ms
    MED        0.198ms
    MAX        0.201ms
    MIN        0.186ms


And here's after this change:


    AVG        0.179ms
    MED        0.180ms
    MAX        0.183ms
    MIN        0.174ms


This is a speedup of roughly 9% on both average and median. The number of generated LFs for this test went from 99 to 67 (we're at the point where most LFs are from static initializers in the `LayoutPath` and `Utils` classes).
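For anyone wanting to reproduce this kind of summary, the four statistics can be derived from the raw per-run timings as sketched below. The class name is hypothetical and the sample values in `main` are made-up placeholders, not the measurements reported above.

```java
import java.util.Arrays;

public class StartupStats {
    public static double average(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    public static double median(double[] samples) {
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // For an even number of samples, take the mean of the two middle values.
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static double max(double[] samples) {
        return Arrays.stream(samples).max().orElse(Double.NaN);
    }

    public static double min(double[] samples) {
        return Arrays.stream(samples).min().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Placeholder timings for illustration only.
        double[] runs = { 3.0, 1.0, 2.0, 4.0 };
        System.out.printf("AVG %.3f%nMED %.3f%nMAX %.3f%nMIN %.3f%n",
                average(runs), median(runs), max(runs), min(runs));
    }
}
```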

I also ran all the memory benchmarks whose names start with `LoopOver` before and after the change, and verified that there is no unwanted change in peak performance.

#### Future work

There's more work to do here. One possible option is to tweak the template further to also generate variants for `MemorySegment` and `boolean`, so that no adaptation is required in those cases. Some preliminary examples seem to show another 10ms gain with this approach.

Another option would be to add some FFM code to the `HelloClasslist` class, so that some of the generated classes can be optimized at link-time. This also seems to yield another 10ms gain (I have not tried to see if this adds up with the gain in the previously described approach, but I would say probably not - at least not fully).

Many thanks to @cl4es for the invaluable help and moral support :-)

-------------

Commit messages:
 - Fix code breakage after refatcor
 - Consoldiate and share code
 - Improve adaptation of address handles
 - Simplify address adaptation
 - Initial push

Changes: https://git.openjdk.org/jdk/pull/20647/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20647&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8338677
  Stats: 181 lines in 3 files changed: 31 ins; 11 del; 139 mod
  Patch: https://git.openjdk.org/jdk/pull/20647.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20647/head:pull/20647

PR: https://git.openjdk.org/jdk/pull/20647