[master] RFR: JDK-8307737: [Lilliput] Reduce number of loads used for Klass decoding in static code [v3]
Thomas Stuefe
stuefe at openjdk.org
Wed May 10 11:33:38 UTC 2023
> Klass decode depends on several runtime parameters. In 64-bit header mode, these are:
>
> - UseCompactObjectHeaders
> - Encoding Base
> - Encoding Shift
>
> In Legacy header mode, these are:
>
> - UseCompactObjectHeaders
> - UseCompressedClassPointers
> - Encoding Base
> - Encoding Shift
>
> These values are stored at distinct locations, so decoding requires three loads in 64-bit header mode and four loads in Legacy header mode (see disassembly [1]). Unfortunately, Legacy mode is more expensive with Lilliput, since we now need to load two switches.
>
> I want to minimize the number of loads. There are several ways to do this, but the simplest is to use a denser in-memory representation of these values. We always load them together anyway.
>
> All four values (UseCompactObjectHeaders, UseCompressedClassPointers, Encoding Base and Encoding Shift) can be encoded in a single 64-bit value. The encoding base will always be page-aligned. That leaves an alignment shadow of 12 bits in which to store the remaining information. UseCompactObjectHeaders and UseCompressedClassPointers can each be represented by a single bit. The encoding shift will not be larger than 31, so we can store it in 5 bits.
>
> The result is that the three or four loads can be folded into a single 64-bit load without too much trouble.
>
> Alternatives.
>
> - Generating a stub routine is not an option if one wants to keep decoding inlined.
> - We could generate different variants of the decoding routines via templates, parameterized for each permutation of (shift, UseCompressedClassPointers, UseCompactObjectHeaders) and the most common encoding base. But:
>   - we would need a different solution for uncommon base addresses
>   - we may want to make the shift an adjustable runtime parameter
>   - we would need to decide, at runtime, which code variant to use. That again introduces a runtime switch we need to load from memory, so nothing is gained compared to the proposed solution.
>
> ----------------
>
> The patch changes the static helper class `CompressedKlassPointers` to store the encoding base and shift in a denser format. The format also contains copies of UseCompressedClassPointers and UseCompactObjectHeaders:
>
>
> Bit#
> 0: UseCompactObjectHeaders
> 1: UseCompressedClassPointers
> 2-6: Encoding shift (5 bits)
> 7-11: Unused
> 12-63: Encoding Base address
>
>
> The patch then changes some frequently used oop methods to use the dense representation of UseCompactObjectHeaders/UseCompressedClassPointers.
>
> The result: we now only need one load for Klass decoding/encoding (see disassembly [2]).
>
> -----------------------
>
> [1] https://bugs.openjdk.org/secure/attachment/103772/KlassExtraction.txt
> [2] https://bugs.openjdk.org/secure/attachment/103773/KlassExtraction-patched.txt
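The bit layout quoted above can be illustrated with a small standalone sketch. This is not the code from the patch: the namespace, function names and constants below are hypothetical, and the sketch assumes a 4 KiB page size (12 alignment bits) and a 32-bit narrow Klass value. It only shows how one 64-bit word could carry both switches, the shift and the base, so that a single load is enough to decode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of the packed layout; names are not from the patch.
namespace sketch {

constexpr int      kCompactHeadersBit  = 0;                  // bit 0
constexpr int      kCompressedClassBit = 1;                  // bit 1
constexpr int      kShiftPos           = 2;                  // bits 2-6
constexpr uint64_t kShiftMask          = 0x1F;               // 5 bits
constexpr uint64_t kBaseMask           = ~uint64_t{0xFFF};   // bits 12-63

// Combine all four parameters into one 64-bit word.
inline uint64_t pack(uint64_t base, unsigned shift,
                     bool compact_headers, bool compressed_class_pointers) {
  assert((base & ~kBaseMask) == 0 && "base must be page-aligned (assumed 4 KiB)");
  assert(shift <= 31 && "shift must fit in 5 bits");
  return base
       | (static_cast<uint64_t>(shift) << kShiftPos)
       | (static_cast<uint64_t>(compressed_class_pointers) << kCompressedClassBit)
       | (static_cast<uint64_t>(compact_headers) << kCompactHeadersBit);
}

// Accessors: each one needs only the already-loaded packed word.
inline bool use_compact_headers(uint64_t packed) {
  return ((packed >> kCompactHeadersBit) & 1) != 0;
}
inline bool use_compressed_class_pointers(uint64_t packed) {
  return ((packed >> kCompressedClassBit) & 1) != 0;
}
inline unsigned shift(uint64_t packed) {
  return static_cast<unsigned>((packed >> kShiftPos) & kShiftMask);
}
inline uint64_t base(uint64_t packed) {
  return packed & kBaseMask;
}

// Decode a narrow Klass value as base + (narrow << shift).
inline uint64_t decode(uint64_t packed, uint32_t narrow_klass) {
  return base(packed) + (static_cast<uint64_t>(narrow_klass) << shift(packed));
}

} // namespace sketch
```

For example, packing a base of 0x800000000 with shift 3 and both switches enabled yields 0x80000000F, and decoding the narrow value 0x10 against that word gives 0x800000080.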
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
feedback roman
-------------
Changes:
- all: https://git.openjdk.org/lilliput/pull/92/files
- new: https://git.openjdk.org/lilliput/pull/92/files/a2f0fd99..59a2c191
Webrevs:
- full: https://webrevs.openjdk.org/?repo=lilliput&pr=92&range=02
- incr: https://webrevs.openjdk.org/?repo=lilliput&pr=92&range=01-02
Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod
Patch: https://git.openjdk.org/lilliput/pull/92.diff
Fetch: git fetch https://git.openjdk.org/lilliput.git pull/92/head:pull/92
PR: https://git.openjdk.org/lilliput/pull/92