RFR: 7903912: JMH: Pad using a mix of bytes and longs
Claes Redestad
redestad at openjdk.org
Mon Dec 16 12:10:01 UTC 2024
On Sun, 15 Dec 2024 18:51:54 GMT, Claes Redestad <redestad at openjdk.org> wrote:
> Using a mix of bytes and longs we should get the gap-filling effect described in https://shipilev.net/jvm/objects-inside-out/#_observation_hierarchy_tower_padding_trick_collapse_in_jdk_15 while reducing size of generated code by roughly 20%:
>
>
> 864772 jmh-samples/target/jmh-samples-1.38-SNAPSHOT.jar
> 490531 jmh-samples/target/jmh-samples-1.38-SNAPSHOT-sources.jar
>
> 684383 jmh-samples/target/jmh-samples-1.38-SNAPSHOT.jar
> 399758 jmh-samples/target/jmh-samples-1.38-SNAPSHOT-sources.jar
Checking locally with JOL internals on my M1, the B2 fields are laid out at the exact same offset (280). The only discernible difference is that the `byte` and `long` padding fields of the B1 type are laid out in reverse order.
After:
240 8 long JMHSample_39_MemoryAccess_jmhType_B1.b1_2b 0
248 8 long JMHSample_39_MemoryAccess_jmhType_B1.b1_2c 0
256 8 long JMHSample_39_MemoryAccess_jmhType_B1.b1_2d 0
264 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_00 0
265 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_01 0
266 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_02 0
267 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_03 0
268 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_04 0
269 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_05 0
270 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_06 0
271 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_07 0
272 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_08 0
273 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_09 0
274 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0a 0
275 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0b 0
276 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0c 0
277 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0d 0
278 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0e 0
279 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_0f 0
280 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupTrialMutex 0
284 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearTrialMutex 0
288 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupIterationMutex 0
292 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearIterationMutex 0
296 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupInvocationMutex 0
300 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearInvocationMutex 0
304 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyTrial false
305 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyIteration false
306 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyInvocation false
307 1 byte JMHSample_39_MemoryAccess_jmhType.b3_00 0
308 1 byte JMHSample_39_MemoryAccess_jmhType.b3_01 0
309 1 byte JMHSample_39_MemoryAccess_jmhType.b3_02 0
310 1 byte JMHSample_39_MemoryAccess_jmhType.b3_03 0
311 1 byte JMHSample_39_MemoryAccess_jmhType.b3_04 0
Before:
276 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_252 0
277 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_253 0
278 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_254 0
279 1 byte JMHSample_39_MemoryAccess_jmhType_B1.b1_255 0
280 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupTrialMutex 0
284 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearTrialMutex 0
288 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupIterationMutex 0
292 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearIterationMutex 0
296 4 int JMHSample_39_MemoryAccess_jmhType_B2.setupInvocationMutex 0
300 4 int JMHSample_39_MemoryAccess_jmhType_B2.tearInvocationMutex 0
304 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyTrial false
305 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyIteration false
306 1 boolean JMHSample_39_MemoryAccess_jmhType_B2.readyInvocation false
307 1 byte JMHSample_39_MemoryAccess_jmhType.b3_000 0
308 1 byte JMHSample_39_MemoryAccess_jmhType.b3_001 0
309 1 byte JMHSample_39_MemoryAccess_jmhType.b3_002 0
310 1 byte JMHSample_39_MemoryAccess_jmhType.b3_003 0
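For readers who have not seen the generated classes, the listings above come from a padded class hierarchy. A minimal sketch of that shape follows. The class names `PadBefore`, `StateFields`, `PadAfter` and `LayoutSketch` are hypothetical stand-ins, not the names JMH generates, and the state fields are a shortened subset. The sketch only counts declared padding bytes by reflection; it does not measure real offsets, which is what JOL is for.

```java
import java.lang.reflect.Field;

// Hypothetical helper class, not part of JMH or JOL.
class LayoutSketch {
    // Sums the nominal size of the byte and long fields declared directly in `type`.
    // This is declared padding only; actual offsets are decided by the JVM.
    static int paddingBytes(Class<?> type) {
        int total = 0;
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) continue;          // ignore tool-injected fields
            if (f.getType() == byte.class) total += 1;
            else if (f.getType() == long.class) total += 8;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("leading padding:  " + paddingBytes(PadBefore.class));
        System.out.println("trailing padding: " + paddingBytes(PadAfter.class));
    }
}

// Stand-in for the B1 role: 16 bytes + 30 longs = 16 + 30 * 8 = 256 bytes.
class PadBefore {
    byte b1_00, b1_01, b1_02, b1_03, b1_04, b1_05, b1_06, b1_07, b1_08, b1_09, b1_0a, b1_0b, b1_0c, b1_0d, b1_0e, b1_0f;
    long b1_10, b1_11, b1_12, b1_13, b1_14, b1_15, b1_16, b1_17, b1_18, b1_19, b1_1a, b1_1b, b1_1c, b1_1d, b1_1e, b1_1f;
    long b1_20, b1_21, b1_22, b1_23, b1_24, b1_25, b1_26, b1_27, b1_28, b1_29, b1_2a, b1_2b, b1_2c, b1_2d;
}

// Stand-in for the B2 role: the fields that the padding is meant to isolate.
class StateFields extends PadBefore {
    int setupTrialMutex, tearTrialMutex;
    boolean readyTrial;
}

// Stand-in for the trailing padding that shows up as b3_* in the listings.
class PadAfter extends StateFields {
    byte b3_00, b3_01, b3_02, b3_03, b3_04, b3_05, b3_06, b3_07, b3_08, b3_09, b3_0a, b3_0b, b3_0c, b3_0d, b3_0e, b3_0f;
    long b3_10, b3_11, b3_12, b3_13, b3_14, b3_15, b3_16, b3_17, b3_18, b3_19, b3_1a, b3_1b, b3_1c, b3_1d, b3_1e, b3_1f;
    long b3_20, b3_21, b3_22, b3_23, b3_24, b3_25, b3_26, b3_27, b3_28, b3_29, b3_2a, b3_2b, b3_2c, b3_2d;
}
```

Because the padding lives in a superclass and a subclass of the class holding the state fields, the state fields end up between the two padding regions, as in the offsets shown above. The order of fields within one class is still up to the JVM.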
Note: while using hexadecimal names has a size benefit, I mainly chose it because it makes it easier to check that the generated code is correct: everything aligns when declaring 16 variables per line, e.g.:
byte p00, p01, p02, p03, p04, p05, p06, p07, p08, p09, p0a, p0b, p0c, p0d, p0e, p0f;
long p10, p11, p12, p13, p14, p15, p16, p17, p18, p19, p1a, p1b, p1c, p1d, p1e, p1f;
long p20, p21, p22, p23, p24, p25, p26, p27, p28, p29, p2a, p2b, p2c, p2d;
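The three declaration lines above add up to 16 + 30 * 8 = 256 bytes, the same amount as the previous 256 single-byte fields. A minimal sketch of how such lines could be produced is below. `PaddingSketch`, `line` and `padding` are hypothetical names for illustration and are not JMH's actual generator code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the generator used in JMH.
class PaddingSketch {
    // Builds one declaration of `count` fields of `type`,
    // named prefix + two-digit lowercase hex index starting at `start`.
    static String line(String type, String prefix, int start, int count) {
        StringBuilder sb = new StringBuilder(type).append(' ');
        for (int i = 0; i < count; i++) {
            if (i > 0) sb.append(", ");
            sb.append(prefix).append(String.format("%02x", start + i));
        }
        return sb.append(';').toString();
    }

    // 16 byte fields, then 16 + 14 long fields: 16 + 30 * 8 = 256 bytes in total.
    static List<String> padding(String prefix) {
        List<String> lines = new ArrayList<>();
        lines.add(line("byte", prefix, 0x00, 16));
        lines.add(line("long", prefix, 0x10, 16));
        lines.add(line("long", prefix, 0x20, 14));
        return lines;
    }

    public static void main(String[] args) {
        padding("p").forEach(System.out::println);
    }
}
```

With two hex digits and 16 names per line, the first digit of every name on a line is the same, which is what makes a miscounted line easy to spot by eye.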
Ok! Filed (and linked this to) CODETOOLS-7903912
Updated the padding in the infra classes and will mark this as ready for review after at least getting it through integration tests locally.
-------------
PR Comment: https://git.openjdk.org/jmh/pull/147#issuecomment-2544010640
PR Comment: https://git.openjdk.org/jmh/pull/147#issuecomment-2545361150