RFR: 8353500: [s390x] Intrinsify Unsafe::setMemory [v2]
Amit Kumar
amitkumar at openjdk.org
Thu Apr 17 10:28:40 UTC 2025
On Wed, 9 Apr 2025 08:57:40 GMT, Amit Kumar <amitkumar at openjdk.org> wrote:
>> Unsafe::setMemory intrinsic implementation for s390x.
>>
>> Stub Code:
>>
>>
>> StubRoutines::unsafe_setmemory [0x000003ffb04b63c0, 0x000003ffb04b64d0] (272 bytes)
>> --------------------------------------------------------------------------------
>> 0x000003ffb04b63c0: ogrk %r1,%r2,%r3
>> 0x000003ffb04b63c4: nill %r1,7
>> 0x000003ffb04b63c8: je 0x000003ffb04b6410
>> 0x000003ffb04b63cc: nill %r1,3
>> 0x000003ffb04b63d0: je 0x000003ffb04b6460
>> 0x000003ffb04b63d4: nill %r1,1
>> 0x000003ffb04b63d8: jlh 0x000003ffb04b64a0
>> 0x000003ffb04b63dc: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b63e2: risbgz %r1,%r3,32,63,62
>> 0x000003ffb04b63e8: je 0x000003ffb04b6402
>> 0x000003ffb04b63ec: nopr
>> 0x000003ffb04b63ee: nopr
>> 0x000003ffb04b63f0: sth %r4,0(%r2)
>> 0x000003ffb04b63f4: sth %r4,2(%r2)
>> 0x000003ffb04b63f8: agfi %r2,4
>> 0x000003ffb04b63fe: brct %r1,0x000003ffb04b63f0
>> 0x000003ffb04b6402: nilf %r3,2
>> 0x000003ffb04b6408: ber %r14
>> 0x000003ffb04b640a: sth %r4,0(%r2)
>> 0x000003ffb04b640e: br %r14
>> 0x000003ffb04b6410: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b6416: risbg %r4,%r4,32,47,16
>> 0x000003ffb04b641c: risbg %r4,%r4,0,31,32
>> 0x000003ffb04b6422: risbgz %r1,%r3,32,63,60
>> 0x000003ffb04b6428: je 0x000003ffb04b6446
>> 0x000003ffb04b642c: nopr
>> 0x000003ffb04b642e: nopr
>> 0x000003ffb04b6430: stg %r4,0(%r2)
>> 0x000003ffb04b6436: stg %r4,8(%r2)
>> 0x000003ffb04b643c: agfi %r2,16
>> 0x000003ffb04b6442: brct %r1,0x000003ffb04b6430
>> 0x000003ffb04b6446: nilf %r3,8
>> 0x000003ffb04b644c: ber %r14
>> 0x000003ffb04b644e: stg %r4,0(%r2)
>> 0x000003ffb04b6454: br %r14
>> 0x000003ffb04b6456: nopr
>> 0x000003ffb04b6458: nopr
>> 0x000003ffb04b645a: nopr
>> 0x000003ffb04b645c: nopr
>> 0x000003ffb04b645e: nopr
>> 0x000003ffb04b6460: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b6466: risbg %r4,%r4,32,47,16
>> 0x000003ffb04b646c: risbgz %r1,%r3,32,63,61
>> 0x000003ffb04b6472: je 0x000003ffb04b6492
>> 0x000003ffb04b6476: nopr
>> 0x000003ffb04b6478: nopr
>> 0x000003ffb04b647a: nopr
>> 0x000003ffb04b647c: nopr
>> 0x000003ffb04b647e: nopr
>> 0x000003ffb04b6480: st %r4,0(%r2)
>> 0x000003ffb04b6484: st %r4,4(%r2)
>> 0x000003ffb04b6488: agfi %r2,8
>> 0x000003ffb04b648e: brct %r1,0x000003ffb04b6480
>> 0x000003ffb04b6492: nilf %r3,4
>> 0x000003ffb04b6498: ber %r14
>> 0x000003ffb04b649a: st %r4,0(%r2)
>> 0x0000...
>
> Amit Kumar has updated the pull request incrementally with four additional commits since the last revision:
>
> - reviews for Martin
> - Revert "minor improvement"
>
> This reverts commit a6af6da26d1e0590dc24486131d1bc752e047f98.
> - minor improvement
> - reviews from Lutz and Martin
These results are from a shared machine, but the regression appears to be fixed.
The regression occurred because, for the unaligned case, only 1-byte store instructions (`stc`) were being emitted. Alignment depends on two factors, the `size` and the address the value is stored to, so we can't always tell in advance whether a given benchmark run takes the aligned or the unaligned path.
I will do further testing to see whether more optimization is possible, and will then mark this PR ready for review.
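To illustrate why both the address and the size matter, here is a minimal Java sketch of the dispatch that the first instructions of the stub appear to perform (`ogrk` followed by the `nill` tests with masks 7, 3 and 1). The class and method names are hypothetical and exist only for this illustration; this is not the HotSpot stub generator code.

```java
public final class SetMemoryDispatch {
    private SetMemoryDispatch() {}

    /**
     * Returns the widest store width (in bytes) that suits both the
     * destination address and the fill size. OR-ing the two values means
     * a low bit set in either one rules out the wider store.
     */
    public static int storeWidth(long address, long size) {
        long bits = address | size;          // corresponds to: ogrk %r1,%r2,%r3
        if ((bits & 7) == 0) return 8;       // 8-byte path (stg)
        if ((bits & 3) == 0) return 4;       // 4-byte path (st)
        if ((bits & 1) == 0) return 2;       // 2-byte path (sth)
        return 1;                            // byte path (stc / mvc)
    }

    public static void main(String[] args) {
        // An 8-byte-aligned address still falls to the byte path when the size is odd.
        System.out.println(storeWidth(0x1000L, 64));
        System.out.println(storeWidth(0x1000L, 63));
        System.out.println(storeWidth(0x1004L, 64));
        System.out.println(storeWidth(0x1002L, 6));
    }
}
```

Under this reading, a benchmark run marked as aligned can still end up on the byte path whenever the size is odd, which would fit the observation above that the path taken cannot always be predicted from the `aligned` parameter alone.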
Benchmark (aligned) (size) Mode Cnt Score Error Units
MemorySegmentZeroUnsafe.panama true 1 avgt 30 2.893 ± 0.013 ns/op
MemorySegmentZeroUnsafe.panama true 2 avgt 30 3.122 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama true 3 avgt 30 3.286 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama true 4 avgt 30 3.401 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama true 5 avgt 30 3.291 ± 0.021 ns/op
MemorySegmentZeroUnsafe.panama true 6 avgt 30 3.455 ± 0.015 ns/op
MemorySegmentZeroUnsafe.panama true 7 avgt 30 3.471 ± 0.007 ns/op
MemorySegmentZeroUnsafe.panama true 8 avgt 30 3.215 ± 0.033 ns/op
MemorySegmentZeroUnsafe.panama true 15 avgt 30 4.632 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama true 16 avgt 30 3.815 ± 0.014 ns/op
MemorySegmentZeroUnsafe.panama true 63 avgt 30 9.695 ± 0.036 ns/op
MemorySegmentZeroUnsafe.panama true 64 avgt 30 5.296 ± 0.008 ns/op
MemorySegmentZeroUnsafe.panama true 255 avgt 30 9.682 ± 0.011 ns/op
MemorySegmentZeroUnsafe.panama true 256 avgt 30 9.508 ± 0.013 ns/op
MemorySegmentZeroUnsafe.panama false 1 avgt 30 2.887 ± 0.005 ns/op
MemorySegmentZeroUnsafe.panama false 2 avgt 30 3.134 ± 0.024 ns/op
MemorySegmentZeroUnsafe.panama false 3 avgt 30 3.285 ± 0.005 ns/op
MemorySegmentZeroUnsafe.panama false 4 avgt 30 3.397 ± 0.003 ns/op
MemorySegmentZeroUnsafe.panama false 5 avgt 30 3.297 ± 0.049 ns/op
MemorySegmentZeroUnsafe.panama false 6 avgt 30 3.445 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama false 7 avgt 30 3.471 ± 0.007 ns/op
MemorySegmentZeroUnsafe.panama false 8 avgt 30 3.204 ± 0.023 ns/op
MemorySegmentZeroUnsafe.panama false 15 avgt 30 4.630 ± 0.007 ns/op
MemorySegmentZeroUnsafe.panama false 16 avgt 30 3.811 ± 0.006 ns/op
MemorySegmentZeroUnsafe.panama false 63 avgt 30 9.676 ± 0.012 ns/op
MemorySegmentZeroUnsafe.panama false 64 avgt 30 9.690 ± 0.031 ns/op
MemorySegmentZeroUnsafe.panama false 255 avgt 30 9.678 ± 0.013 ns/op
MemorySegmentZeroUnsafe.panama false 256 avgt 30 4.180 ± 0.010 ns/op
MemorySegmentZeroUnsafe.unsafe true 1 avgt 30 2.636 ± 0.060 ns/op
MemorySegmentZeroUnsafe.unsafe true 2 avgt 30 2.379 ± 0.006 ns/op
MemorySegmentZeroUnsafe.unsafe true 3 avgt 30 7.743 ± 0.009 ns/op
MemorySegmentZeroUnsafe.unsafe true 4 avgt 30 2.531 ± 0.113 ns/op
MemorySegmentZeroUnsafe.unsafe true 5 avgt 30 7.746 ± 0.012 ns/op
MemorySegmentZeroUnsafe.unsafe true 6 avgt 30 3.183 ± 0.006 ns/op
MemorySegmentZeroUnsafe.unsafe true 7 avgt 30 7.742 ± 0.011 ns/op
MemorySegmentZeroUnsafe.unsafe true 8 avgt 30 2.580 ± 0.095 ns/op
MemorySegmentZeroUnsafe.unsafe true 15 avgt 30 7.870 ± 0.184 ns/op
MemorySegmentZeroUnsafe.unsafe true 16 avgt 30 2.523 ± 0.011 ns/op
MemorySegmentZeroUnsafe.unsafe true 63 avgt 30 7.757 ± 0.033 ns/op
MemorySegmentZeroUnsafe.unsafe true 64 avgt 30 3.580 ± 0.005 ns/op
MemorySegmentZeroUnsafe.unsafe true 255 avgt 30 7.744 ± 0.009 ns/op
MemorySegmentZeroUnsafe.unsafe true 256 avgt 30 8.090 ± 0.110 ns/op
MemorySegmentZeroUnsafe.unsafe false 1 avgt 30 2.683 ± 0.025 ns/op
MemorySegmentZeroUnsafe.unsafe false 2 avgt 30 7.747 ± 0.009 ns/op
MemorySegmentZeroUnsafe.unsafe false 3 avgt 30 7.738 ± 0.009 ns/op
MemorySegmentZeroUnsafe.unsafe false 4 avgt 30 7.745 ± 0.009 ns/op
MemorySegmentZeroUnsafe.unsafe false 5 avgt 30 7.773 ± 0.064 ns/op
MemorySegmentZeroUnsafe.unsafe false 6 avgt 30 7.736 ± 0.008 ns/op
MemorySegmentZeroUnsafe.unsafe false 7 avgt 30 7.747 ± 0.010 ns/op
MemorySegmentZeroUnsafe.unsafe false 8 avgt 30 7.748 ± 0.030 ns/op
MemorySegmentZeroUnsafe.unsafe false 15 avgt 30 7.735 ± 0.008 ns/op
MemorySegmentZeroUnsafe.unsafe false 16 avgt 30 7.747 ± 0.020 ns/op
MemorySegmentZeroUnsafe.unsafe false 63 avgt 30 7.746 ± 0.013 ns/op
MemorySegmentZeroUnsafe.unsafe false 64 avgt 30 7.743 ± 0.012 ns/op
MemorySegmentZeroUnsafe.unsafe false 255 avgt 30 7.741 ± 0.011 ns/op
MemorySegmentZeroUnsafe.unsafe false 256 avgt 30 2.739 ± 0.005 ns/op
Finished running test 'micro:java.lang.foreign.MemorySegmentZeroUnsafe'
Stub Code Generated with current code:
StubRoutines::unsafe_setmemory [0x000003ff9c4b63c0, 0x000003ff9c4b64dc] (284 bytes)
--------------------------------------------------------------------------------
BFD: unknown S/390 disassembler option: s390
.long 0x00000000
0x000003ff9c4b63c0: ogrk %r1,%r2,%r3
0x000003ff9c4b63c4: nill %r1,7
0x000003ff9c4b63c8: je 0x000003ff9c4b641e
0x000003ff9c4b63cc: nill %r1,3
0x000003ff9c4b63d0: je 0x000003ff9c4b6464
0x000003ff9c4b63d4: nill %r1,1
0x000003ff9c4b63d8: jne 0x000003ff9c4b649e
0x000003ff9c4b63dc: risbg %r4,%r4,48,55,8
0x000003ff9c4b63e2: risbgz %r1,%r3,32,63,62
0x000003ff9c4b63e8: je 0x000003ff9c4b6410
0x000003ff9c4b63ec: nopr
0x000003ff9c4b63ee: nopr
0x000003ff9c4b63f0: nopr
0x000003ff9c4b63f2: nopr
0x000003ff9c4b63f4: nopr
0x000003ff9c4b63f6: nopr
0x000003ff9c4b63f8: nopr
0x000003ff9c4b63fa: nopr
0x000003ff9c4b63fc: nopr
0x000003ff9c4b63fe: nopr
0x000003ff9c4b6400: sth %r4,0(%r2)
0x000003ff9c4b6404: sth %r4,2(%r2)
0x000003ff9c4b6408: aghi %r2,4
0x000003ff9c4b640c: brct %r1,0x000003ff9c4b6400
0x000003ff9c4b6410: nilf %r3,2
0x000003ff9c4b6416: ber %r14
0x000003ff9c4b6418: sth %r4,0(%r2)
0x000003ff9c4b641c: br %r14
0x000003ff9c4b641e: risbg %r4,%r4,48,55,8
0x000003ff9c4b6424: risbg %r4,%r4,32,47,16
0x000003ff9c4b642a: risbg %r4,%r4,0,31,32
0x000003ff9c4b6430: risbgz %r1,%r3,32,63,60
0x000003ff9c4b6436: je 0x000003ff9c4b6454
0x000003ff9c4b643a: nopr
0x000003ff9c4b643c: nopr
0x000003ff9c4b643e: nopr
0x000003ff9c4b6440: stg %r4,0(%r2)
0x000003ff9c4b6446: stg %r4,8(%r2)
0x000003ff9c4b644c: aghi %r2,16
0x000003ff9c4b6450: brct %r1,0x000003ff9c4b6440
0x000003ff9c4b6454: nilf %r3,8
0x000003ff9c4b645a: ber %r14
0x000003ff9c4b645c: stg %r4,0(%r2)
0x000003ff9c4b6462: br %r14
0x000003ff9c4b6464: risbg %r4,%r4,48,55,8
0x000003ff9c4b646a: risbg %r4,%r4,32,47,16
0x000003ff9c4b6470: risbgz %r1,%r3,32,63,61
0x000003ff9c4b6476: je 0x000003ff9c4b6490
0x000003ff9c4b647a: nopr
0x000003ff9c4b647c: nopr
0x000003ff9c4b647e: nopr
0x000003ff9c4b6480: st %r4,0(%r2)
0x000003ff9c4b6484: st %r4,4(%r2)
0x000003ff9c4b6488: aghi %r2,8
0x000003ff9c4b648c: brct %r1,0x000003ff9c4b6480
0x000003ff9c4b6490: nilf %r3,4
0x000003ff9c4b6496: ber %r14
0x000003ff9c4b6498: st %r4,0(%r2)
0x000003ff9c4b649c: br %r14
0x000003ff9c4b649e: cghi %r3,256
0x000003ff9c4b64a2: jl 0x000003ff9c4b64c4
0x000003ff9c4b64a6: stc %r4,0(%r2)
0x000003ff9c4b64aa: mvc 1(255,%r2),0(%r2)
0x000003ff9c4b64b0: aghi %r2,256
0x000003ff9c4b64b4: aghi %r3,-256
0x000003ff9c4b64b8: cghi %r3,256
0x000003ff9c4b64bc: jh 0x000003ff9c4b64a6
0x000003ff9c4b64c0: ltr %r3,%r3
0x000003ff9c4b64c2: ber %r14
0x000003ff9c4b64c4: stc %r4,0(%r2)
0x000003ff9c4b64c8: aghi %r3,-2
0x000003ff9c4b64cc: blr %r14
0x000003ff9c4b64ce: exrl %r3,0x000003ff9c4b64d6
0x000003ff9c4b64d4: br %r14
0x000003ff9c4b64d6: mvc 1(1,%r2),0(%r2)
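
For readers less familiar with s390x, the following Java sketch models three ideas that seem to be visible in the listing above: replicating the fill byte across a wider register (the `risbg` sequence), the 8-byte loop with two stores per iteration plus an optional tail store (`risbgz ...,60`, `brct`, `nilf %r3,8`), and the byte path where a single `stc` followed by an overlapping `mvc` propagates the byte forward. All names are hypothetical, and the code is only an approximation of the stub's behavior under these assumptions, not a translation of it.

```java
import java.nio.ByteBuffer;

public final class FillSketch {
    private FillSketch() {}

    /** Replicates the low byte of value into all 8 bytes of a long. */
    public static long replicateByte(int value) {
        long v = value & 0xFFL;
        v |= v << 8;    // 1 byte  -> 2 bytes
        v |= v << 16;   // 2 bytes -> 4 bytes
        v |= v << 32;   // 4 bytes -> 8 bytes
        return v;
    }

    /**
     * Models the 8-byte path. Assumes size is a multiple of 8 and that
     * offset stands in for an 8-byte-aligned address.
     */
    public static void fillAligned8(ByteBuffer buf, int offset, int size, byte value) {
        long pattern = replicateByte(value);
        int pos = offset;
        // size >>> 4 iterations, each writing 16 bytes with two 8-byte stores.
        for (int i = size >>> 4; i > 0; i--) {
            buf.putLong(pos, pattern);
            buf.putLong(pos + 8, pattern);
            pos += 16;
        }
        // One remaining 8-byte store if size is not a multiple of 16.
        if ((size & 8) != 0) {
            buf.putLong(pos, pattern);
        }
    }

    /**
     * Models the byte path: store one byte, then copy each byte from its
     * left neighbor, which is the effect of an mvc whose destination
     * overlaps its source by one byte.
     */
    public static void propagateFill(byte[] dst, int offset, int length, byte value) {
        if (length <= 0) return;
        dst[offset] = value;
        for (int i = 1; i < length; i++) {
            dst[offset + i] = dst[offset + i - 1];
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(replicateByte(0xAB)));
        ByteBuffer buf = ByteBuffer.allocate(32);
        fillAligned8(buf, 0, 24, (byte) 0x5A);
        System.out.println(buf.get(23) + " " + buf.get(24));
    }
}
```

The 4-byte and 2-byte paths in the listing follow the same shape as `fillAligned8`, with narrower stores and correspondingly smaller shift and tail masks.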
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24480#issuecomment-2809303487
PR Comment: https://git.openjdk.org/jdk/pull/24480#issuecomment-2812434376