RFR: 8353500: [s390x] Intrinsify Unsafe::setMemory [v4]
Amit Kumar
amitkumar at openjdk.org
Mon May 26 04:11:47 UTC 2025
On Wed, 23 Apr 2025 06:09:25 GMT, Amit Kumar <amitkumar at openjdk.org> wrote:
>> Unsafe::setMemory intrinsic implementation for s390x.
>>
>> Stub Code:
>>
>>
>> StubRoutines::unsafe_setmemory [0x000003ffb04b63c0, 0x000003ffb04b64d0] (272 bytes)
>> --------------------------------------------------------------------------------
>> 0x000003ffb04b63c0: ogrk %r1,%r2,%r3
>> 0x000003ffb04b63c4: nill %r1,7
>> 0x000003ffb04b63c8: je 0x000003ffb04b6410
>> 0x000003ffb04b63cc: nill %r1,3
>> 0x000003ffb04b63d0: je 0x000003ffb04b6460
>> 0x000003ffb04b63d4: nill %r1,1
>> 0x000003ffb04b63d8: jlh 0x000003ffb04b64a0
>> 0x000003ffb04b63dc: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b63e2: risbgz %r1,%r3,32,63,62
>> 0x000003ffb04b63e8: je 0x000003ffb04b6402
>> 0x000003ffb04b63ec: nopr
>> 0x000003ffb04b63ee: nopr
>> 0x000003ffb04b63f0: sth %r4,0(%r2)
>> 0x000003ffb04b63f4: sth %r4,2(%r2)
>> 0x000003ffb04b63f8: agfi %r2,4
>> 0x000003ffb04b63fe: brct %r1,0x000003ffb04b63f0
>> 0x000003ffb04b6402: nilf %r3,2
>> 0x000003ffb04b6408: ber %r14
>> 0x000003ffb04b640a: sth %r4,0(%r2)
>> 0x000003ffb04b640e: br %r14
>> 0x000003ffb04b6410: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b6416: risbg %r4,%r4,32,47,16
>> 0x000003ffb04b641c: risbg %r4,%r4,0,31,32
>> 0x000003ffb04b6422: risbgz %r1,%r3,32,63,60
>> 0x000003ffb04b6428: je 0x000003ffb04b6446
>> 0x000003ffb04b642c: nopr
>> 0x000003ffb04b642e: nopr
>> 0x000003ffb04b6430: stg %r4,0(%r2)
>> 0x000003ffb04b6436: stg %r4,8(%r2)
>> 0x000003ffb04b643c: agfi %r2,16
>> 0x000003ffb04b6442: brct %r1,0x000003ffb04b6430
>> 0x000003ffb04b6446: nilf %r3,8
>> 0x000003ffb04b644c: ber %r14
>> 0x000003ffb04b644e: stg %r4,0(%r2)
>> 0x000003ffb04b6454: br %r14
>> 0x000003ffb04b6456: nopr
>> 0x000003ffb04b6458: nopr
>> 0x000003ffb04b645a: nopr
>> 0x000003ffb04b645c: nopr
>> 0x000003ffb04b645e: nopr
>> 0x000003ffb04b6460: risbg %r4,%r4,48,55,8
>> 0x000003ffb04b6466: risbg %r4,%r4,32,47,16
>> 0x000003ffb04b646c: risbgz %r1,%r3,32,63,61
>> 0x000003ffb04b6472: je 0x000003ffb04b6492
>> 0x000003ffb04b6476: nopr
>> 0x000003ffb04b6478: nopr
>> 0x000003ffb04b647a: nopr
>> 0x000003ffb04b647c: nopr
>> 0x000003ffb04b647e: nopr
>> 0x000003ffb04b6480: st %r4,0(%r2)
>> 0x000003ffb04b6484: st %r4,4(%r2)
>> 0x000003ffb04b6488: agfi %r2,8
>> 0x000003ffb04b648e: brct %r1,0x000003ffb04b6480
>> 0x000003ffb04b6492: nilf %r3,4
>> 0x000003ffb04b6498: ber %r14
>> 0x000003ffb04b649a: st %r4,0(%r2)
>> 0x0000...
>
> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision:
>
> improved mvc implementation
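
For readers who do not read s390x assembly, the dispatch in the quoted listing can be sketched in Java. This is only my reading of the visible part of the listing, not the stub source. The stub ORs the destination address with the size, tests the low bits to pick the widest store both are multiples of, replicates the fill byte across the register, runs a loop with two stores per iteration, and finishes with at most one tail store. The class and method names below are hypothetical, a ByteBuffer offset stands in for the raw address, and because the byte-granular path is cut off in the listing, the width-1 case here simply reuses the same loop as an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the HotSpot stub generator code.
final class SetMemorySketch {

    /** Copies the low byte of value into all eight bytes of a long. */
    static long replicate(byte value) {
        long v = value & 0xFFL;
        v |= v << 8;   // 2 identical bytes
        v |= v << 16;  // 4 identical bytes
        v |= v << 32;  // 8 identical bytes
        return v;
    }

    /** Widest store (8, 4, 2 or 1 bytes) that both offset and size are multiples of. */
    static int chooseWidth(long offset, long size) {
        long bits = offset | size;           // one OR covers both operands
        if ((bits & 7) == 0) return 8;
        if ((bits & 3) == 0) return 4;
        if ((bits & 1) == 0) return 2;
        return 1;
    }

    /** Stores the low 'width' bytes of pattern at absolute position pos. */
    static void store(ByteBuffer mem, int pos, int width, long pattern) {
        switch (width) {
            case 8:  mem.putLong(pos, pattern); break;
            case 4:  mem.putInt(pos, (int) pattern); break;
            case 2:  mem.putShort(pos, (short) pattern); break;
            default: mem.put(pos, (byte) pattern); break;
        }
    }

    /** Fills size bytes starting at offset; returns the store width that was chosen. */
    static int fill(ByteBuffer mem, int offset, int size, byte value) {
        int width = chooseWidth(offset, size);
        long pattern = replicate(value);
        int pos = offset;
        // Main loop: two stores per iteration.
        for (int n = size / (2 * width); n > 0; n--) {
            store(mem, pos, width, pattern);
            store(mem, pos + width, width, pattern);
            pos += 2 * width;
        }
        // Tail: one more store when the element count is odd.
        if ((size & width) != 0) {
            store(mem, pos, width, pattern);
        }
        return width;
    }

    public static void main(String[] args) {
        ByteBuffer mem = ByteBuffer.allocate(64);
        System.out.println("width for (8, 24): " + fill(mem, 8, 24, (byte) 0x5A));
        System.out.println("width for (1, 3):  " + fill(mem, 1, 3, (byte) 0x5A));
    }
}
```

Because every byte of the pattern is identical, the byte order of the buffer does not affect the result.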
Tier-1 tests are clean with the fastdebug VM.
These are the performance numbers on my z16 zVM:
Benchmark                       (aligned)  (size)  Mode  Cnt  Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30  2.889 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30  3.115 ± 0.014  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30  3.271 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30  3.382 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30  3.295 ± 0.062  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30  3.428 ± 0.008  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30  3.482 ± 0.049  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30  3.188 ± 0.013  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30  4.612 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30  3.795 ± 0.004  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  5.376 ± 0.037  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  4.846 ± 0.033  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  7.723 ± 0.263  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  7.299 ± 0.017  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30  2.883 ± 0.017  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30  3.110 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30  3.271 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30  3.385 ± 0.009  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30  3.268 ± 0.024  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30  3.431 ± 0.010  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30  3.459 ± 0.003  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30  3.186 ± 0.005  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30  4.614 ± 0.015  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30  3.799 ± 0.006  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  5.282 ± 0.020  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  4.891 ± 0.012  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  8.038 ± 0.007  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  7.890 ± 0.108  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  3.785 ± 0.062  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  3.772 ± 0.075  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  3.433 ± 0.052  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  3.727 ± 0.172  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  3.414 ± 0.062  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  3.313 ± 0.117  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  3.198 ± 0.015  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  2.843 ± 0.158  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  3.278 ± 0.004  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  2.925 ± 0.113  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  3.800 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  3.400 ± 0.050  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  7.032 ± 0.120  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  6.423 ± 0.013  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  3.645 ± 0.148  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  3.638 ± 0.152  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  3.377 ± 0.068  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  3.692 ± 0.119  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  3.436 ± 0.027  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  3.427 ± 0.038  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  3.192 ± 0.014  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  3.035 ± 0.046  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  3.294 ± 0.049  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  3.042 ± 0.061  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  3.579 ± 0.006  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  3.449 ± 0.035  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  8.633 ± 0.317  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  7.003 ± 0.085  ns/op
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24480#issuecomment-2908459844
More information about the hotspot-compiler-dev mailing list