RFR: 8353500: [s390x] Intrinsify Unsafe::setMemory

Lutz Schmidt lucy at openjdk.org
Mon Apr 7 13:53:01 UTC 2025


On Mon, 7 Apr 2025 08:44:07 GMT, Amit Kumar <amitkumar at openjdk.org> wrote:

> Unsafe::setMemory intrinsic implementation for s390x. 
> 
> Stub Code: 
> 
> 
> StubRoutines::unsafe_setmemory [0x000003ffb04b63c0, 0x000003ffb04b64d0] (272 bytes)
> --------------------------------------------------------------------------------
>   0x000003ffb04b63c0:   ogrk	%r1,%r2,%r3
>   0x000003ffb04b63c4:   nill	%r1,7
>   0x000003ffb04b63c8:   je	0x000003ffb04b6410
>   0x000003ffb04b63cc:   nill	%r1,3
>   0x000003ffb04b63d0:   je	0x000003ffb04b6460
>   0x000003ffb04b63d4:   nill	%r1,1
>   0x000003ffb04b63d8:   jlh	0x000003ffb04b64a0
>   0x000003ffb04b63dc:   risbg	%r4,%r4,48,55,8
>   0x000003ffb04b63e2:   risbgz	%r1,%r3,32,63,62
>   0x000003ffb04b63e8:   je	0x000003ffb04b6402
>   0x000003ffb04b63ec:   nopr
>   0x000003ffb04b63ee:   nopr
>   0x000003ffb04b63f0:   sth	%r4,0(%r2)
>   0x000003ffb04b63f4:   sth	%r4,2(%r2)
>   0x000003ffb04b63f8:   agfi	%r2,4
>   0x000003ffb04b63fe:   brct	%r1,0x000003ffb04b63f0
>   0x000003ffb04b6402:   nilf	%r3,2
>   0x000003ffb04b6408:   ber	%r14
>   0x000003ffb04b640a:   sth	%r4,0(%r2)
>   0x000003ffb04b640e:   br	%r14
>   0x000003ffb04b6410:   risbg	%r4,%r4,48,55,8
>   0x000003ffb04b6416:   risbg	%r4,%r4,32,47,16
>   0x000003ffb04b641c:   risbg	%r4,%r4,0,31,32
>   0x000003ffb04b6422:   risbgz	%r1,%r3,32,63,60
>   0x000003ffb04b6428:   je	0x000003ffb04b6446
>   0x000003ffb04b642c:   nopr
>   0x000003ffb04b642e:   nopr
>   0x000003ffb04b6430:   stg	%r4,0(%r2)
>   0x000003ffb04b6436:   stg	%r4,8(%r2)
>   0x000003ffb04b643c:   agfi	%r2,16
>   0x000003ffb04b6442:   brct	%r1,0x000003ffb04b6430
>   0x000003ffb04b6446:   nilf	%r3,8
>   0x000003ffb04b644c:   ber	%r14
>   0x000003ffb04b644e:   stg	%r4,0(%r2)
>   0x000003ffb04b6454:   br	%r14
>   0x000003ffb04b6456:   nopr
>   0x000003ffb04b6458:   nopr
>   0x000003ffb04b645a:   nopr
>   0x000003ffb04b645c:   nopr
>   0x000003ffb04b645e:   nopr
>   0x000003ffb04b6460:   risbg	%r4,%r4,48,55,8
>   0x000003ffb04b6466:   risbg	%r4,%r4,32,47,16
>   0x000003ffb04b646c:   risbgz	%r1,%r3,32,63,61
>   0x000003ffb04b6472:   je	0x000003ffb04b6492
>   0x000003ffb04b6476:   nopr
>   0x000003ffb04b6478:   nopr
>   0x000003ffb04b647a:   nopr
>   0x000003ffb04b647c:   nopr
>   0x000003ffb04b647e:   nopr
>   0x000003ffb04b6480:   st	%r4,0(%r2)
>   0x000003ffb04b6484:   st	%r4,4(%r2)
>   0x000003ffb04b6488:   agfi	%r2,8
>   0x000003ffb04b648e:   brct	%r1,0x000003ffb04b6480
>   0x000003ffb04b6492:   nilf	%r3,4
>   0x000003ffb04b6498:   ber	%r14
>   0x000003ffb04b649a:   st	%r4,0(%r2)
>   0x000003ffb04b649e:   br	%r14
>   0x000003ffb04b64a0:   risbgz	%r1,%r3,32,63,63
>   0x000003ffb04b64a6:   je	0x000003ffb04b64c2
>   0x000003...

Changes requested by lucy (Reviewer).

src/hotspot/cpu/s390/assembler_s390.inline.hpp line 417:

> 415: }
> 416: inline void Assembler::z_risbg( Register r1, Register r2, int64_t spos3, int64_t epos4, int64_t nrot5, bool zero_rest) { // Rotate then INS selected bits.  -- z196
> 417:   const int64_t len = 48;

Changes are not necessary if `bool zero_rest` is used to control what happens to untouched destination bits.
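
To illustrate what `zero_rest` controls, here is a rough C++ model of the rotate-then-insert operation. It is a sketch only: `risbg_model` and its helpers are hypothetical names, bit positions are assumed to use the big-endian numbering of the architecture (bit 0 is the most significant bit), and the condition code the real instruction sets is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Mask with bits spos..epos set, bit 0 being the most significant bit.
// If spos > epos the selected range wraps around through bit 63.
static uint64_t bit_range_mask(unsigned spos, unsigned epos) {
  assert(spos < 64 && epos < 64);
  uint64_t from_spos = ~uint64_t(0) >> spos;         // bits spos..63
  uint64_t upto_epos = ~uint64_t(0) << (63 - epos);  // bits 0..epos
  return spos <= epos ? (from_spos & upto_epos) : (from_spos | upto_epos);
}

// Rotate left by n bits (n taken modulo 64).
static uint64_t rotl64(uint64_t v, unsigned n) {
  n &= 63;
  return n == 0 ? v : (v << n) | (v >> (64 - n));
}

// Model of "rotate then insert selected bits": r2 is rotated left by nrot,
// the bits spos..epos of that value replace the same bits in r1.
// zero_rest decides whether the untouched bits of r1 are kept or cleared.
uint64_t risbg_model(uint64_t r1, uint64_t r2, unsigned spos, unsigned epos,
                     unsigned nrot, bool zero_rest) {
  uint64_t mask = bit_range_mask(spos, epos);
  uint64_t inserted = rotl64(r2, nrot) & mask;
  uint64_t rest = zero_rest ? 0 : (r1 & ~mask);
  return inserted | rest;
}
```

With `zero_rest == false`, applying the model with (48,55,8), (32,47,16) and (0,31,32) in sequence replicates the low byte across the register, as the quoted stub does. With `zero_rest == true` and (32,63,62), it yields the low 32 bits of the size divided by 4.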

src/hotspot/cpu/s390/stubGenerator_s390.cpp line 1496:

> 1494:     __ z_bre(L_Tail);
> 1495: 
> 1496:     __ align(16); // loop alignment

align(32) would be more helpful:

- The instruction engine fetches octoword (32-byte) bundles.
- The tight loop is < 32 bytes, so with 32-byte alignment it fits in one bundle and does not cross a cache line boundary.
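
The arithmetic behind this can be checked with a small C++ sketch. The helper names are hypothetical, and the 32-byte bundle size is taken from the statement above. The loop addresses and the 18-byte length come from the quoted disassembly (two `sth`, one `agfi`, one `brct`).

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of alignment (must be a power of two).
inline uint64_t align_up(uint64_t addr, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (addr + alignment - 1) & ~(alignment - 1);
}

// True if the code span [start, start + len) lies within a single
// fetch bundle of the given size (assumed 32 bytes here).
inline bool fits_in_one_bundle(uint64_t start, uint64_t len, uint64_t bundle = 32) {
  assert(len > 0);
  return start / bundle == (start + len - 1) / bundle;
}
```

For the halfword loop at 0x000003ffb04b63f0 (16-byte aligned, 18 bytes long), `fits_in_one_bundle` returns false because the loop straddles the boundary at 0x000003ffb04b6400. After `align_up(start, 32)` the same loop fits in one bundle.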

src/hotspot/cpu/s390/stubGenerator_s390.cpp line 1541:

> 1539: 
> 1540:       __ z_nill(rScratch1, 7);
> 1541:       __ z_bre(L_fill8Bytes); // branch if 0

Please use z_braz() to reflect the check semantics.

src/hotspot/cpu/s390/stubGenerator_s390.cpp line 1545:

> 1543: 
> 1544:       __ z_nill(rScratch1, 3);
> 1545:       __ z_bre(L_fill4Bytes); // branch if 0

See above

src/hotspot/cpu/s390/stubGenerator_s390.cpp line 1548:

> 1546: 
> 1547:       __ z_nill(rScratch1, 1);
> 1548:       __ z_brne(L_fillBytes); // branch if not 0

Please use z_brnaz() to reflect the check semantics.
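
For reference, the dispatch these three checks implement, as read from the quoted stub disassembly, can be modeled in plain C++. This is only a sketch: `select_fill_width` is a hypothetical name and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Pick the widest store element (in bytes) for which both the destination
// address and the size are aligned. OR-ing the two values first means a
// single mask test covers both, as the stub does with its scratch register.
unsigned select_fill_width(uint64_t dest, uint64_t size) {
  uint64_t bits = dest | size;
  if ((bits & 7) == 0) return 8;  // all three low bits zero: 8-byte stores
  if ((bits & 3) == 0) return 4;  // low two bits zero: 4-byte stores
  if ((bits & 1) != 0) return 1;  // lowest bit set: byte stores
  return 2;                       // otherwise: 2-byte stores
}
```

Each test asks whether all bits under the mask are zero (or not all zero), which is the semantics the all-zero branch mnemonics express.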

src/hotspot/cpu/s390/stubGenerator_s390.cpp line 1557:

> 1555:       do_setmemory_atomic_loop(2, dest, size, byteVal, _masm);
> 1556: 
> 1557:       __ align(16);

What is this alignment good for?

-------------

PR Review: https://git.openjdk.org/jdk/pull/24480#pullrequestreview-2746874042
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031282057
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031271721
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031276482
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031277491
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031278295
PR Review Comment: https://git.openjdk.org/jdk/pull/24480#discussion_r2031273859

