RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

Jorn Vernee jvernee at openjdk.org
Fri Apr 19 17:50:06 UTC 2024


On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change.
>> 
>> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:

> 4011:   // Initialize table for unsafe copy memeory check.
> 4012:   if (UnsafeMemoryAccess::_table == nullptr) {
> 4013:     UnsafeMemoryAccess::create_table(26);

How did you arrive at a table size of 26?

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603:

> 2601:     const Register wide_value = rax;
> 2602:     const Register rScratch1 = r10;
> 2603: 

Maybe put an `assert_different_registers` here for the above registers, just to be sure. (I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`)

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674:

> 2672:     // Parameter order is (ptr, byteVal, size)
> 2673:     __ xchgq(c_rarg1, c_rarg2);
> 2674:     __ pop(rbp);    // Clear effect of enter()

Why not just use `leave()` here?

src/hotspot/share/opto/library_call.cpp line 4959:

> 4957:   if (stopped())  return true;
> 4958: 
> 4959:   if (StubRoutines::unsafe_setmemory() == nullptr) return false;

I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`.

Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two.

src/hotspot/share/opto/runtime.cpp line 780:

> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() {
> 779:   // create input type (domain)
> 780:   int num_args      = 4;

This variable seems redundant.

src/hotspot/share/opto/runtime.cpp line 786:

> 784:   fields[argp++] = TypePtr::NOTNULL;    // dest
> 785:   fields[argp++] = TypeLong::LONG;      // size
> 786:   fields[argp++] = Type::HALF;          // size

Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802

(Note that you will also have to adjust `argcnt` for this)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058


More information about the core-libs-dev mailing list