RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
Jorn Vernee
jvernee at openjdk.org
Fri Apr 19 17:50:06 UTC 2024
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:
>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change.
>>
>> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes.
>>
>> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>>
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Add enter() and leave(); remove Windows-specific register stuff
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:
> 4011: // Initialize table for unsafe copy memeory check.
> 4012: if (UnsafeMemoryAccess::_table == nullptr) {
> 4013: UnsafeMemoryAccess::create_table(26);
How did you arrive at a table size of 26?
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603:
> 2601: const Register wide_value = rax;
> 2602: const Register rScratch1 = r10;
> 2603:
Maybe put an `assert_different_registers` here for the above registers, just to be sure. (I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`)
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674:
> 2672: // Parameter order is (ptr, byteVal, size)
> 2673: __ xchgq(c_rarg1, c_rarg2);
> 2674: __ pop(rbp); // Clear effect of enter()
Why not just use `leave()` here?
src/hotspot/share/opto/library_call.cpp line 4959:
> 4957: if (stopped()) return true;
> 4958:
> 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false;
I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`.
Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two.
src/hotspot/share/opto/runtime.cpp line 780:
> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() {
> 779: // create input type (domain)
> 780: int num_args = 4;
This variable seems redundant.
src/hotspot/share/opto/runtime.cpp line 786:
> 784: fields[argp++] = TypePtr::NOTNULL; // dest
> 785: fields[argp++] = TypeLong::LONG; // size
> 786: fields[argp++] = Type::HALF; // size
Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802
(Note that you will also have to adjust `argcnt` for this)
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058
More information about the core-libs-dev
mailing list