RFR: 8338591: Improve performance of MemorySegment::copy
Per Minborg
pminborg at openjdk.org
Wed Sep 4 08:19:21 UTC 2024
On Tue, 3 Sep 2024 15:23:20 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> src/java.base/share/classes/jdk/internal/foreign/AbstractMemorySegmentImpl.java line 637:
>>
>>> 635: for (; offset < limit; offset += 8) {
>>> 636: final long v = SCOPED_MEMORY_ACCESS.getLong(src.sessionImpl(), src.unsafeGetBase(), src.unsafeGetOffset() + srcOffset + offset);
>>> 637: SCOPED_MEMORY_ACCESS.putLong(dst.sessionImpl(), dst.unsafeGetBase(), dst.unsafeGetOffset() + dstOffset + offset, v);
>>
>> I've been digging a bit deeper on why we can get away w/o aligned access (both here and for `fill`). It turns out that, conveniently, var handles use `Unsafe::getXYZUnaligned` for plain set/get. This means that the intrinsics will deal with unaligned access accordingly, depending on the platform (either by splitting into multiple accesses, or by issuing an unaligned access on platforms that support that).
>>
>> This means we can write "simple" code, and expect C2 to rewrite it the most optimal way. We should probably add a comment in this direction (both for `fill` and `copy`.
>
> Separately, I wonder if it would make sense to put all these "tricky" routines off to one side - e.g. `BulkSegmentSupport`, so that we don't make `AbstractMemorySegmentImpl` even bigger than it is.
Having these bulk methods in a separate class makes sense.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20829#discussion_r1743280272
More information about the core-libs-dev
mailing list