RFR: 8338967: Improve performance for MemorySegment::fill [v9]

Maurizio Cimadamore mcimadamore at openjdk.org
Fri Aug 30 10:05:21 UTC 2024


On Fri, 30 Aug 2024 09:09:57 GMT, Per Minborg <pminborg at openjdk.org> wrote:

>> The performance of the `MemorySegment::fil` can be improved by replacing the `checkAccess()` method call with calling `checkReadOnly()` instead (as the bounds of the segment itself do not need to be checked).
>> 
>> Also, smaller segments can be handled directly by Java code rather than transitioning to native code.
>> 
>> Here is how the `MemorySegment::fill` performance is improved by this PR:
>> 
>> ![image](https://github.com/user-attachments/assets/ee29fdf0-a7cf-4d5b-bb6b-278b01d97e3c)
>> 
>> Operations involving 8 or more bytes are delegated to native code whereas smaller segments are handled via a switch rake.
>> 
>> It should be noted that `Arena::allocate` is using `MemorySegment::fil`. Hence, this PR will also have a positive effect on memory allocation performance.
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove unused imports

src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template line 200:

> 198: 
> 199:     @ForceInline @Scoped
> 200:     private void setMemoryInternal(AbstractMemorySegmentImpl segment, long length, byte value) {

All this logic should be in the memory segment impl class, and the calls to `UNSAFE` replaced with calls to `SCOPED_MEMORY_ACCESS`. This logic is complex for a scoped method, and you risk running into limitations of scoped methods (they cannot exceed 10 Java frames). In general, scoped methods are really meant to be wrappers around some JVM intrinsics.

src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template line 206:

> 204:                 session.checkValidStateRaw();
> 205:             }
> 206:             // 0...0X...XXXX implies: 0 <= length < FILL_NATIVE_LIMIT

I would also recommend against "ultra" fine tuning and have limits that are different from platform to platform. It seems like we're using a different limit here because of https://bugs.openjdk.org/browse/JDK-8338975. But I'm not sure working around that is in the scope of the current PR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20712#discussion_r1738351494
PR Review Comment: https://git.openjdk.org/jdk/pull/20712#discussion_r1738354035


More information about the core-libs-dev mailing list