[External] : Re: MemorySegment.ofAddress(...).reinterpret(...)
Brian S O'Neill
bronee at gmail.com
Thu Jul 6 18:45:15 UTC 2023
Thanks for looking into this. I'm already doing unaligned accesses, so
there's not much I can do at this point on my end. I'll try to identify
which operations are the most expensive and report back.
On 2023-07-06 11:31 AM, Jorn Vernee wrote:
> I took a look at the generated assembly [1] for the payload method in:
>
>     static final int ELEM_SIZE = 10;
>     static final int CARRIER_SIZE = (int)JAVA_INT.byteSize();
>     static final int BYTE_SIZE = ELEM_SIZE * CARRIER_SIZE;
>     static final MemorySegment ALL =
>             MemorySegment.NULL.reinterpret(Long.MAX_VALUE);
>
>     public static void main(String[] args) {
>         int[] ints = new int[ELEM_SIZE];
>         long unsafe_addr = Arena.ofAuto().allocate(BYTE_SIZE).address();
>         for (int i = 0; i < ints.length ; i++) {
>             ints[i] = i;
>         }
>
>         State state = new State(ints, unsafe_addr);
>         for (int i = 0; i < 20_000; i++) {
>             payload(state);
>         }
>     }
>
>     record State(int[] ints, long unsafe_addr) {}
>
>     public static void payload(State state) {
>         MemorySegment.copy(state.ints(), 0, ALL, JAVA_INT,
>                 state.unsafe_addr(), ELEM_SIZE);
>     }
>
> There are three checks we do to ensure safety:
> 1. a bounds check, which checks if the accessed address range is in bounds
> 2. an alignment check to see that the base address is aligned according
> to the alignment of JAVA_INT
> 3. a liveness check to see that the segment being accessed is still alive
>
> #2 can be removed on your end by using the JAVA_INT_UNALIGNED layout
> instead. #3 shouldn't really be happening, since the scope of the ALL
> segment is the global scope, which is always alive. I have a fix that
> overrides the liveness check method in the GlobalSession class to do
> nothing [2]. On my machine, addressing those 2 brings the performance of
> the Panama-based copy very close to that of Unsafe (within 0.2 ns):
>
> Benchmark                                    Mode  Cnt  Score   Error  Units
> MemorySegmentCopy.segment_copy_static_small  avgt   30  7.741 ± 0.017  ns/op
> MemorySegmentCopy.unsafe_copy_small          avgt   30  7.534 ± 0.011  ns/op
>
> Also, keep in mind that I'm just copying 10 elements here, for larger
> copies the overhead would be smaller.
>
> #1 could technically also be removed by exposing a special memory
> segment implementation that doesn't do any bounds checking. I'm not sure
> we want to go there though, given the API surface/complexity we'd
> introduce, for a very marginal gain. At least I'd like to discuss that
> with Maurizio first (who's currently on vacation).
>
> #3 is definitely something we can address, I think. I've filed:
> https://bugs.openjdk.org/browse/JDK-8311594
>
> HTH,
> Jorn
>
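
For readers following the alignment point (#2) in the quoted message, here is a minimal sketch of the difference between JAVA_INT and JAVA_INT_UNALIGNED in MemorySegment.copy. It is not the benchmark from this thread: the class and method names (UnalignedCopySketch, roundTrip, alignedCopyRejected) are made up for illustration, it uses an ordinary confined-arena segment rather than the ALL segment, and it assumes a JDK where the java.lang.foreign API is final (22 or later).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.util.Arrays;

import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_INT_UNALIGNED;

// Hypothetical illustration class; not part of the code discussed in the thread.
class UnalignedCopySketch {

    // Copies src into native memory at a deliberately misaligned offset (1 byte)
    // using JAVA_INT_UNALIGNED, then copies it back out into a fresh array.
    static int[] roundTrip(int[] src) {
        try (Arena arena = Arena.ofConfined()) {
            long bytes = (long) src.length * Integer.BYTES;
            // Request a 4-byte-aligned base so that offset 1 is guaranteed misaligned.
            MemorySegment seg = arena.allocate(bytes + 1, JAVA_INT.byteAlignment());
            MemorySegment.copy(src, 0, seg, JAVA_INT_UNALIGNED, 1, src.length);
            int[] dst = new int[src.length];
            MemorySegment.copy(seg, JAVA_INT_UNALIGNED, 1, dst, 0, src.length);
            return dst;
        }
    }

    // Attempts the same misaligned copy with the aligned JAVA_INT layout.
    // Returns true if the alignment check rejected it.
    static boolean alignedCopyRejected(int[] src) {
        try (Arena arena = Arena.ofConfined()) {
            long bytes = (long) src.length * Integer.BYTES;
            MemorySegment seg = arena.allocate(bytes + 1, JAVA_INT.byteAlignment());
            try {
                MemorySegment.copy(src, 0, seg, JAVA_INT, 1, src.length);
                return false;
            } catch (IllegalArgumentException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[10];
        for (int i = 0; i < ints.length; i++) {
            ints[i] = i;
        }
        System.out.println(Arrays.toString(roundTrip(ints)));
        System.out.println("aligned copy rejected: " + alignedCopyRejected(ints));
    }
}
```

The unaligned layout only drops the alignment check; the bounds and liveness checks described in the quoted message still apply to the segment.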