[foreign-memaccess+abi] RFR: 8268743: Require a better way for copying data between MemorySegments and on-heap arrays
Uwe Schindler
uschindler at openjdk.java.net
Mon Jun 21 14:25:10 UTC 2021
On Wed, 16 Jun 2021 11:13:41 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>>> > To call the new static MemorySegment.copy() method, I still need to wrap the byte[] array, so I should merge in THIS pull request? From looking at your repository it does not seem to have the HeapMemorySegment SMALL change.
>>>
>>> Leave alone this pull request, and just work off the branch I pointed out, which contains the static method. The static method just takes two segments, two offsets (in bytes) and a length, it's like System.arrayCopy.
>>
>> I will just check both branches separately. First this pull request, if it makes the allocations go away (which looks like it won't). Then the other one.
>>
>> Give me a few days, will work on this after the conference.
>>
>>> But let's first verify that this is indeed what's causing the degradation for Lucene. You mentioned that the copy method were called "millions of times" which makes it unlikely for them to be completely "cold" and unoptimized.
>>
>> That was meant "symbolic". Not sure how often it is called in reality. But from the number of allocations on heap, it seems often. We found similar issues in the past, too: Sometimes Hotspot refuses to optimize a method for unknown reasons. It seems to have to do with how it is called. Sometimes the complexity of code inside Lucene is dramatic. Internals are looking very "C like". Huge methods with complex spaghetti-like logic, randomly calling methods from MemorySegmentIndexInput (previously ByteBufferIndexInput). Most of the logic uses the "iterator/enum" aproach. Search thread gets new document hits by consuming some enum, which itsself calls IndexInput.
>
>> I will just check both branches separately. First this pull request, if it makes the allocations go away (which looks like it won't). Then the other one.
>
> You might also wanna try the one in:
> https://github.com/openjdk/panama-foreign/pull/555
>
> As you said, the fact that the copy is wrapped in a `@ForceInline` might help. Also, I realized (and that's probably what you were referring to) that the single static copy method would still require wrapping the array in a segment.
Hi @mcimadamore,
this pull request helps nothing. The garbage is still created. I compiled a JDK with exactly this branch and ran the benchmark against it:
thetaphi at serv1:~/panama-foreign$ git remote -v
mcimadamore https://github.com/mcimadamore/panama-foreign.git (fetch)
mcimadamore https://github.com/mcimadamore/panama-foreign.git (push)
origin https://github.com/openjdk/panama-foreign.git (fetch)
origin https://github.com/openjdk/panama-foreign.git (push)
thetaphi at serv1:~/panama-foreign$ git status
On branch copy-bench-lucene
Your branch is up to date with 'mcimadamore/copy-bench-lucene'.
nothing to commit, working tree clean
thetaphi at serv1:~/panama-foreign$
JFR output looks identical. Again it is worse with `-Xbatch` and `-XX:-TieredCompilation` from the total runtime.
I will now try #555 as an alternative. To me it looks like Hotspot does not try to optimize the methods at all because maybe the complex structure of our code.
-------------
PR: https://git.openjdk.java.net/panama-foreign/pull/560
More information about the panama-dev
mailing list