[foreign] Poor performance?
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri May 17 15:33:39 UTC 2019
Thanks Jorn,
I'd be more interested in knowing the raw native call numbers. Does it
get any better with linkToNative? Here I'd expect performance
identical to JNI (since the binder should lower the Pointer to a long,
which linkToNative would then pass in a register).
As for the fuller benchmark, note that you are also measuring the
performance of Scope::allocate, which internally uses some maps.
JNR/JNI does not do the same liveness checks that we do, so the full
benchmark is not totally fair. But the raw performance of the downcall
should be an apples-to-apples comparison, and it shouldn't be 8x slower as
it is now (at least not with linkToNative).
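A minimal sketch of the split described above: one timed body that includes allocation, and one that times only the call. FakeScope and fakeDowncall are hypothetical stand-ins (not Panama API) so the example runs on any JDK, and the hand-rolled timing loop is only illustrative. A real measurement would use JMH, as in the benchmark under discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class SplitBenchmarkSketch {

    /** Hypothetical stand-in for a scope that tracks live allocations in a map. */
    static final class FakeScope implements AutoCloseable {
        private final Map<Long, Integer> live = new HashMap<>();
        private long nextAddress = 8;

        long allocate(int size) {
            long address = nextAddress;
            nextAddress += size;
            live.put(address, size); // bookkeeping that the raw call never pays for
            return address;
        }

        @Override
        public void close() {
            live.clear();
        }
    }

    /** Hypothetical stand-in for a native downcall that takes a pointer as a long. */
    static long fakeDowncall(long address) {
        return address + 1;
    }

    /** Average wall-clock nanoseconds per call of body, after an equal-length warm-up. */
    static double avgNanos(int iterations, LongSupplier body) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) sink += body.getAsLong(); // warm-up
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += body.getAsLong(); // measured
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keep results observable
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000;

        // Full variant: allocation and its map bookkeeping sit inside the timed body.
        double full = avgNanos(iterations, () -> {
            try (FakeScope scope = new FakeScope()) {
                return fakeDowncall(scope.allocate(16));
            }
        });

        // Raw variant: allocate once up front, then time only the call.
        try (FakeScope scope = new FakeScope()) {
            long address = scope.allocate(16);
            double raw = avgNanos(iterations, () -> fakeDowncall(address));
            System.out.printf("full: %.1f ns/op, raw: %.1f ns/op%n", full, raw);
        }
    }
}
```

Comparing the two numbers shows how much of the per-op cost comes from allocation rather than from the call itself; only the raw variant is comparable with the JNI figure.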
Maurizio
On 17/05/2019 16:14, Jorn Vernee wrote:
> FWIW, I ran the benchmarks with the linkToNative back-end (using
> -Djdk.internal.foreign.NativeInvoker.FASTPATH=direct), but it's still
> 2x slower than JNI:
>
> Benchmark                                 Mode  Cnt    Score    Error  Units
> JmhGetSystemTimeSeconds.jni_javacpp       avgt   50  298.046 ± 15.744  ns/op
> JmhGetSystemTimeSeconds.panama_prelayout  avgt   50  596.567 ± 20.570  ns/op
>
> Of course, like Aleksey says: "The numbers [above] are just data. To
> gain reusable insights, you need to follow up on why the numbers are
> the way they are.". Unfortunately, I'm having some trouble getting the
> project to work with the Windows profiler :/ I'm currently looking
> into that.
>
> Cheers,
> Jorn
>
> Maurizio Cimadamore wrote on 2019-05-17 16:51:
>> On 17/05/2019 11:26, Maurizio Cimadamore wrote:
>>> Thank you for bringing this up. I saw this benchmark a few days ago
>>> and took a look at it. That benchmark is unfortunately hitting
>>> a couple of (transitory!) pain points: (1) it is running on Windows,
>>> which lacks the optimizations available for MacOS and Linux
>>> (directInvoker). Once the linkToNative effort is completed,
>>> this discrepancy between platforms will go away. The second problem
>>> (2) is that the call is passing a big struct (i.e. bigger than 64
>>> bits). Even on Linux and Mac, such a call would be unable to take
>>> advantage of the optimized invoker and would fall back to the
>>> so-called 'universal invoker', which is slow.
>>
>> Actually, my bad: the bench is passing pointers to structs, not structs
>> by value, which I think should mean the 'foreign+linkToNative'
>> experimental branch should be able to handle this. It would be nice to
>> get some confirmation that this is indeed the case.
>>
>> Maurizio