Performance improvement to unchecked segment ofNativeRestricted

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Jan 7 22:25:29 UTC 2021


On 07/01/2021 22:09, Radosław Smogura wrote:
> Hi all,
>
> Thanks for the feedback, and good to hear it's a good direction.
>
> In fact, my first change included a global scope, as Maurizio suggested. I added this null scope because I had the impression that I was constantly hitting this check:
>      private int getIntUnalignedInternal(Scope scope, Object base, long offset, boolean be) {
>          /* ... */
>          if (scope != null) {
>              scope.checkValidState();
>          }
>          /* ... */
>      }
> And I could not understand why HotSpot keeps the scope != null check and inlines the empty method (the 'else' part looked more like a jump to a trap). Now I tried to use IGV to understand why, rebased one commit back, and it's OK...

Could it be that you hit this compiled method from two places: one
where scope == null, and one where scope != null but the scope check
is trivial (because of your NullScope)?

This could happen e.g. if your code is also using other APIs such as 
ByteBuffer heavily enough to trigger compilation of this method. That 
said, to overcome these issues, we turned on sharp profiling and 
inlining on all these ScopedMemoryAccess methods, so pollution shouldn't 
be a problem, especially cross API.
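
To make the two-call-site scenario concrete, here is a minimal sketch. All names in it (Scope, TrivialScope, SharedAccessor) are hypothetical stand-ins, not the actual ScopedMemoryAccess code. One caller passes a null scope and the other passes a non-null scope whose check does nothing, so the shared method sees both outcomes of the null test:

```java
// Hypothetical stand-ins for illustration only; not the JDK's internal classes.
interface Scope {
    void checkValidState();
}

// An always-open scope whose check does nothing (in the spirit of the NullScope idea).
final class TrivialScope implements Scope {
    static final TrivialScope INSTANCE = new TrivialScope();

    @Override
    public void checkValidState() {
        // always valid, nothing to check
    }
}

final class SharedAccessor {
    // Same shape as the snippet quoted above: the check runs only for a non-null scope.
    static int getInt(Scope scope, int[] base, int index) {
        if (scope != null) {
            scope.checkValidState();
        }
        return base[index];
    }

    // Caller 1: a path that has no scope at all and passes null.
    static int readWithoutScope(int[] data, int index) {
        return getInt(null, data, index);
    }

    // Caller 2: a path that passes the trivial, always-open scope.
    static int readWithScope(int[] data, int index) {
        return getInt(TrivialScope.INSTANCE, data, index);
    }
}
```

If both callers are hot and getInt is compiled as a standalone method rather than inlined into each caller, its profile has seen both null and non-null scopes, which would explain a null test that does not go away.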

Let us know if you find anything interesting!

Thanks
Maurizio

>
> I'll update the branch, back up the original one, take a closer look again, clean it up, and come back later.
>
> Thanks again for the checks, and sorry for the unnecessary bother.
>
> Kind regards,
> Rado
>
>> I think the fundamental idea is solid, and I agree there's room to
>> improve performance here, but as Ty notes, having a NULL scope might be
>> problematic. I think it would perhaps be better to have a
>> constant/stateless, always open scope, whose checkValidState method does
>> nothing? If that scope is stored in a final static constant, I believe
>> C2 will not have many problems in eliminating the overhead of the scope
>> check.
>>
>> Maurizio
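
A minimal sketch of the always-open scope idea from the paragraph above, under stated assumptions: MemoryScope, GlobalScope, ClosableScope and SegmentSketch are hypothetical names made up for illustration and do not mirror the real implementation classes. The point is only that the scope field is never null, so no null test is needed, and the global scope's check is an empty method reachable through a static final constant:

```java
// Hypothetical names throughout; a sketch, not the panama-foreign implementation.
interface MemoryScope {
    void checkValidState();
}

// Stateless, always-open scope kept in a static final constant.
final class GlobalScope implements MemoryScope {
    static final GlobalScope INSTANCE = new GlobalScope();

    private GlobalScope() {
    }

    @Override
    public void checkValidState() {
        // never closed, so there is nothing to verify
    }
}

// A regular scope that can be closed and then rejects further access.
final class ClosableScope implements MemoryScope {
    private volatile boolean closed;

    void close() {
        closed = true;
    }

    @Override
    public void checkValidState() {
        if (closed) {
            throw new IllegalStateException("Already closed");
        }
    }
}

final class SegmentSketch {
    private final MemoryScope scope; // never null
    private final int[] data;

    SegmentSketch(MemoryScope scope, int[] data) {
        this.scope = scope;
        this.data = data;
    }

    // Unchecked-style segment backed by the global, always-open scope.
    static SegmentSketch everything(int[] data) {
        return new SegmentSketch(GlobalScope.INSTANCE, data);
    }

    int get(int index) {
        scope.checkValidState(); // no null test required
        return data[index];
    }
}
```

With this shape, every method that calls checkValidState keeps working for the unchecked segment, and whether the JIT actually removes the empty call would still need to be confirmed with benchmarks.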
>>
>> On 06/01/2021 01:59, Ty Young wrote:
>>> You may also need to override checkValidState from
>>> AbstractMemorySegmentImpl in EverythingSegment. checkValidState calls
>>> the scope's checkValidState and since scope is null, you'll get an NPE.
>>> The list of methods that call checkValidState include:
>>>
>>>
>>> spliterator
>>>
>>> mismatch
>>>
>>> address
>>>
>>> handoff
>>>
>>> share
>>>
>>> registerCleaner
>>>
>>> close
>>>
>>>
>>> In short, if you do basically anything but simply read/write via
>>> VarHandle, including native library usage which uses address(), you're
>>> going to get an NPE.
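
The failure mode described above can be sketched as follows. The class names (LifecycleScope, BaseSegment, UncheckedSegment) are hypothetical and only imitate the delegation pattern mentioned for AbstractMemorySegmentImpl; address() here returns a dummy value:

```java
// Hypothetical classes that only imitate the delegation pattern under discussion.
interface LifecycleScope {
    void checkValidState();
}

class BaseSegment {
    protected final LifecycleScope scope;

    BaseSegment(LifecycleScope scope) {
        this.scope = scope;
    }

    // Delegates to the scope; throws NullPointerException when scope is null.
    void checkValidState() {
        scope.checkValidState();
    }

    // Stand-in for any method that validates state before doing its work.
    long address() {
        checkValidState();
        return 0L; // dummy address for the sketch
    }
}

// Segment without a scope: it must override the check, or every caller of it fails.
final class UncheckedSegment extends BaseSegment {
    UncheckedSegment() {
        super(null);
    }

    @Override
    void checkValidState() {
        // no scope to consult, so the check is a no-op
    }
}
```

Calling address() on a BaseSegment built with a null scope throws a NullPointerException, while UncheckedSegment avoids it only because of the override.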
>>>
>>>
>>>
>>> On 1/5/21 4:32 PM, Radosław Smogura wrote:
>>>> Hi all,
>>>>
>>>> I hope you have a good day.
>>>>
>>>> Here I would like to present some changes that increase the performance
>>>> of ofNativeRestricted. In my benchmarks, where I tried to simulate access
>>>> from code, the results outpaced access to a Java array (as intended). As
>>>> it looks like the pull request flow has changed, I have to sign the OCA
>>>> (and if this change is fine I would be happy to do this).
>>>>
>>>> Below please find benchmark results and link to "pending PR" / branch
>>>>
>>>> The results outpaced the Java array access.
>>>>
>>>> Benchmark                           Mode  Cnt         Score          Error  Units
>>>> AccessBenchmark.foreignAddress     thrpt    4  86860188.499 ± 13454393.406  ops/s
>>>> AccessBenchmark.foreignAddressRaw  thrpt    4  96150181.668 ±  7025145.700  ops/s
>>>> AccessBenchmark.target             thrpt    4  93673099.539 ± 23272596.145  ops/s
>>>>
>>>> versus tests on original repo
>>>>
>>>> Benchmark                           Mode  Cnt         Score          Error  Units
>>>> AccessBenchmark.foreignAddress     thrpt    4  81907199.092 ±  2663269.652  ops/s
>>>> AccessBenchmark.foreignAddressRaw  thrpt    4  83629168.611 ±  1025857.535  ops/s
>>>> AccessBenchmark.target             thrpt    4  94023553.582 ±  6128411.421  ops/s
>>>>
>>>> https://github.com/openjdk/panama-foreign/pull/431
>>>>
>>>> Kind regards,
>>>> Rado

