Performance improvement to unchecked segment ofNativeRestricted
Radosław Smogura
mail at smogura.eu
Thu Jan 7 22:09:21 UTC 2021
Hi all,
Thanks for the feedback; it's good to hear this is a good direction.
In fact, my first change included a global scope, as Maurizio suggested. I added the null scope because I had the impression that I was constantly hitting this check:
    private int getIntUnalignedInternal(Scope scope, Object base, long offset, boolean be) {
        /* ... */
        if (scope != null) {
            scope.checkValidState();
        }
        /* ... */
    }
I could not understand why HotSpot keeps the scope != null check while inlining the empty method (the 'else' part looked more like a jump to a trap). I have now tried to use IGV to understand why, rebased one commit back, and it's OK...
I'll update the branch, back up the original one, take a closer look, clean it up, and come back later.
Thanks again for the checks, and sorry for the unnecessary bother.
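To make the alternative concrete, here is a minimal sketch of the always-open scope idea, using hypothetical names (ScopeSketch, GlobalScope, ConfinedScope, getInt) that are not taken from the panama-foreign sources. It only shows the shape of the design: the accessor calls checkValidState() unconditionally, and the global scope is a stateless singleton held in a static final field. Whether C2 actually removes the call in a given build is an assumption that would still need to be checked, for example with IGV.

```java
// Illustrative sketch only; all names here are hypothetical.
public class ScopeSketch {

    /** Minimal stand-in for a memory scope. */
    interface Scope {
        void checkValidState();
    }

    /** Stateless scope that is always open; its check does nothing. */
    static final class GlobalScope implements Scope {
        // Held in a static final field so the JIT can treat it as a constant.
        static final GlobalScope INSTANCE = new GlobalScope();

        private GlobalScope() {}

        @Override
        public void checkValidState() {
            // Intentionally empty: this scope can never be closed.
        }
    }

    /** Ordinary scope that can be closed and then rejects access. */
    static final class ConfinedScope implements Scope {
        private volatile boolean closed;

        void close() {
            closed = true;
        }

        @Override
        public void checkValidState() {
            if (closed) {
                throw new IllegalStateException("Scope is closed");
            }
        }
    }

    /** Accessor with no null check: every segment has a real scope. */
    static int getInt(Scope scope, int[] base, int index) {
        scope.checkValidState();
        return base[index];
    }
}
```

With this shape there is no null scope to guard against, so the other callers of checkValidState need no special handling either.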
Kind regards,
Rado
> I think the fundamental idea is solid, and I agree there's room to
> improve performance here, but as Ty notes, having a NULL scope might be
> problematic. I think it would perhaps be better to have a
> constant/stateless, always open scope, whose checkValidState method does
> nothing? If that scope is stored in a final static constant, I believe
> C2 will not have many problems in eliminating the overhead of the scope
> check.
>
> Maurizio
>
> On 06/01/2021 01:59, Ty Young wrote:
>>
>> You may also need to override checkValidState from
>> AbstractMemorySegmentImpl in EverythingSegment. checkValidState calls
>> the scope's checkValidState and since scope is null, you'll get a NPE.
>> The list of methods that call checkValidState include:
>>
>>
>> spliterator
>>
>> mismatch
>>
>> address
>>
>> handoff
>>
>> share
>>
>> registerCleaner
>>
>> close
>>
>>
>> In short, if you do basically anything but simply read/write via
>> VarHandle, including native library usage which uses address(), you're
>> going to get a NPE.
>>
>>
>>
>> On 1/5/21 4:32 PM, Radosław Smogura wrote:
>>> Hi all,
>>>
>>> I hope you have a good day.
>>>
>>> Here I would like to present some changes to increase the performance
>>> of ofNativeRestricted. In my benchmarks, where I tried to simulate
>>> access from code, it outpaced access to a Java array (as intended). As
>>> it looks like the pull request flow has changed, I have to sign the OCA
>>> (and if this change is fine I would be happy to do this).
>>>
>>> Below please find benchmark results and link to "pending PR" / branch
>>>
>>> The results outpaced the Java array access.
>>>
>>> Benchmark                          Mode   Cnt         Score          Error  Units
>>> AccessBenchmark.foreignAddress     thrpt    4  86860188.499 ± 13454393.406  ops/s
>>> AccessBenchmark.foreignAddressRaw  thrpt    4  96150181.668 ±  7025145.700  ops/s
>>> AccessBenchmark.target             thrpt    4  93673099.539 ± 23272596.145  ops/s
>>>
>>> versus tests on original repo
>>>
>>> Benchmark                          Mode   Cnt         Score          Error  Units
>>> AccessBenchmark.foreignAddress     thrpt    4  81907199.092 ±  2663269.652  ops/s
>>> AccessBenchmark.foreignAddressRaw  thrpt    4  83629168.611 ±  1025857.535  ops/s
>>> AccessBenchmark.target             thrpt    4  94023553.582 ±  6128411.421  ops/s
>>>
>>> https://github.com/openjdk/panama-foreign/pull/431
>>>
>>> Kind regards,
>>> Rado