RFR: AARCH64: optimize string compare intrinsic

Tue May 8 22:36:57 UTC 2018

> On May 8, 2018, at 6:31 AM, Dmitrij Pochepko <dmitrij.pochepko at bell-sw.com> wrote:
> 
> 
> 
> On 04.05.2018 16:46, Andrew Haley wrote:
>> On 05/04/2018 02:24 PM, Dmitrij Pochepko wrote:
>>> Do you suggest changing vectorizedMismatch from a generic single entry
>>> point to 4 versions (1-, 2-, 4- and 8-byte), each optimized for its
>>> respective size (and possibly re-using code generation logic)? Then it
>>> can indeed be re-used for same-encoded strings without penalties, but it
>>> requires changes in jdk.internal.util.ArraysSupport.java
>> I don't see why it's absolutely necessary.  On the other hand, it might
>> be an excellent idea to have a switch statement in the Java code which
>> will almost always be optimized away.  It's worth trying.
>> 
> As this is a separate, possibly multiplatform effort which potentially affects common code and the x86 platform intrinsic implementation, I created a separate enhancement for this issue: https://bugs.openjdk.java.net/browse/JDK-8202783
> 

Thank you (I cannot look at the issue right now as JBS is down for maintenance).

I don’t understand why you need to convert vectorizedMismatch to 4 versions. The stub is generated using the most optimal vector instructions for the platform (on x86). A single version should suffice, with the surrounding Java code detecting thresholds and managing the tail elements. See the wrapping Java methods in jdk.internal.util.ArraysSupport and similar methods in java.nio.BufferMismatch. (Note that fixed thresholds are used, and I did not do any measurements on platforms with larger vector sizes to determine whether the thresholds should be adjusted, but the vectorizedMismatch implementation will use smaller vector sizes if need be.)
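
To illustrate the wrapping pattern, here is a rough, self-contained sketch rather than the actual JDK code. The names MismatchSketch, wideMismatch and THRESHOLD are hypothetical, and wideMismatch is a plain-Java stand-in for the intrinsic stub that compares 8 bytes at a time. It assumes the contract that the bulk routine returns either the index of the first mismatch or the bitwise complement of the number of tail elements left unchecked, and that length does not exceed either array's length.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MismatchSketch {
    // Hypothetical fixed threshold; below it the scalar loop does all the work.
    static final int THRESHOLD = 8;

    // Stand-in for the single bulk entry point (not the real intrinsic).
    // Returns the index of the first mismatching byte, or the bitwise
    // complement of the number of tail bytes that were not checked.
    static int wideMismatch(byte[] a, byte[] b, int length) {
        ByteBuffer ba = ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN);
        ByteBuffer bb = ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN);
        int i = 0;
        for (; i + 8 <= length; i += 8) {
            long diff = ba.getLong(i) ^ bb.getLong(i);
            if (diff != 0) {
                // Little-endian: the lowest set bit belongs to the first differing byte.
                return i + (Long.numberOfTrailingZeros(diff) >>> 3);
            }
        }
        return ~(length - i);
    }

    // Wrapper: threshold check, bulk compare, then a scalar loop over the tail.
    static int mismatch(byte[] a, byte[] b, int length) {
        int i = 0;
        if (length >= THRESHOLD) {
            int r = wideMismatch(a, b, length);
            if (r >= 0) {
                return r;
            }
            i = length - ~r; // resume at the first unchecked tail element
        }
        for (; i < length; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return -1; // no mismatch in the first 'length' elements
    }
}
```

With this shape only the wrapper needs to know about element sizes and thresholds; the bulk routine stays a single entry point.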

Paul.

> Let me know if you have any comments on the patch.
> 
> Thanks,
> Dmitrij