[aarch64-port-dev ] RFR: JDK-8203041 : AArch64: fix overflow in immediate cmp/cmn instruction

Fri May 18 11:15:31 UTC 2018

On 18.05.2018 11:41, Andrew Haley wrote:
> On 05/18/2018 04:03 AM, Joshua Zhu wrote:
>>
>> JVM crashed with "-XX:SoftwarePrefetchHintDistance=32760" option.
> It didn't crash: it was an assertion failure.
>
>> This causes overflow in immediate operand for jtreg TestOptionsWithRanges.
>>
>> See JBS for the backtrace.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203041
>>
>> Webrev: http://cr.openjdk.java.net/~zyao/8203041/
> Thank you.
>
> This patch is problematic in several ways.  The type of
> prefetchLoopThreshold is int, so it's 32 bits.  Casting it (via the
> call to cmp()) to unsigned int is probably wrong too.
not quite that. prefetchLoopThreshold is calculated as 
(SoftwarePrefetchHintDistance + 32), where SoftwarePrefetchHintDistance 
is in [0, 32768] range by design here, so, it's always within 32 bits range.
First cmp argument is cnt1, which is amount of array elements left to 
check, which is also always positive 32bit integer. Basically, this cmp 
can be changed to cmpw, but it basically won't change anything for CPU.
>
> rscratch1 isn't available for use in a compare instruction because we
> sometimes say cmp(rscratch1, imm).  I suspect it'd be better to use a
> subsw in generate_large_array_equals() if you can find a spare
> register for the destination.
according to spec, cmp is an alias for subs when destination register is 
ZR, so it won't change anything
>
> There are several other coding problems in
> generate_large_array_equals(), in particular the use of code like
>
>    Register tmp1 = rscratch1
>
> This creates a hidden alias for rscratch1, but rscratch1 is used by
> assembler macros.  This is very dangerous.  I'll have a look some
> more.
>
This was done for unification purposes, to have all temporary registers 
named as "tmp*". "rscratch1" name is not used in this intrinsic code to 
avoid confusion. Also, assert for different registers is present to 
prevent few possible kinds of problems in future. Surely, this renaming 
can be removed if you think it'll make code look better.

Overall, I think it should be fixed by moving calculated immediate into 
register and then perform cmp/subs between 2 registers. For example, we 
can use tmp3. Then instead of:

__ cmp(cnt1, prefetchLoopThreshold);      (in 2 places)

code will look like:

__ mov(tmp3, prefetchLoopThreshold); // separate move to handle 
situation when prefetchLoopThreshold is not in 12 bit range
__ cmp(cnt1, tmp3);

Thanks,
Dmitrij