[aarch64-port-dev ] RFR: JDK-8203041 : AArch64: fix overflow in immediate cmp/cmn instruction
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Fri May 18 11:15:31 UTC 2018
On 18.05.2018 11:41, Andrew Haley wrote:
> On 05/18/2018 04:03 AM, Joshua Zhu wrote:
>>
>> JVM crashed with "-XX:SoftwarePrefetchHintDistance=32760" option.
> It didn't crash: it was an assertion failure.
>
>> This causes overflow in immediate operand for jtreg TestOptionsWithRanges.
>>
>> See JBS for the backtrace.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203041
>>
>> Webrev: http://cr.openjdk.java.net/~zyao/8203041/
> Thank you.
>
> This patch is problematic in several ways. The type of
> prefetchLoopThreshold is int, so it's 32 bits. Casting it (via the
> call to cmp()) to unsigned int is probably wrong too.
not quite that. prefetchLoopThreshold is calculated as
(SoftwarePrefetchHintDistance + 32), where SoftwarePrefetchHintDistance
is in [0, 32768] range by design here, so, it's always within 32 bits range.
First cmp argument is cnt1, which is amount of array elements left to
check, which is also always positive 32bit integer. Basically, this cmp
can be changed to cmpw, but it basically won't change anything for CPU.
>
> rscratch1 isn't available for use in a compare instruction because we
> sometimes say cmp(rscratch1, imm). I suspect it'd be better to use a
> subsw in generate_large_array_equals() if you can find a spare
> register for the destination.
according to spec, cmp is an alias for subs when destination register is
ZR, so it won't change anything
>
> There are several other coding problems in
> generate_large_array_equals(), in particular the use of code like
>
> Register tmp1 = rscratch1
>
> This creates a hidden alias for rscratch1, but rscratch1 is used by
> assembler macros. This is very dangerous. I'll have a look some
> more.
>
This was done for unification purposes, to have all temporary registers
named as "tmp*". "rscratch1" name is not used in this intrinsic code to
avoid confusion. Also, assert for different registers is present to
prevent few possible kinds of problems in future. Surely, this renaming
can be removed if you think it'll make code look better.
Overall, I think it should be fixed by moving calculated immediate into
register and then perform cmp/subs between 2 registers. For example, we
can use tmp3. Then instead of:
__ cmp(cnt1, prefetchLoopThreshold); (in 2 places)
code will look like:
__ mov(tmp3, prefetchLoopThreshold); // separate move to handle
situation when prefetchLoopThreshold is not in 12 bit range
__ cmp(cnt1, tmp3);
Thanks,
Dmitrij
More information about the aarch64-port-dev
mailing list