[aarch64-port-dev ] RFR: 8218748: AARCH64: String::compareTo intrinsic documentation and maintenance improvement
Patrick Zhang OS
patrick at os.amperecomputing.com
Fri Nov 15 07:51:17 UTC 2019
Hi Dmitrij,
The inline documentation in this patch helped me understand the string_compare stub code generation in depth, although I don't know why it was never pushed.
http://cr.openjdk.java.net/~dpochepk/8218748/webrev.02/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp.sdiff.html
There is one point I don't quite understand. Could you please clarify it? Thanks in advance!
In the large loop with prefetching logic, the different_encoding function uses "cnt2 - prefetchLoopExitCondition" to decide whether the first iteration should be executed, while same_encoding does not. Why?
I was thinking that prefetching beyond the string boundary could be an invalid operation, but that seems to be a misunderstanding on my part, doesn't it? The AArch64 ISA document says that the prfm instruction signals the memory system, with the expected effect of "preloading the cache line containing the specified address into one or more caches", in order to speed up the memory accesses when they do occur. If this is just a hint for the coming ldrs and is safe enough, could we stop restricting the remaining iterations (2 ~ n) with largeLoopExitCondition and instead use 64 for LL/UU and 128 for LU/UL? That way more iterations could stay in the large loop (with SoftwarePrefetchHintDistance=192, for example), for better performance. Any comments?
Thanks
address generate_compare_long_string_different_encoding(bool isLU) {
  ... ...
  if (SoftwarePrefetchHintDistance >= 0) {
    __ subs(rscratch2, cnt2, prefetchLoopExitCondition);
    __ br(__ LT, NO_PREFETCH);
    __ bind(LARGE_LOOP_PREFETCH); // 64-characters loop
      ... ...
      __ subs(rscratch2, cnt2, prefetchLoopExitCondition); // <-- could we use subs(rscratch2, cnt2, 128) instead?
      __ br(__ GE, LARGE_LOOP_PREFETCH);
  } // end of 64-characters loop
address generate_compare_long_string_same_encoding(bool isLL) {
  ... ...
  if (SoftwarePrefetchHintDistance >= 0) {
    __ bind(LARGE_LOOP_PREFETCH);
      __ prfm(Address(str1, SoftwarePrefetchHintDistance));
      __ prfm(Address(str2, SoftwarePrefetchHintDistance));
      compare_string_16_bytes_same(DIFF, DIFF2);
      compare_string_16_bytes_same(DIFF, DIFF2);
      __ sub(cnt2, cnt2, 8 * characters_in_word);
      compare_string_16_bytes_same(DIFF, DIFF2);
      // rscratch2 is not used. Use subs instead of cmp in case of potentially large constants
      // <-- could we use subs(rscratch2, cnt2, 64) instead?
      __ subs(rscratch2, cnt2, largeLoopExitCondition);
      compare_string_16_bytes_same(DIFF, DIFF2);
      __ br(__ GT, LARGE_LOOP_PREFETCH);
    __ cbz(cnt2, LAST_CHECK); // no more loads left
  }
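To make the question concrete, here is a rough model of the control flow in the two excerpts above, written as plain C++ rather than stub-generator code. It only counts how many times each prefetching loop body would run for a given exit constant. The function names, the parameters, and the assumption that the elided part of the different_encoding loop decrements the counter by a fixed step are illustrative only and are not taken from the patch. The model says nothing about whether the extra iterations are safe.

```cpp
#include <cassert>

// Hypothetical model, not HotSpot code.
// remaining: units left to compare on entry (the role cnt2 plays in the stub)
// step:      units consumed per iteration (assumed constant)
// exitCond:  constant subtracted from the counter by the trailing subs

// Shape of the same_encoding loop: no check before the first iteration,
// back branch taken while (remaining - exitCond) > 0, i.e. br(GT, ...).
static int same_encoding_iterations(long remaining, long step, long exitCond) {
  assert(step > 0);
  int iterations = 0;
  do {
    remaining -= step;  // models sub(cnt2, cnt2, 8 * characters_in_word)
    ++iterations;
  } while (remaining - exitCond > 0);
  return iterations;
}

// Shape of the different_encoding loop: the first iteration is guarded,
// back branch taken while (remaining - exitCond) >= 0, i.e. br(GE, ...).
static int different_encoding_iterations(long remaining, long step, long exitCond) {
  assert(step > 0);
  if (remaining - exitCond < 0) {
    return 0;           // models subs + br(LT, NO_PREFETCH)
  }
  int iterations = 0;
  do {
    remaining -= step;  // assumed: the elided loop body decrements by step
    ++iterations;
  } while (remaining - exitCond >= 0);
  return iterations;
}
```

Under these assumptions, same_encoding_iterations(512, 64, 192) returns 5, while same_encoding_iterations(512, 64, 64) returns 7, which is the "more iterations stay in the large loop" effect asked about. The unguarded shape always runs at least once: same_encoding_iterations(100, 64, 192) returns 1, whereas different_encoding_iterations(100, 64, 128) returns 0.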
Regards
Patrick
-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Dmitry Samersoff
Sent: Sunday, May 19, 2019 11:42 PM
To: Dmitrij Pochepko <dmitrij.pochepko at bell-sw.com>; Andrew Haley <aph at redhat.com>; Pengfei Li (Arm Technology China) <Pengfei.Li at arm.com>
Cc: hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR: 8218748: AARCH64: String::compareTo intrinsic documentation and maintenance improvement
Dmitrij,
The changes look good to me.
-Dmitry
On 25.02.2019 19:52, Dmitrij Pochepko wrote:
> Hi Andrew, Pengfei,
>
> I created webrev.02 with all your suggestions implemented:
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8218748/webrev.02/
>
> - comments are now both in separate section and inlined into code.
> - documentation mismatch mentioned by Pengfei is fixed:
> -- the misprinted label name SHORT_LAST_INIT is changed to the correct SHORT_LAST
> -- the SHORT_LOOP_TAIL block is now merged with the last instruction.
> The documentation is updated accordingly
> - minor other changes to layout and wording
>
> Newly developed tests were run as a sanity check and they passed.
>
> Thanks,
> Dmitrij
>
> On 22/02/2019 6:42 PM, Andrew Haley wrote:
>> On 2/22/19 10:31 AM, Pengfei Li (Arm Technology China) wrote:
>>
>>> So personally, I still prefer to inline the comments with the
>>> original code block to avoid this kind of inconsistency. It also
>>> makes it easier for us to review or maintain the code together with
>>> the doc, as we don't need to scroll back and forth. I don't see the
>>> benefit of keeping the code documentation as a separate part. What's
>>> your opinion, Andrew Haley?
>> I agree with you. There's no harm having both inline and separate.
>>