[aarch64-port-dev ] RFR: 8218748: AARCH64: String::compareTo intrinsic documentation and maintenance improvement

Patrick Zhang OS patrick at os.amperecomputing.com
Fri Nov 15 07:51:17 UTC 2019


Hi Dmitrij,

The inline documentation in this patch is a great help in understanding the string_compare stub code generation in depth, although I don't know why it was not pushed.
http://cr.openjdk.java.net/~dpochepk/8218748/webrev.02/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp.sdiff.html 

There is one point I don't quite understand; could you please clarify it? Thanks in advance!
The large loop with prefetching logic in the different_encoding function uses "cnt2 - prefetchLoopExitCondition" to decide whether the first iteration should be executed, while the same_encoding function does not do this. Why?

I had thought that "prefetching out of the string boundary" could be an invalid operation, but that seems to be a misunderstanding, doesn't it? The AArch64 ISA document says that the prfm instruction signals the memory system and expects that "preloading the cache line containing the specified address into one or more caches" would be done, in order to speed up the memory accesses when they do occur. If this is just a hint for the coming ldrs, and safe enough, could we stop restricting the remaining iterations (2 ~ n) with largeLoopExitCondition and instead use 64 for LL/UU and 128 for LU/UL? That way more iterations can stay in the large loop (with SoftwarePrefetchHintDistance=192, for example), for better performance. Any comments?

Thanks 

4327   address generate_compare_long_string_different_encoding(bool isLU) {
4377     if (SoftwarePrefetchHintDistance >= 0) {
4378       __ subs(rscratch2, cnt2, prefetchLoopExitCondition);
4379       __ br(__ LT, NO_PREFETCH);
4380       __ bind(LARGE_LOOP_PREFETCH);                  // 64-characters loop
... ...
4395           __ subs(rscratch2, cnt2, prefetchLoopExitCondition); // <-- could we use subs(rscratch2, cnt2, 128) instead?
4396           __ br(__ GE, LARGE_LOOP_PREFETCH);
4397     } // end of 64-characters loop

4616   address generate_compare_long_string_same_encoding(bool isLL) {
4637     if (SoftwarePrefetchHintDistance >= 0) {
4638       __ bind(LARGE_LOOP_PREFETCH);
4639         __ prfm(Address(str1, SoftwarePrefetchHintDistance));
4640         __ prfm(Address(str2, SoftwarePrefetchHintDistance));
4641         compare_string_16_bytes_same(DIFF, DIFF2);
4642         compare_string_16_bytes_same(DIFF, DIFF2);
4643         __ sub(cnt2, cnt2, 8 * characters_in_word);
4644         compare_string_16_bytes_same(DIFF, DIFF2);
4645         __ subs(rscratch2, cnt2, largeLoopExitCondition); // rscratch2 is not used. Use subs instead of cmp in case of potentially large constants // <-- could we use subs(rscratch2, cnt2, 64) instead?
4646         compare_string_16_bytes_same(DIFF, DIFF2);
4647         __ br(__ GT, LARGE_LOOP_PREFETCH);
4648         __ cbz(cnt2, LAST_CHECK);                    // no more loads left
4649     }
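
To make the question concrete, here is a minimal sketch (plain C++, not HotSpot code) of how the exit threshold changes the number of iterations that stay in the large loop. The function names, the `step` and `exit_threshold` parameters, and the sample values are illustrative assumptions; only the loop shapes (an unconditional first iteration with a GT back-branch for same_encoding, a guarded first iteration with a GE back-branch for different_encoding) are taken from the excerpts above.

```cpp
// Simplified model of the two large loops; counts are in abstract "elements".
// All names here are hypothetical and exist only for illustration.

// Shape of the same_encoding loop: the body runs once unconditionally, then
// repeats while (cnt - exit_threshold) > 0, mirroring subs + br(GT).
constexpr int same_encoding_iterations(int cnt, int step, int exit_threshold) {
  int iterations = 0;
  do {
    cnt -= step;
    ++iterations;
  } while (cnt - exit_threshold > 0);
  return iterations;
}

// Shape of the different_encoding loop: the first iteration is skipped when
// (cnt - exit_threshold) < 0, then the body repeats while the difference is
// >= 0, mirroring subs + br(LT) before the loop and subs + br(GE) at its end.
constexpr int different_encoding_iterations(int cnt, int step, int exit_threshold) {
  int iterations = 0;
  if (cnt - exit_threshold < 0) {
    return iterations;  // corresponds to branching to NO_PREFETCH
  }
  do {
    cnt -= step;
    ++iterations;
  } while (cnt - exit_threshold >= 0);
  return iterations;
}

// Sample values only: 1024 elements, 64 consumed per iteration.
static_assert(same_encoding_iterations(1024, 64, 192) == 13, "");
static_assert(same_encoding_iterations(1024, 64, 64) == 15, "");
static_assert(different_encoding_iterations(1024, 64, 192) == 14, "");
static_assert(different_encoding_iterations(1024, 64, 128) == 15, "");
```

With these sample values, lowering the threshold keeps one or two more iterations in the large loop. Whether that translates into a measurable gain on real hardware would still need benchmarking.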

Regards
Patrick

-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Dmitry Samersoff
Sent: Sunday, May 19, 2019 11:42 PM
To: Dmitrij Pochepko <dmitrij.pochepko at bell-sw.com>; Andrew Haley <aph at redhat.com>; Pengfei Li (Arm Technology China) <Pengfei.Li at arm.com>
Cc: hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR: 8218748: AARCH64: String::compareTo intrinsic documentation and maintenance improvement

Dmitrij,

The changes look good to me.

-Dmitry

On 25.02.2019 19:52, Dmitrij Pochepko wrote:
> Hi Andrew, Pengfei,
> 
> I created webrev.02 with all your suggestions implemented:
> 
> webrev: http://cr.openjdk.java.net/~dpochepk/8218748/webrev.02/
> 
> - comments are now both in separate section and inlined into code.
> - documentation mismatch mentioned by Pengfei is fixed:
> -- SHORT_LAST_INIT label name misprint changed to correct SHORT_LAST
> -- SHORT_LOOP_TAIL block is now merged with the last instruction.
> Documentation is updated accordingly
> - minor other changes to layout and wording
> 
> Newly developed tests were run as sanity and they passed.
> 
> Thanks,
> Dmitrij
> 
> On 22/02/2019 6:42 PM, Andrew Haley wrote:
>> On 2/22/19 10:31 AM, Pengfei Li (Arm Technology China) wrote:
>>
>>> So personally, I still prefer to inline the comments with the
>>> original code block to avoid this kind of inconsistency. It also
>>> makes it easier for us to review or maintain the code together with
>>> the doc, as we don't need to scroll back and forth. I don't see the
>>> benefit of making the code documentation a separate part. What's
>>> your opinion, Andrew Haley?
>> I agree with you. There's no harm having both inline and separate.
>>


