RFR: 8320448: Accelerate IndexOf using AVX2 [v37]

Scott Gibbons sgibbons at openjdk.org
Fri May 24 20:47:24 UTC 2024


On Fri, 24 May 2024 19:30:54 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   mov64 => lea(InternalAddress)
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4633:
> 
>> 4631:     andl(result, 0x0000000f);  //   tail count (in bytes)
>> 4632:     andl(limit, 0xfffffff0);   // vector count (in bytes)
>> 4633:     jcc(Assembler::zero, COMPARE_TAIL);
> 
> In the `expand_ary2` case, this is the same andl/compare as line 4549; i.e. I think you can just put `jcc(Assembler::zero, COMPARE_TAIL);` on line 4549, inside the if (and move the other jcc into the else branch)?

OK.  Shortens pathlength by 4 instructions.

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4639:
> 
>> 4637:     negptr(limit);
>> 4638: 
>> 4639:     bind(COMPARE_WIDE_VECTORS_16);
> 
> Understanding-check.. this loop will execute at most 2 times, right?
> 
> i.e. process as many 32-byte chunks as possible, then 1-or-2 16-byte chunks then byte-by-byte?
> 
> (Still a good optimization, just trying to understand the scope)

Yes.

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4718:
> 
>> 4716:     jmp(TRUE_LABEL);
>> 4717:   } else {
>> 4718:     movl(chr, Address(ary1, limit, scaleFactor));
> 
> scaleFactor is always Address::times_1 here (expand_ary2==false), might be clearer to change it back

*Sigh*.  Changing it back.

> test/jdk/java/lang/StringBuffer/ECoreIndexOf.java line 57:
> 
>> 55: 
>> 56:     generator = new Random();
>> 57:     long seed = generator.nextLong();//-5291521104060046276L;
> 
> dead code

Fixed

> test/jdk/java/lang/StringBuffer/ECoreIndexOf.java line 63:
> 
>> 61:     ///////////////////////////  WARM-UP //////////////////////////
>> 62: 
>> 63:     for (int i = 0; i < 20000; i++) {
> 
> -Xcomp should be more deterministic (and quicker) way to reach the intrinsic (i.e. like the other tests)
> 
> On other hand, perhaps this doesn't matter? @vnkozlov Understanding-check please.. these tests will run as part of every build from this point-till-infinity; Therefore, long test will affect every openjdk developer. But if this test is not run on every build, then the build-time does not matter, then this test can run for as long as it 'wants'.

This test runs in well under 2 minutes.  I'm not sure what is trying to be accomplished?

> test/jdk/java/lang/StringBuffer/ECoreIndexOf.java line 160:
> 
>> 158:   }
>> 159: 
>> 160:   private static String generateTestString(int min, int max) {
> 
> I see you have various `Charset[] charSets` above, but this function still only generates LL. Are those separate tests? Or am I missing some concatenation somewhere that will convert the generated string string to the correct encoding?
> 
> You could had implemented my suggestion from IndexOf.generateTestString here instead, so that the tests that do call this function endup with multiple encodings; i.e. similar to what you already do in the next function.
> 
> I suppose, with addition of String/IndexOf.java that is a moot point.

Yes, I think it's a moot point.  Thanks.

> test/jdk/java/lang/StringBuffer/ECoreIndexOf.java line 185:
> 
>> 183:   }
>> 184: 
>> 185:   private static int indexOfKernel(String haystack, String needle) {
> 
> Is the intention of kernels not to be inlined so that it would be part of separate compilation?
> 
> If so, you probably want to annotate it with `@CompilerControl(CompilerControl.Mode.DONT_INLINE)`
> 
> i.e. https://github.com/openjdk/jmh/blob/master/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_16_CompilerControl.java

Fixed.

> test/jdk/java/lang/StringBuffer/ECoreIndexOf.java line 539:
> 
>> 537:     failCount = indexOfKernel("", "");
>> 538: 
>> 539:     for (int x = 0; x < 1000000; x++) {
> 
> Should we be concerned about the increased run-time? Or does this execute 'quickly enough'

Runs in well under 2 minutes.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1613997645
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1613993657
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1613998432
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1614000081
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1614000885
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1614001480
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1614002801
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1614003072


More information about the core-libs-dev mailing list