RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

Scott Gibbons sgibbons at openjdk.org
Wed May 22 14:53:31 UTC 2024


On Wed, 15 May 2024 19:18:27 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Rearrange; add lambdas for clarity
>
> test/jdk/java/lang/StringBuffer/IndexOf.java line 54:
> 
>> 52:   // for (int i = 1; i < 128; i++) {
>> 53:   //   haystack_16[i] = (char) (i);
>> 54:   // }
> 
> dead code

Removed.

> test/jdk/java/lang/StringBuffer/IndexOf.java line 83:
> 
>> 81:   shs = "$&),,18+-!'8)+";
>> 82:   endNeedle = "8)-";
>> 83:   l_offset = 9;
> 
> dead code

Fixed.

> test/jdk/java/lang/StringBuffer/IndexOf.java line 237:
> 
>> 235:             + sourceBuffer.toString() + " len Buffer = " + sourceBuffer.toString().length());
>> 236:         System.err.println("  naive = " + naiveFind(sourceBuffer.toString(), targetString, 0) + ", IndexOf = "
>> 237:             + sourceBuffer.indexOf(targetString));
> 
> More tracing left behind here and rest of this function (original just recorded failure and moved along)

I think it's more valuable for a test to print out what it can when a failure occurs rather than just saying "failed".

> test/jdk/java/lang/StringBuffer/IndexOf.java line 284:
> 
>> 282: 
>> 283:   // Note: it is possible although highly improbable that failCount will
>> 284:   // be > 0 even if everthing is working ok
> 
> This sounds like either a bug or a testcase bug? Same as line 301, `extremely remote possibility of > 1 match`?

This was there from the original author.  I think they were trying to infer that a match could occur in the rare case that the same random string was produced.  They're random after all, and there's no reason the same sequence could be generated.

> test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 81:
> 
>> 79:     lateMatchString16 = dataStringHuge16.substring(dataStringHuge16.length() - 31);
>> 80: 
>> 81:     searchString = "oscar";
> 
> Would had liked to see a few more small needles (i.e. to test/verify individual switch statement cases)

I'm hoping we can incorporate your test to cover more cases :-).

> test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 132:
> 
>> 130:   @Benchmark
>> 131:   public int searchHugeLargeSubstring() {
>> 132:       return dataStringHuge.indexOf("B".repeat(30) + "X" + "A".repeat(30), 74);
> 
> .repeat() call and string concatenation shouldn't be part of the benchmark (here and several other @Benchmark functions in this file) since it will detract from the measurement.
> 
> (String concatenation gets converted (by javac) into StringBuilder().append().append()....append().toString())

Since we're only concerned with the delta of performance, does this really matter?  Can you suggest an alternative?

> test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 242:
> 
>> 240:   @Benchmark
>> 241:   public int search16HugeLargeSubstring16() {
>> 242:     return dataStringHuge16.indexOf("B".repeat(30) + "X" + "A".repeat(30), 74);
> 
> `search16HugeLargeSubstring16` implies UU, but with `"B".repeat(30) + "X" + "A".repeat(30)` is UL

Fixed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610131285
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610134566
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610138116
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610142104
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610130140
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610126743
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610128630


More information about the core-libs-dev mailing list