RFR: 8345120: A likely bug in StringSupport::chunkedStrlenShort

Fri Nov 29 09:04:39 UTC 2024

On Fri, 29 Nov 2024 07:55:25 GMT, Per Minborg <pminborg at openjdk.org> wrote:

> This PR proposes to rewrite the `StringSupport::chunkedStrlen*`  methods and move them to the class `SegmentBulkOperations` where other bulk operations reside.
> 
> This PR fixes a bug in the `short_strlen` variant for offsets that were odd (`offset % 2 != 0`).
> 
> This PR also improves performance on modern hardware, as there is no need for pre-looping alignment. Removing this improves performance by about 30% for larger strings.
> 
> Passes the `jdk_foreign` test suit.
> 
> Base:
> 
> 
> Benchmark                               (size)  Mode  Cnt    Score   Error  Units
> InternalStrLen.changedElementQuad            1  avgt   30    2.057 ? 0.012  ns/op
> InternalStrLen.changedElementQuad            4  avgt   30    3.776 ? 0.031  ns/op
> InternalStrLen.changedElementQuad           16  avgt   30    6.690 ? 0.060  ns/op
> InternalStrLen.changedElementQuad          251  avgt   30   48.581 ? 0.764  ns/op
> InternalStrLen.changedElementQuad         1024  avgt   30  196.188 ? 3.484  ns/op
> InternalStrLen.chunkedDouble                 1  avgt   30    1.903 ? 0.013  ns/op
> InternalStrLen.chunkedDouble                 4  avgt   30    3.446 ? 0.025  ns/op
> InternalStrLen.chunkedDouble                16  avgt   30    5.759 ? 0.062  ns/op
> InternalStrLen.chunkedDouble               251  avgt   30   26.892 ? 0.141  ns/op
> InternalStrLen.chunkedDouble              1024  avgt   30   72.940 ? 1.562  ns/op
> InternalStrLen.chunkedSingle                 1  avgt   30    1.897 ? 0.015  ns/op
> InternalStrLen.chunkedSingle                 4  avgt   30    5.357 ? 0.560  ns/op
> InternalStrLen.chunkedSingle                16  avgt   30    3.821 ? 0.052  ns/op
> InternalStrLen.chunkedSingle               251  avgt   30   19.482 ? 0.190  ns/op
> InternalStrLen.chunkedSingle              1024  avgt   30   38.938 ? 0.411  ns/op
> InternalStrLen.chunkedSingleMisaligned       1  avgt   30    2.230 ? 0.147  ns/op
> InternalStrLen.chunkedSingleMisaligned       4  avgt   30    5.424 ? 0.688  ns/op
> InternalStrLen.chunkedSingleMisaligned      16  avgt   30    9.573 ? 0.063  ns/op
> InternalStrLen.chunkedSingleMisaligned     251  avgt   30   22.242 ? 0.182  ns/op
> InternalStrLen.chunkedSingleMisaligned    1024  avgt   30   45.442 ? 0.252  ns/op
> InternalStrLen.elementByteMisaligned         1  avgt   30    1.616 ? 0.041  ns/op
> InternalStrLen.elementByteMisaligned         4  avgt   30    2.982 ? 0.018  ns/op
> InternalStrLen.elementByteMisaligned        16  avgt   30    8.662 ? 0.085  ns/op
> InternalStrLen.elementByteMisaligned       251  avgt   30 ...

src/java.base/share/classes/jdk/internal/foreign/SegmentBulkOperations.java line 400:

> 398:      */
> 399:     @ForceInline
> 400:     public static int strlenByte(final AbstractMemorySegmentImpl segment,

Can we put these back in `StringSupport`? It makes it hard to review the differences side-by-side. (We can maybe decide later whether to move). In general if we keep the `StringSupport` class it's a bit weird to have the `strlen` methods not be there -- but I agree there's a similarity between this and other bulk operations.

src/java.base/share/classes/jdk/internal/foreign/SegmentBulkOperations.java line 504:

> 502: 
> 503: 
> 504: /*    // The gain of using `long` wide operations for `int` is lower than for the two other

Leftover?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22451#discussion_r1863181620
PR Review Comment: https://git.openjdk.org/jdk/pull/22451#discussion_r1863176170