RFR: 8294432: Add provisions to calculate hash values from MemorySegments

Wed Nov 27 11:27:38 UTC 2024

On Mon, 25 Nov 2024 15:27:13 GMT, Per Minborg <pminborg at openjdk.org> wrote:

> This PR proposes adding a _JDK-internal_ method for calculating hash codes for content in a `MemorySegment`.
> 
> The internal method uses a polynomial 32-bit hash function equivalent to `Arrays::hashCode`. For larger regions, the new method is almost twice as fast as naïvely iterating over individual bytes. It is also leaner on inlining space than a naïve loop.
> 
> 
> 
> Benchmark                          (ELEM_SIZE)  Mode  Cnt   Score   Error  Units
> SegmentBulkHash.array                        8  avgt   30   2.645 ± 0.078  ns/op
> SegmentBulkHash.array                       64  avgt   30   6.062 ± 0.171  ns/op
> SegmentBulkHash.heapSegment                  8  avgt   30   4.181 ± 0.145  ns/op
> SegmentBulkHash.heapSegment                 64  avgt   30  25.716 ± 1.043  ns/op
> SegmentBulkHash.nativeSegment                8  avgt   30   3.939 ± 0.150  ns/op
> SegmentBulkHash.nativeSegment               64  avgt   30  23.262 ± 0.694  ns/op
> SegmentBulkHash.nativeSegmentJava            8  avgt   30   5.219 ± 0.183  ns/op    <- Naïve iteration
> SegmentBulkHash.nativeSegmentJava           64  avgt   30  39.668 ± 1.040  ns/op    <- Naïve iteration
> 
> 
> ![image](https://github.com/user-attachments/assets/5646cf21-b202-4dce-9555-e460f9df4cb6)
> 
> 
> If internal JDK code uses this method, it will automatically benefit from future performance improvements that can be implemented once the Vector API becomes available.
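
For readers following along, the "naïve iteration" baseline can be sketched roughly as below. This is only an illustration, not the code in the PR: the class and method names (`SegmentHashSketch`, `naiveHash`) are made up, and it assumes a JDK where `java.lang.foreign` is final (22 or later). It computes the same polynomial as `Arrays.hashCode(byte[])`, one byte at a time.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical helper, not part of the JDK or of this PR.
final class SegmentHashSketch {

    // Polynomial hash h = 31 * h + b over every byte of the segment,
    // starting from 1, which is the same recurrence Arrays.hashCode(byte[]) uses.
    static int naiveHash(MemorySegment segment) {
        int h = 1;
        for (long i = 0; i < segment.byteSize(); i++) {
            h = 31 * h + segment.get(ValueLayout.JAVA_BYTE, i);
        }
        return h;
    }
}
```

For a heap segment created with `MemorySegment.ofArray(bytes)`, the result should match `Arrays.hashCode(bytes)`.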

Nice work. Let's see how often this is used and then evaluate whether to promote it to public API. Let's also investigate ways to make the autovectorizer kick in for such code shapes without having to manually unroll :-)
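
To make "manually unroll" concrete, here is a rough sketch of the kind of code shape meant. It is an assumption-laden illustration and not the implementation in the PR: the names (`UnrolledHashSketch`, `unrolledHash`) are invented, and it works on a plain `byte[]` to stay independent of the FFM API. Four steps of `h = 31 * h + b` are folded into one expression using precomputed powers of 31, so `h` is updated once per four elements instead of once per element.

```java
// Hypothetical helper, not part of the JDK or of this PR.
final class UnrolledHashSketch {

    // Same result as Arrays.hashCode(byte[]) for non-null input,
    // but processes four elements per loop iteration.
    static int unrolledHash(byte[] a) {
        int h = 1;
        int i = 0;
        int limit = a.length & ~3; // largest multiple of 4 that is <= a.length
        for (; i < limit; i += 4) {
            h = 923521 * h          // 31^4
              + 29791 * a[i]        // 31^3
              + 961 * a[i + 1]      // 31^2
              + 31 * a[i + 2]
              + a[i + 3];
        }
        // Tail: handle the remaining 0..3 elements one at a time.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }
}
```

Because `int` arithmetic wraps modulo 2^32, the folded expression yields the same value as four single steps. Ideally the autovectorizer would derive something like this from the simple loop on its own.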

-------------

Marked as reviewed by mcimadamore (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22364#pullrequestreview-2464745357