RFR: 8343502: RISC-V: SIGBUS in updateBytesCRC32 after JDK-8339738

Hamlin Li mli at openjdk.org
Mon Nov 4 13:27:28 UTC 2024


On Mon, 4 Nov 2024 03:48:48 GMT, Fei Yang <fyang at openjdk.org> wrote:

> Hi, please review this small change.
> 
> [JDK-8339738](https://bugs.openjdk.org/browse/JDK-8339738) adds vectorization for crc32 intrinsic, which does `vle32.v` from the input byte buffer and calculates the checksum. But the input byte buffer could be misaligned (not 4-byte aligned). This leads to SIGBUS on hardware platforms like `BPI-F3` board where misaligned vector loads are not supported. Similar issue is there for scalar version as well, which could mean performance issue on other hardwares. Patch fixes this issue by adding a small alignment processing on entry for both scalar and vector version.
> 
> This also fixes another potential issue in tail handling, where we do a single `lwu` for both versions to load the remaining bytes from the input byte buffer and extract each byte from the loaded 32-bit value. Since we only have less than 4 bytes remaining, this `lwu` would exceed the buffer limit, which I think is not safe. Patch fixes this issue by doing three separate `lbu` instead.
> 
> Testing on `BPI-F3` with RVV 1.0 extension:
> - [x] test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java
> - [x] test/jdk/java/util/zip/TestCRC32.java
> - [x] SPECjbb2015
> 
> No obvious impact witnessed on `micro:java.util.TestCRC32`:
> 
> 
> Before:
> Benchmark                    (count)   Mode  Cnt     Score    Error   Units
> TestCRC32.testCRC32Update         64  thrpt   12  4778.903 ±  1.793  ops/ms
> TestCRC32.testCRC32Update        128  thrpt   12  2655.639 ±  2.958  ops/ms
> TestCRC32.testCRC32Update        256  thrpt   12  1430.997 ±  0.970  ops/ms
> TestCRC32.testCRC32Update        512  thrpt   12   965.785 ±  1.840  ops/ms
> TestCRC32.testCRC32Update       2048  thrpt   12   303.056 ±  0.620  ops/ms
> TestCRC32.testCRC32Update      16384  thrpt   12    40.601 ±  0.220  ops/ms
> TestCRC32.testCRC32Update      65536  thrpt   12     9.575 ±  0.045  ops/ms
> TestCRC32C.testCRC32CUpdate       64  thrpt   12  3923.698 ± 23.209  ops/ms
> TestCRC32C.testCRC32CUpdate      128  thrpt   12  2514.616 ± 22.991  ops/ms
> TestCRC32C.testCRC32CUpdate      256  thrpt   12  1477.223 ±  2.319  ops/ms
> TestCRC32C.testCRC32CUpdate      512  thrpt   12   806.179 ±  1.961  ops/ms
> TestCRC32C.testCRC32CUpdate     2048  thrpt   12   216.396 ±  0.172  ops/ms
> TestCRC32C.testCRC32CUpdate    16384  thrpt   12    27.526 ±  0.049  ops/ms
> TestCRC32C.testCRC32CUpdate    65536  thrpt   12     6.530 ±  0.041  ops/ms
> 
> After:
> Benchmark                    (count)   Mode  Cnt     Score    Error   Units
> TestCRC32.testCRC32Update         64 ...

Does the non-alignment data cause any similar issue with other intrinsics?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21863#issuecomment-2454710709


More information about the hotspot-dev mailing list