RFR: 8358032: Use crypto pmull for CRC32(C) on Ampere CPU and improve for short inputs [v4]
Liming Liu
lliu at openjdk.org
Tue Jun 24 06:32:32 UTC 2025
On Mon, 23 Jun 2025 05:50:57 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Liming Liu has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Add the message for the assertions
>
> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 126:
>
>> 124: warning("CryptoPmullForCRC32LowLimit must be a multiple of 128");
>> 125: CryptoPmullForCRC32LowLimit = align_down(CryptoPmullForCRC32LowLimit, 128);
>> 126: }
>
> Can you describe somewhere why it has to be a multiple of `128`? Imagine someone comes across this later, and wonders if that is just some strange implementation limitation or something more fundamental, or something very subtle.
There are four kinds of loops, labeled CRC_by128_loop, CRC_by32_loop, CRC_by4_loop and CRC_by1_loop. If the flag is 266, which is 128x2+10, then for 265 bytes of input, 256 bytes are handled by CRC_by32_loop, while for 266 bytes of input, the corresponding 256 bytes are handled by CRC_by128_loop, and I think this causes inconsistency. If CRC_by32_loop handles 256 bytes better than CRC_by128_loop on a platform, it should be used for 266 bytes as well.
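To make the 265 vs. 266 example concrete, here is a rough sketch of how the input might be split across the four loops. It is only a simplified model of the description above, not the actual stub code: `Split`, `split_input` and `align_down_128` are hypothetical names, and the `len >= low_limit` comparison and the greedy 128/32/4/1 split are assumptions.

```cpp
#include <cstddef>

// Hypothetical model: number of bytes each loop would consume.
struct Split {
    std::size_t by128;  // bytes taken by CRC_by128_loop
    std::size_t by32;   // bytes taken by CRC_by32_loop
    std::size_t by4;    // bytes taken by CRC_by4_loop
    std::size_t by1;    // bytes taken by CRC_by1_loop
};

// Assumption: the 128-byte loop is entered only when len >= low_limit,
// and each loop greedily consumes whole chunks of its own size.
Split split_input(std::size_t len, std::size_t low_limit) {
    Split s{0, 0, 0, 0};
    if (len >= low_limit) {
        s.by128 = len / 128 * 128;
        len -= s.by128;
    }
    s.by32 = len / 32 * 32;
    len -= s.by32;
    s.by4 = len / 4 * 4;
    len -= s.by4;
    s.by1 = len;
    return s;
}

// Stand-in for align_down(value, 128) as applied to the flag.
std::size_t align_down_128(std::size_t value) {
    return value - value % 128;
}
```

In this model, with a limit of 266, `split_input(265, 266)` sends the leading 256 bytes to the by32 loop, while `split_input(266, 266)` sends the same 256 bytes to the by128 loop. After aligning the limit down to 256, both lengths take the by128 loop for those bytes.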
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25609#discussion_r2163069514