RFC: Untangle native libraries and the JVM: SVML, SLEEF, and libsimdsort

Wed Dec 11 18:29:30 UTC 2024

> On Dec 10, 2024, at 4:54 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> 
> 
> On 12/9/24 07:55, Paul Sandoz wrote:
>> Some further observations.
>> - This arguably makes it harder for the auto-vectorizer to access the SVML/SLEEF functionality. However, in some cases, we cannot provide the same guarantees (IIRC mainly around monotonicity) as the scalar operations in Math.
> 
> I'm not too optimistic about auto-vectorization unless the very same stubs are shared between scalar and vectorized code. Our previous experience with FP operations strongly indicates that users expect FP operations to give reproducible results (bitwise equivalent) across the same run.
> 
> Moreover, migration to FFI enables usage of SVML/SLEEF across all execution modes which should make it easier to reason about Vector API usages.
> 

Agreed.

>> - There is an open bug to adjust the simd sort behavior on AMD zen 4 cores due to poor performance of an AVX 512 instruction. The simplest solution is to fall back to AVX2. That may be simpler to manage in Java? (I was looking at the HotSpot code).
> 
> For now, the patch guards AVX512 entries with VM.isIntelCPU() check. In order to distinguish between AMD Zen 4 and 5, either a new platform-sensing check is needed or reimplementation of x86-specific platform sensing in Java on top of CPUID info.
> 

Probably best just to update as required. (Also, I don’t see anything in the HotSpot code to determine the AMD Zen core version.)
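
As a rough sketch of what managing the fallback in Java could look like: the class, enum, and parameter names below are hypothetical, the boolean inputs stand in for whatever platform sensing is actually available (for example the VM.isIntelCPU() check mentioned above), and the scalar fallback is an assumption, not something the patch is known to do.

```java
// Hypothetical sketch: pick a sort implementation from coarse CPU facts.
// None of these names exist in the JDK; they only illustrate the guard.
final class SortDispatch {
    enum Impl { AVX512, AVX2, SCALAR }

    // isIntel   : stand-in for a vendor check such as VM.isIntelCPU()
    // hasAvx512 : CPU reports the AVX-512 features the sort needs
    // hasAvx2   : CPU reports AVX2
    static Impl choose(boolean isIntel, boolean hasAvx512, boolean hasAvx2) {
        // Only take the AVX-512 entries on Intel, mirroring the current guard.
        if (isIntel && hasAvx512) {
            return Impl.AVX512;
        }
        // Everything else with AVX2 (including AMD Zen 4) falls back to AVX2.
        if (hasAvx2) {
            return Impl.AVX2;
        }
        // Assumed last resort when no vector path applies.
        return Impl.SCALAR;
    }

    public static void main(String[] args) {
        // A non-Intel CPU with AVX-512 is steered to the AVX2 path.
        System.out.println(choose(false, true, true));
    }
}
```

Distinguishing Zen 4 from Zen 5 would need one more input (a model or family check), which is exactly the piece that is missing today.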

A general CPU vendor/model solution seems a little more challenging than surfacing the CPU feature set as a string. Note that System.getProperties() already surfaces “os.arch”. A more general solution could add further properties for the CPU?
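
To illustrate the property idea, here is a minimal sketch. Only "os.arch" is a real system property; "os.cpu.features" is a made-up name used purely for illustration, as is the comma-separated format of its value.

```java
// Sketch: query CPU information through system properties.
// "os.arch" exists today; "os.cpu.features" is a HYPOTHETICAL property name
// and the comma-separated feature list is an assumed format.
final class CpuProps {
    static final String CPU_FEATURES = "os.cpu.features"; // hypothetical

    // Returns true if 'feature' appears in a comma-separated feature list.
    static boolean hasFeature(String featureList, String feature) {
        if (featureList == null || featureList.isEmpty()) {
            return false;
        }
        for (String f : featureList.split(",")) {
            if (f.trim().equalsIgnoreCase(feature)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Real property, always defined by the JDK.
        System.out.println("os.arch = " + System.getProperty("os.arch"));

        // Hypothetical property; absent on current JDKs, so default to "".
        String features = System.getProperty(CPU_FEATURES, "");
        System.out.println("avx2 available: " + hasFeature(features, "avx2"));
    }
}
```

A flat feature string like this is easy to surface, but it says nothing about vendor or model, which is why it would not by itself separate Zen 4 from Zen 5.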

Paul.