RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5]

Palmer Dabbelt duke at openjdk.org
Thu Jun 29 05:59:55 UTC 2023


On Wed, 28 Jun 2023 13:56:43 GMT, Palmer Dabbelt <duke at openjdk.org> wrote:

>>> One case to consider: let's say I have a system with X big cores (which support MISALIGNED_FAST) and X small cores (with MISALIGNED_EMU).
>>> 
>>> If I run some Java workload on all cores, what should hw_prober return? The obvious result here is to use +AvoidUnalignedAccesses.
>>> 
>>> If I run the same Java workload but use taskset to run it only on the big cores, how will the JDK's hw_prober code work? Should it work properly and disable AvoidUnalignedAccesses, or is that too much, so that one needs to set -XX:-AvoidUnalignedAccesses manually?
>> 
>> FYI:
>> The only way to handle the example above with hwprobe today is to query each CPU individually to find the set of CPUs with fast misaligned access and the set where it is emulated (the same goes for vector or any other extension that may differ between cores).
>> I have proposed that hwprobe should also be able to return a CPU set for a given set of features,
>> either to inform the user of what affinity they should use, or so that we can change the affinity of the VM automagically.
> 
> I think Robbin brought this up at some meeting type thing, but IMO that's a pretty reasonable ask.  We hadn't thought of it when writing the syscall, but we've got some flags for extensibility so I think we could make it work.
> 
> Another option might be to tie this to some heuristics in userspace, maybe probing along the CPU topology or something.  There have been some vague discussions about having a hwprobe userspace library to handle things like bit->string mappings, so maybe we should just have it do this too?

> Hello @palmer-dabbelt, I meant I don't know how this thing should work "properly" when we have cpus with different capabilities (regarding misaligned access) and affinity manually set to misaligned_fast cores.

Sorry, I'm kind of lost here.  There are some kernel docs on how `riscv_hwprobe()` works: it takes a CPU set argument that controls which cores are being probed.  That argument is meant to line up with how the other scheduling controls work, so it's semi-transparent when userspace doesn't care that much.
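
For illustration, a minimal sketch of probing only the CPUs in the current affinity mask might look like the following.  The key, mask, and syscall number are copied by hand on the assumption that they match the Linux 6.4 UAPI header (`asm/hwprobe.h`), so check them against your kernel headers.  `probe_misaligned_for_affinity` is a hypothetical helper name, not something from the JDK patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for sched_getaffinity() and cpu_set_t */
#endif
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hand-copied from the kernel UAPI; values are ASSUMED to match Linux 6.4. */
struct hwprobe_pair {
    int64_t  key;
    uint64_t value;
};
#define HWPROBE_SYSCALL_NR       258    /* __NR_riscv_hwprobe (assumed) */
#define HWPROBE_KEY_CPUPERF_0    5      /* RISCV_HWPROBE_KEY_CPUPERF_0 */
#define HWPROBE_MISALIGNED_MASK  7ULL   /* RISCV_HWPROBE_MISALIGNED_MASK */
#define HWPROBE_MISALIGNED_FAST  3ULL   /* RISCV_HWPROBE_MISALIGNED_FAST */

/*
 * Ask the kernel about misaligned access speed, restricted to the CPUs
 * this process is currently allowed to run on (so taskset is honoured).
 * Returns the misaligned class (0..7), or -1 if the probe is unavailable.
 */
static int probe_misaligned_for_affinity(void)
{
#if defined(__linux__) && defined(__riscv)
    cpu_set_t cpus;
    if (sched_getaffinity(0, sizeof(cpus), &cpus) != 0)
        return -1;

    struct hwprobe_pair pair = { .key = HWPROBE_KEY_CPUPERF_0, .value = 0 };
    /* Arguments: pairs, pair count, CPU set size in bytes, CPU set, flags. */
    if (syscall(HWPROBE_SYSCALL_NR, &pair, (size_t)1,
                sizeof(cpus), &cpus, 0u) != 0)
        return -1;
    if (pair.key < 0)          /* the kernel did not recognise this key */
        return -1;
    return (int)(pair.value & HWPROBE_MISALIGNED_MASK);
#else
    /* Not RISC-V Linux: there is nothing to probe. */
    errno = ENOSYS;
    return -1;
#endif
}
```

A caller would compare the result against `HWPROBE_MISALIGNED_FAST` and treat anything else, including -1, as "don't assume fast".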

I guess I'd need to go read the docs, but we might have a grey area for the misaligned access speed: if we say it's slow, is that slow on one core or slow on all cores?  For the other features the answer is for all cores, but I'm not sure we were explicit enough in the docs about the performance stuff.  It's probably worth a read of the docs to see if we can improve them; this is all pretty new, so I wouldn't be surprised if that's the case.
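
To make the grey area concrete, one conservative reading (a class is reported only if every probed core agrees) could be sketched as below.  This is an assumption about how per-core answers might be combined, not a statement of what the kernel docs say or what the kernel does.  The enum values mirror the hwprobe misaligned classes, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the hwprobe misaligned-access classes (values assumed). */
enum misaligned_class {
    MISALIGNED_UNKNOWN     = 0,
    MISALIGNED_EMULATED    = 1,
    MISALIGNED_SLOW        = 2,
    MISALIGNED_FAST        = 3,
    MISALIGNED_UNSUPPORTED = 4
};

/*
 * Combine per-core results for the cores in a CPU set.
 * Conservative rule: report a class only when all cores agree,
 * otherwise fall back to UNKNOWN.  An empty set is also UNKNOWN.
 */
static int merge_misaligned(const int *per_cpu, size_t n)
{
    if (n == 0)
        return MISALIGNED_UNKNOWN;
    for (size_t i = 1; i < n; i++) {
        if (per_cpu[i] != per_cpu[0])
            return MISALIGNED_UNKNOWN;
    }
    return per_cpu[0];
}

/* Only a unanimous FAST answer justifies allowing unaligned accesses. */
static bool should_avoid_unaligned(int merged)
{
    return merged != MISALIGNED_FAST;
}
```

With the big/small example from earlier in the thread, probing all cores merges to UNKNOWN, so unaligned accesses stay avoided.  Probing only the big cores (for example under taskset) merges to FAST.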

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1612470077

