Re: Re: Re: The usage of fence.i in openjdk
yangfei at iscas.ac.cn
Sat Jul 30 13:35:11 UTC 2022
Hi Palmer,
> -----Original Messages-----
> From: "Palmer Dabbelt" <palmer at dabbelt.com>
> Sent Time: 2022-07-30 02:02:59 (Saturday)
> To: yadonn.wang at huawei.com
> Cc: vladimir.kempik at gmail.com, riscv-port-dev at openjdk.org
> Subject: Re: Re: The usage of fence.i in openjdk
>
> On Fri, 29 Jul 2022 08:12:21 PDT (-0700), yadonn.wang at huawei.com wrote:
> > Hi, Vladimir,
> >
> >> I believe Java’s threads can migrate to different hart at any moment, hence the use of fence.i is dangerous.
> > Could you describe in detail why the use of fence.i is dangerous? I think it may be just inefficient but not dangerous.
>
> The issue here is that fence.i applies to the current hart, whereas
> Linux userspace processes just know the current thread. Executing a
> fence.i in userspace adds a bunch of orderings to the thread's state
> that can only be moved to a new hart via another fence.i. Normally that
> sort of thing isn't such a big deal (there's similar state for things
> like fence and lr), but the SiFive chips implement fence.i by flushing
> the instruction cache which is a slow operation to put on the scheduling
> path.
>
> The only way for the kernel to avoid that fence.i on the scheduling path
> is for it to know if one's been executed in userspace. There's no way
> to trap on fence.i so instead the Linux uABI just requires userspace to
> make a syscall (or a VDSO library call). If userspace directly executes
> a fence.i then the kernel won't know and thus can't ensure the thread
> state is adequately moved to the new hart during scheduling, which may
> result in incorrect behavior.
>
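Just to double-check my understanding of the uABI part: instead of executing fence.i directly, the port would call the kernel interface, roughly like this (only a sketch, assuming the glibc wrapper __riscv_flush_icache() declared in <sys/cachectl.h>; as far as I can tell, flags = 0 asks for the flush to cover all threads of the process):

    #include <sys/cachectl.h>   /* declares __riscv_flush_icache() on riscv glibc */

    /* After writing JIT code into [start, end), ask the kernel to make it
     * fetchable on every hart the process may run on, instead of issuing a
     * bare fence.i on the current hart only. */
    static void sync_icache(void *start, void *end)
    {
        __riscv_flush_icache(start, end, 0);  /* flags = 0: flush for all threads */
    }
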
> We've known for a while that this will cause performance issues for JITs
> on some implemenations, but so far it's just not been a priority. I
> poked around the RISC-V OpenJDK port a few months ago and I think
> there's some improvements that can be made there, but we're probably
> also going to want some kernel support. Exactly how to fix it is
> probably going to depend on the workloads and implementations, though,
> and while I think I understand the OpenJDK part pretty well it's not
> clear what the other fence.i implementations are doing.
>
> In the long run we're also going to need some ISA support for doing this
> sanely, but that's sort of a different problem. I've been kind of
> crossing my fingers and hoping that anyone who has a system where JIT
> performance is important is also going to have some better write/fetch
> ordering instructions, but given how long it's been maybe that's a bad
> plan.
>
> That said, the direct fence.i is incorrect and it's likely that the
> long-term solution involves making the VDSO call so it's probably best
> to swap over. I remember having written a patch to do that at some
> point, but I can't find it so maybe I just forgot to send it?
Thanks for all those considerations about the design. It's very helpful.
> > To a certain extent, this code is a hangover from the aarch64 port and we use fence.i to mimic isb.
> >> Maybe there is a need to add __asm__ volatile ("fence":::"memory") at the beginning of this method.
> > You're right. It'd be better to place a full data fence before the syscall, because we cannot guarantee here that the syscall issues a data fence before the IPI remote fence.i to the other harts.
>
> That's not necessary with the current implementation, but it's not
But it looks like this is not reflected in the kernel function flush_icache_mm?
I checked the code and it looks to me that the data fence is issued on only one path:
    if (mm == current->active_mm && local) {
        /*
         * It's assumed that at least one strongly ordered operation is
         * performed on this hart between setting a hart's cpumask bit
         * and scheduling this MM context on that hart. Sending an SBI
         * remote message will do this, but in the case where no
         * messages are sent we still need to order this hart's writes
         * with flush_icache_deferred().
         */
        smp_mb();
    }
I just want to make sure that the data fence is there in this syscall with the current implementation.
But I am not familiar with the riscv linux kernel code, so it would be appreciated if you could share more details.
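In case it helps to clarify what I mean, the defensive change on the OpenJDK side would look roughly like the following (just a sketch with a placeholder helper name, again assuming the glibc __riscv_flush_icache() wrapper):

    #include <sys/cachectl.h>   /* declares __riscv_flush_icache() on riscv glibc */

    /* Sketch: order the stores to the code buffer before the icache-flush
     * request, in case the syscall path itself does not issue a full data
     * fence on every path. */
    static void flush_code_range(void *start, void *end)
    {
        __asm__ volatile ("fence" : : : "memory");  /* order prior writes */
        __riscv_flush_icache(start, end, 0);        /* then request the remote fence.i */
    }
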
Thanks,
Fei