The usage of fence.i in openjdk

Fri Aug 5 22:06:32 UTC 2022

More on this subject.
I can see that ifence() is used in the code exactly the way isb() is used in the aarch64 port.
Checking the documentation for fence.i and isb, I don't see them as 1:1 equivalent:

fence.i ( https://five-embeddev.com/riscv-isa-manual/latest/zifencei.html ):
FENCE.I instruction provides explicit synchronization between writes to instruction memory and instruction fetches on the same hart.

ISB ( https://developer.arm.com/documentation/den0024/a/Memory-Ordering/Barriers/ISB-in-more-detail ):
An ISB flushes the pipeline, and re-fetches the instructions from the cache or memory and ensures that the effects of any completed context-changing operation before the ISB are visible to any instruction after the ISB. It also ensures that any context-changing operations after the ISB instruction only take effect after the ISB has been executed and are not seen by instructions before the ISB. 
And some info from the web:

To me it sounds like isb (on aarch64) does its job a bit differently than fence.i (on rv64).

So, I think here:

  __ la_patchable(t0, RuntimeAddress(CAST_FROM_FN_PTR(address, SharedRuntime::fixup_callers_callsite)), offset);
  __ jalr(x1, t0, offset);

  // Explicit fence.i required because fixup_callers_callsite may change the code
  // stream.
  __ safepoint_ifence();

  __ pop_CPU_state();
  // restore sp
  __ leave();
  __ bind(L);

we still have a small chance of executing stale (old) code from the L1 I-cache if our thread is moved to another hart right after safepoint_ifence(). Conversely, if fixup_callers_callsite calls icache_flush() somewhere inside, then safepoint_ifence() isn't needed here at all.
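
To make the migration concern concrete, here is a toy model in C++. It is only a sketch under loud assumptions: one code word of memory, one private I-cache line per hart, and no hardware coherence between stores and instruction fetch. ToyMachine, fence_i(), flush_all_harts() and fetch_after_migration() are names I made up for the illustration; none of this is HotSpot or kernel code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Toy model only: one "instruction word" of memory, one I-cache line per hart.
constexpr std::size_t kHarts = 2;

struct ToyMachine {
    std::uint32_t memory = 0;                                // the code word in memory
    std::array<std::optional<std::uint32_t>, kHarts> l1i{};  // per-hart cached copy

    // Instruction fetch: hit in the local L1I if present, else fill from memory.
    std::uint32_t fetch(std::size_t hart) {
        if (!l1i[hart]) l1i[hart] = memory;
        return *l1i[hart];
    }

    // Store to code memory; no I-cache is updated (no coherence assumed).
    void patch(std::uint32_t insn) { memory = insn; }

    // fence.i: synchronizes instruction fetch on the executing hart only.
    void fence_i(std::size_t hart) { l1i[hart].reset(); }

    // Stand-in for a cross-hart flush such as the icache_flush syscall.
    void flush_all_harts() {
        for (auto& line : l1i) line.reset();
    }
};

constexpr std::uint32_t kOld = 0x00000013;  // addi x0, x0, 0 (nop)
constexpr std::uint32_t kNew = 0x00100073;  // ebreak

// Returns what the thread fetches after being moved from hart 0 to hart 1.
std::uint32_t fetch_after_migration(bool cross_hart_flush) {
    ToyMachine m;
    m.patch(kOld);
    m.fetch(0);
    m.fetch(1);     // both harts now hold the old word in L1I
    m.patch(kNew);  // the thread, running on hart 0, patches the code
    if (cross_hart_flush) {
        m.flush_all_harts();
    } else {
        m.fence_i(0);  // local fence.i only
    }
    return m.fetch(1);  // the thread was rescheduled to hart 1
}
```

In this model fetch_after_migration(false) returns the old word and fetch_after_migration(true) returns the new one, which is exactly the difference between a local fence.i and a flush that reaches every hart.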

Regards, Vladimir

> On 30 July 2022, at 13:29, Vladimir Kempik <vladimir.kempik at gmail.com> wrote:
> 
> Hello
> Thanks for the explanation.
> That sounds like fence.i in userspace code is not needed at all.
> Regards, Vladimir
>> On 30 July 2022, at 05:41, wangyadong (E) <yadonn.wang at huawei.com> wrote:
>> 
>>> Let's say you have a thread A running on hart 1.
>>> You've changed some code in region 0x11223300 and need fence.i before executing that code.
>>> You execute fence.i in your thread A running on hart 1.
>>> Right after that, your thread (for some reason) gets rescheduled by the kernel to hart 2.
>>> If hart 2 had something in L1I corresponding to region 0x11223300, then you are going to have a problem: L1I on hart 2 holds old code that was never refreshed, because fence.i was executed on hart 1 (and never on hart 2). Your thread is going to execute old code, or a mix of old and new code.
>> 
>> @vladimir Thanks for your explanation. I understand your concern now. We are aware of fence.i's limited scope, so in the RISC-V port the writing hart does not rely solely on fence.i; it also calls the icache_flush syscall in ICache::invalidate_range() every time after modifying the code.
>> 
>> For example:
>> Hart 1
>> void MacroAssembler::emit_static_call_stub() {
>> // CompiledDirectStaticCall::set_to_interpreted knows the
>> // exact layout of this stub.
>> 
>> ifence();
>> mov_metadata(xmethod, (Metadata*)NULL); <- patchable code here
>> 
>> // Jump to the entry point of the i2c stub.
>> int32_t offset = 0;
>> movptr_with_offset(t0, 0, offset);
>> jalr(x0, t0, offset);
>> }
>> 
>> Hart 2 (write hart)
>> void NativeMovConstReg::set_data(intptr_t x) {
>> // ...
>>   // Store x into the instruction stream.
>>   MacroAssembler::pd_patch_instruction_size(instruction_address(), (address)x); <- write code
>>   ICache::invalidate_range(instruction_address(), movptr_instruction_size);  <- syscall here
>> // ...
>> }  
>> 
> 
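
The write-side pattern quoted above (patch the instruction, then flush the range) can be sketched outside HotSpot as follows. This is a hedged illustration, not the port's implementation: patch_and_flush() is a hypothetical helper, and it uses the GCC/Clang builtin __builtin___clear_cache rather than HotSpot's ICache::invalidate_range(). What the builtin lowers to is target-specific; I am assuming that on Linux/RISC-V it ends up in the kernel's cross-hart flush, which should be checked against the toolchain actually in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not HotSpot code): overwrite one 32-bit instruction
// word at insn_addr, then ask the toolchain to make that range visible to
// instruction fetch.
inline void patch_and_flush(void* insn_addr, std::uint32_t new_insn) {
    // Store the new instruction word into the instruction stream.
    std::memcpy(insn_addr, &new_insn, sizeof new_insn);

    // Flush the patched range [begin, end). The lowering is target-specific;
    // on targets with coherent I-caches it may compile to nothing.
    char* begin = static_cast<char*>(insn_addr);
    __builtin___clear_cache(begin, begin + sizeof new_insn);
}
```

The point of doing the flush on the writing side is that readers on other harts do not have to guess which hart they will be running on when they reach the patched code.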
