RFR(m): 8220351: Cross-modifying code

Fri Mar 15 11:08:12 UTC 2019

Hi Robbin,

I'm referring to x86 here.

My comment/question about the requirement of an ICache invalidation in the
patcher thread was a little "undereducated". After reading recent manuals [1]:

  1. I think your right that a serializing instruction after the safepoint poll
     is necessary.

  2. I doubt that cache flushing is required at all, according to [1]. At least
     SPECjvm2008 runs with an AbstractICache::_flush_icache_stub that
     immediately returns.

In the sense of [1] JIT compilers and JIT runtimes are cross-modifying code,
too, when they compile bytecode and resolve compiled calls. But afaik the java
threads don't execute a serializing instruction before executing that code. So,
if I'm not mistaken, (* OPTION 1 *) of [1] where the patchee threads jump to the
modified code is legal as well.

So at least theoretically you should be able to replace the serializing
instruction with a call to an address you would load from the thread. The
address should lead to the runtime, where the thread would block if the
safepoint is still active or just return if it is not.

The last point was rather exploring the space of legal solutions than a
suggestion to solve the issue. I'd favor your approach if you manage to do the
serialization conditionally (BTW: a mechanism similar to JavaThread::*deopt_suspend*
would be yet another way to do that).

Cheers, Richard.

[1] 8.1.3 Handling Self- and Cross-Modifying Code
    in Intel(R) 64 and IA-32 ArchitecturesSoftware Developer's Manual. Volume 3
    (3A, 3B, 3C & 3D): System Programming Guide
    https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf
    on https://software.intel.com/en-us/articles/intel-sdm

-----Original Message-----
From: Robbin Ehn <robbin.ehn at oracle.com> 
Sent: Donnerstag, 14. März 2019 13:24
To: Reingruber, Richard <richard.reingruber at sap.com>; Doerr, Martin <martin.doerr at sap.com>; David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
Subject: Re: RFR(m): 8220351: Cross-modifying code

Hi Richard,

On 3/14/19 11:12 AM, Reingruber, Richard wrote:
> Hi Robbin,
> 
>    > the reason I heard for not doing invalidate_range/word in nmethod::oops_do
>    > is that it was deemed that a safepoint is 'long enough'.
> 
> This doesn't seem to be a very good reason. I did a little modification to your
> initial figure:

That's why we are adding the instruction barrier!

> 
> JavaThread:             | VMThread
> StoreLoad               |
>                          | update <immediate oop>
>                          | ? Global ICache Invalidate ?
>                          | disarm
> load thread_poll        |
> 
> Question would be: does an ICache flush happen after the immediate oop update?
> I would consider it problematic if not. Even if the safepoint was very long.

The icache flush assumes it can stop speculation and flush instruction cache in 
patchee thread.
Correct me if anything is wrong:
- sparc needs flush <adr> in all patchee threads.
- arm/aarch64 needs isb + ___clear_cache(adr,..) in all patchee threads.
- ppc needs isync in all patchee threads?
- s390 needs zarch_sync all in patchee threads?
- x86 needs cpuid in all in patchee threads.

So a "Global ICache Invalidate" done by patcher thread (VMThread in this case) 
have no effect in patchee threads!?

Adding a few words on x86 ICache::Invalidate/word.
We have TSO, so if the patchee thread can see poll is disarmed all other stores 
are visible. I have seen no case ICache::Invalidate/word would make a difference 
from a StoreStore/compiler barrier (still talking about platform code for x86) ?
The problem is if we prefect an instruction over the load of the poll.
If poll is disarmed this perfected instruction can be fully executed.
If that happens to contain an immediate oop, we don't know if it's the patched
oop or not.

/Robbin

> 
> Cheers Richard.
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Donnerstag, 14. März 2019 10:34
> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
> Cc: Reingruber, Richard <richard.reingruber at sap.com>
> Subject: Re: RFR(m): 8220351: Cross-modifying code
> 
> Hi Martin,
> 
> On 3/14/19 10:25 AM, Doerr, Martin wrote:
>> Hi Robbin,
>>
>>> But that is whole new subject and let's keep that separate from this :)
>> Sure, we can keep the modification side discussion separate. This change is already difficult enough ��
>>
>>
>> Btw. I haven't sent the ppc/s390 instruction cache barriers. Here they are:
>>
>> orderAccess_aix/linux_ppc.hpp:
>> inline void OrderAccess::instruction_pipeline() { inlasm_isync(); }
>>
>> orderAccess_linux_s390.hpp:
>> inline void OrderAccess::instruction_pipeline() { inlasm_zarch_sync(); }
>>
> 
> Thanks!
> 
>>> nmethod::oops_do calls the closure f->do_oop(r->oop_addr()) for oops encoded into the instruction stream.
>>> I couldn't find a call to AbstractICache::invalidate_range.
> 
> I think I miss-understood you here, the reason I heard for not doing
> invalidate_range/word in nmethod::oops_do is that it was deemed that a
> safepoint is 'long enough'.
> 
> /Robbin
> 
>>
>> Look for ICache::invalidate_range, I see 11 calls on x64.
>> And one call to AbstractICache/ICache::invalidate_word.
>> Btw I have spent some time looking at x86/icache_x86.cpp...
>> But that is whole new subject and let's keep that separate from this :)
>>
>> I'll create issues for some of these topics, which if nothing else should be
>> discussed.
>>
>> /Robbin
>>
>>>
>>> Does anybody know where it is or if it's missing?
>>>
>>> Best regards,
>>> Martin
>>>