RFR(m): 8220351: Cross-modifying code

Mon Mar 18 11:00:03 UTC 2019

> Let's assume J2's modification makes it into J1's instruction stream. How is it
> guaranteed that the new compiled code for methodB from the JIT makes it into
> J1's instruction stream, too, if no serializing instruction is executed by J1?

In short, no, there are no such guarantee AFAIK.
If methodB is generate in memory with a valid instruction stream, we may execute 
the old one.
And J1 may see the unresolved call and also try to resolve it, seem much more 
likely.

ErikÖ have looked more into these cases. (I don't plan to address them in this 
patch)

/Robbin

> My explanation was that the call instruction has the serializing effect, just
> like in OPTION 1. Note as well the difference between OPTION 1 and 2: if the new
> code is not reached by a jump then a serializing instruction is required as
> well.
> 
>    > That's the "accidental success" that is being avoided by this fix
> 
> Which part of the fix would that be?
> 
> Thanks, Richard.
> 
> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Montag, 18. März 2019 03:07
> To: Reingruber, Richard <richard.reingruber at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; Doerr, Martin <martin.doerr at sap.com>; Andrew Haley <aph at redhat.com>; hotspot-dev at openjdk.java.net
> Subject: Re: RFR(m): 8220351: Cross-modifying code
> 
> On 15/03/2019 11:04 pm, Reingruber, Richard wrote:
>>     > OPTION 1 is for self-modifying code, not cross-modifying code.
>>
>> I know. I wanted to point out that it seems to work for cross-modifying code,
>> too, in the sense that patchee java threads don't execute a serializing instruction
>> before jumping to compiled code or before executing a compiled call another java
>> thread just resolved.
> 
> That's the "accidental success" that is being avoided by this fix. Today
> it all seems to work just fine because the "distance" between the
> actions of the patching thread and the patchee thread that wants to use
> the new code is "vast".
> 
> Cheers,
> David
> 
>> Cheers, Richard.
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Freitag, 15. März 2019 12:55
>> To: Reingruber, Richard <richard.reingruber at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; Doerr, Martin <martin.doerr at sap.com>; Andrew Haley <aph at redhat.com>; hotspot-dev at openjdk.java.net
>> Subject: Re: RFR(m): 8220351: Cross-modifying code
>>
>> On 15/03/2019 9:08 pm, Reingruber, Richard wrote:
>>> Hi Robbin,
>>>
>>> I'm referring to x86 here.
>>>
>>> My comment/question about the requirement of an ICache invalidation in the
>>> patcher thread was a little "undereducated". After reading recent manuals [1]:
>>>
>>>      1. I think your right that a serializing instruction after the safepoint poll
>>>         is necessary.
>>>
>>>      2. I doubt that cache flushing is required at all, according to [1]. At least
>>>         SPECjvm2008 runs with an AbstractICache::_flush_icache_stub that
>>>         immediately returns.
>>>
>>> In the sense of [1] JIT compilers and JIT runtimes are cross-modifying code,
>>> too, when they compile bytecode and resolve compiled calls. But afaik the java
>>> threads don't execute a serializing instruction before executing that code. So,
>>> if I'm not mistaken, (* OPTION 1 *) of [1] where the patchee threads jump to the
>>> modified code is legal as well.
>>
>> OPTION 1 is for self-modifying code, not cross-modifying code.
>>
>> David
>> -----
>>
>>> So at least theoretically you should be able to replace the serializing
>>> instruction with a call to an address you would load from the thread. The
>>> address should lead to the runtime, where the thread would block if the
>>> safepoint is still active or just return if it is not.
>>>
>>> The last point was rather exploring the space of legal solutions than a
>>> suggestion to solve the issue. I'd favor your approach if you manage to do the
>>> serialization conditionally (BTW: a mechanism similar to JavaThread::*deopt_suspend*
>>> would be yet another way to do that).
>>>
>>> Cheers, Richard.
>>>
>>> [1] 8.1.3 Handling Self- and Cross-Modifying Code
>>>        in Intel(R) 64 and IA-32 ArchitecturesSoftware Developer's Manual. Volume 3
>>>        (3A, 3B, 3C & 3D): System Programming Guide
>>>        https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf
>>>        on https://software.intel.com/en-us/articles/intel-sdm
>>>
>>> -----Original Message-----
>>> From: Robbin Ehn <robbin.ehn at oracle.com>
>>> Sent: Donnerstag, 14. März 2019 13:24
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; Doerr, Martin <martin.doerr at sap.com>; David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
>>> Subject: Re: RFR(m): 8220351: Cross-modifying code
>>>
>>> Hi Richard,
>>>
>>> On 3/14/19 11:12 AM, Reingruber, Richard wrote:
>>>> Hi Robbin,
>>>>
>>>>       > the reason I heard for not doing invalidate_range/word in nmethod::oops_do
>>>>       > is that it was deemed that a safepoint is 'long enough'.
>>>>
>>>> This doesn't seem to be a very good reason. I did a little modification to your
>>>> initial figure:
>>>
>>> That's why we are adding the instruction barrier!
>>>
>>>>
>>>> JavaThread:             | VMThread
>>>> StoreLoad               |
>>>>                             | update <immediate oop>
>>>>                             | ? Global ICache Invalidate ?
>>>>                             | disarm
>>>> load thread_poll        |
>>>>
>>>> Question would be: does an ICache flush happen after the immediate oop update?
>>>> I would consider it problematic if not. Even if the safepoint was very long.
>>>
>>> The icache flush assumes it can stop speculation and flush instruction cache in
>>> patchee thread.
>>> Correct me if anything is wrong:
>>> - sparc needs flush <adr> in all patchee threads.
>>> - arm/aarch64 needs isb + ___clear_cache(adr,..) in all patchee threads.
>>> - ppc needs isync in all patchee threads?
>>> - s390 needs zarch_sync all in patchee threads?
>>> - x86 needs cpuid in all in patchee threads.
>>>
>>> So a "Global ICache Invalidate" done by patcher thread (VMThread in this case)
>>> have no effect in patchee threads!?
>>>
>>> Adding a few words on x86 ICache::Invalidate/word.
>>> We have TSO, so if the patchee thread can see poll is disarmed all other stores
>>> are visible. I have seen no case ICache::Invalidate/word would make a difference
>>> from a StoreStore/compiler barrier (still talking about platform code for x86) ?
>>> The problem is if we prefect an instruction over the load of the poll.
>>> If poll is disarmed this perfected instruction can be fully executed.
>>> If that happens to contain an immediate oop, we don't know if it's the patched
>>> oop or not.
>>>
>>> /Robbin
>>>
>>>>
>>>> Cheers Richard.
>>>>
>>>> -----Original Message-----
>>>> From: Robbin Ehn <robbin.ehn at oracle.com>
>>>> Sent: Donnerstag, 14. März 2019 10:34
>>>> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
>>>> Cc: Reingruber, Richard <richard.reingruber at sap.com>
>>>> Subject: Re: RFR(m): 8220351: Cross-modifying code
>>>>
>>>> Hi Martin,
>>>>
>>>> On 3/14/19 10:25 AM, Doerr, Martin wrote:
>>>>> Hi Robbin,
>>>>>
>>>>>> But that is whole new subject and let's keep that separate from this :)
>>>>> Sure, we can keep the modification side discussion separate. This change is already difficult enough ��
>>>>>
>>>>>
>>>>> Btw. I haven't sent the ppc/s390 instruction cache barriers. Here they are:
>>>>>
>>>>> orderAccess_aix/linux_ppc.hpp:
>>>>> inline void OrderAccess::instruction_pipeline() { inlasm_isync(); }
>>>>>
>>>>> orderAccess_linux_s390.hpp:
>>>>> inline void OrderAccess::instruction_pipeline() { inlasm_zarch_sync(); }
>>>>>
>>>>
>>>> Thanks!
>>>>
>>>>>> nmethod::oops_do calls the closure f->do_oop(r->oop_addr()) for oops encoded into the instruction stream.
>>>>>> I couldn't find a call to AbstractICache::invalidate_range.
>>>>
>>>> I think I miss-understood you here, the reason I heard for not doing
>>>> invalidate_range/word in nmethod::oops_do is that it was deemed that a
>>>> safepoint is 'long enough'.
>>>>
>>>> /Robbin
>>>>
>>>>>
>>>>> Look for ICache::invalidate_range, I see 11 calls on x64.
>>>>> And one call to AbstractICache/ICache::invalidate_word.
>>>>> Btw I have spent some time looking at x86/icache_x86.cpp...
>>>>> But that is whole new subject and let's keep that separate from this :)
>>>>>
>>>>> I'll create issues for some of these topics, which if nothing else should be
>>>>> discussed.
>>>>>
>>>>> /Robbin
>>>>>
>>>>>>
>>>>>> Does anybody know where it is or if it's missing?
>>>>>>
>>>>>> Best regards,
>>>>>> Martin
>>>>>>