RFR(XXS): 8227117: normal interpreter table is not restored after single stepping with TLH
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jul 5 19:53:37 UTC 2019
> I'll file a follow up bug after the dust settles for 8227117.
I filed the following:
JDK-8227338 templateInterpreter.cpp: copy_table() needs to be safer
https://bugs.openjdk.java.net/browse/JDK-8227338
Dan
On 7/5/19 1:07 PM, Daniel D. Daugherty wrote:
> On 7/4/19 3:10 AM, Erik Österlund wrote:
>> Hi Dan,
>>
>> Thanks for picking this up. The change looks good.
>
> Thanks! Of course, just the size of the comment below makes me wonder
> what I got myself into... :-) And I was so happy that the non-logging
> part of the fix was an else-statement with _one_ line...
>
>
>> However, when reviewing this, I looked at the code for actually
>> restoring the table (ignore/notice safepoints). It copies the
>> dispatch table for the interpreter. There is a comment stating it is
>> important the copying is atomic for MT-safety, and I can definitely
>> see why. However, the copying the line after that comment is in fact
>> not atomic.
>
> Actually, the comment doesn't mention 'atomic', but that's probably
> because the code and the comment are very, very old. It mentions
> 'word wise for MT safety' and I agree that 'atomic' is what the
> person likely meant...
>
> The history:
>
> $ sgv src/share/vm/interpreter/templateInterpreter.cpp | grep 'The
> copy has to occur word wise for MT safety'
> 1.1 // Copy non-overlapping tables. The copy has to occur word
> wise for MT safety.
>
> $ sp -r1.1 src/share/vm/interpreter/templateInterpreter.cpp
> src/share/vm/interpreter/SCCS/s.templateInterpreter.cpp:
>
> D 1.1 07/08/29 13:42:26 sgoldman 1 0 00600/00000/00000
> MRs:
> COMMENTS:
> 6571248 - continuation_for is specialized for template interpreter
>
> Hmmm... I expected that comment to be even older... ahhhh... a little
> more poking around and I found:
>
> $ sgv -r1.147 src/share/vm/interpreter/interpreter.cpp | grep 'The
> copy has to occur word wise for MT safety'
> 1.147 // Copy non-overlapping tables. The copy has to occur word
> wise for MT safety.
>
> $ sp -r1.147 src/share/vm/interpreter/interpreter.cpp
> src/share/vm/interpreter/SCCS/s.interpreter.cpp:
>
> D 1.147 99/02/17 10:14:36 steffen 235 233 00008/00002/00762
> MRs:
> COMMENTS:
>
> This makes more sense (timeline wise) and dates back to when all
> of the interpreter was in vm/interpreter/interpreter.cpp.
>
>
>> Here is the copying code in templateInterpreter.cpp:
>>
>> static inline void copy_table(address* from, address* to, int size) {
>> // Copy non-overlapping tables. The copy has to occur word wise for
>> MT safety.
>> while (size-- > 0) *to++ = *from++;
>> }
>>
>> Copying using a loop of non-volatile loads and stores can and
>> definitely will on some compilers turn into memcpy calls instead as
>> the compiler (correctly) considers that an equivalent transformation.
>
> Yet another C++ compiler optimization land mine... sigh...
>
>
>> And memcpy does not guarantee atomicity. Indeed on some platforms it
>> is not atomic. On some platforms it will even enjoy out-of-thin-air
>> values.
>
> That last bit is scary...
>
>
>> Perhaps Copy::disjoint_words_atomic() would be a better choice for
>> atomic word copying. If not, at the very least we should use
>> Atomic::load/store here.
>
> Copy::disjoint_words_atomic() sounds appealing...
>
> For those folks that aren't familiar with this part of safepointing...
>
> SafepointSynchronize::arm_safepoint() calls
> Interpreter::notice_safepoints()
> which calls calls copy_table(). So we're not at a safepoint yet, and,
> in fact,
> we're trying to bring those pesky JavaThreads to a safepoint...
>
> SafepointSynchronize::disarm_safepoint() calls
> Interpreter::ignore_safepoints()
> which also calls copy_table(). However, we did that before we have
> woken the
> JavaThreads that are blocked for the safepoint so that use of
> copy_table is safe:
>
>
> // Release threads lock, so threads can be created/destroyed again.
> Threads_lock->unlock();
>
> // Wake threads after local state is correctly set.
> _wait_barrier->disarm();
> }
>
> The 'Threads_lock->unlock()' should synchronize memory so that the
> restored
> table should be properly synced out to memory...
>
>
>> Having said that, the fix for that issue seems like a separate RFE,
>> because it has been sitting there for a lot longer than TLH has been
>> around.
>
> Yes I would like to keep the copy_table() issue for a separate bug
> (not RFE).
> I'll file a follow up bug after the dust settles for 8227117.
>
> Thanks again for the review!
>
> Dan
>
>>
>> Thanks,
>> /Erik
>>
>> On 2019-07-04 04:04, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Robbin recently discovered this issue with Thread Local Handshakes.
>>> Since
>>> he's not available at the moment, I'm handling the issue:
>>>
>>> JDK-8227117 normal interpreter table is not restored after
>>> single stepping with TLH
>>> https://bugs.openjdk.java.net/browse/JDK-8227117
>>>
>>> When using Thread Local Handshakes, the normal interpreter table is
>>> not restored after single stepping. This issue is caused by the
>>> VM_ChangeSingleStep VM-op relying on SafepointSynchronize::end() to
>>> restore the normal interpreter table for the "off" case.
>>>
>>> Prior to Thread Local Handshakes, this was a valid assumption to make.
>>> SafepointSynchronize::end() has been refactored into
>>> disarm_safepoint() and it only calls Interpreter::ignore_safepoints()
>>> on the global safepoint branch. That matches up with the call to
>>> Interpreter::notice_safepoints() that is also on the global safepoint
>>> branch.
>>>
>>> The solution is for the VM_ChangeSingleStep VM-op for the "off" case
>>> to call Interpreter::ignore_safepoints() directly.
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8227117-webrev/0_for_jdk14/
>>>
>>> The fix is just a small addition to VM_ChangeSingleStep::doit():
>>>
>>> if (_on) {
>>> Interpreter::notice_safepoints();
>>> + } else {
>>> + Interpreter::ignore_safepoints();
>>> }
>>>
>>> Everything else is just new logging support for future debugging of
>>> interpreter table management and single stepping.
>>>
>>> Tested this fix with Mach5 Tier[1-3] on the standard Oracle platforms.
>>> Mach5 Tier[4-6] on standard Oracle platforms is running now.
>>>
>>> Thanks, in advance, for questions, comments or suggestions.
>>>
>>> Dan
>>>
>
>
More information about the serviceability-dev
mailing list