RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
David Holmes
david.holmes at oracle.com
Mon Aug 10 05:35:15 UTC 2020
Hi Richard,
On 31/07/2020 5:28 pm, Reingruber, Richard wrote:
> Hi,
>
> I rebase the fix after JDK-8250042.
>
> New webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.2/
The general fix for this seems good. A minor nit:
588 if (!is_assignable(signature, ob_k, Thread::current())) {
You know that the current thread is the VMThread so can use
VMThread::vm_thread().
Similarly for this existing code:
694 Thread* current_thread = Thread::current();
---
Looking at the test code ... I'm less clear on exactly what is happening
and the use of spin-waits raises some red-flags for me in terms of test
reliability on different platforms. The "while (--waitCycles > 0)" loop
in particular offers no certainty that the agent thread is executing
anything in particular. And the use of the spin_count as a guide to
future waiting time seems somewhat arbitrary. In all seriousness I got
a headache trying to work out how the test was expecting to operate.
Some parts could be simplified using raw monitors, I think. But there's
no sure way to know the agent thread is in the midst of the stackwalk
when the target thread wants to leave the native code. So I understand
what you are trying to achieve here, I'm just not sure how reliably it
will actually achieve it.
test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/libGetLocalWithoutSuspendTest.cpp
32 static volatile jlong spinn_count = 0;
Using a 64-bit counter seems like it will be a problem on 32-bit systems.
Should be spin_count not spinn_count.
36 // Agent thread waits for value != 0, then performas the JVMTI call
to get local variable.
typo: performas
Thanks,
David
-----
> Thanks, Richard.
>
> -----Original Message-----
> From: serviceability-dev <serviceability-dev-retn at openjdk.java.net> On Behalf Of Reingruber, Richard
> Sent: Montag, 27. Juli 2020 09:45
> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
> Subject: [CAUTION] RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
>
> Hi Serguei,
>
> > I tested it on Linux and Windows but not yet on MacOS.
>
> The test succeeded now on all platforms.
>
> Thanks, Richard.
>
> -----Original Message-----
> From: Reingruber, Richard
> Sent: Freitag, 24. Juli 2020 15:04
> To: serguei.spitsyn at oracle.com; serviceability-dev at openjdk.java.net
> Subject: RE: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
>
> Hi Serguei,
>
>> The fix itself looks good to me.
>
> thanks for looking at the fix.
>
>> I still need another look at new test.
>> Could you, please, convert the agent of new test to C++?
>> It will make it a little bit simpler.
>
> Sure, here is the new webrev.1 with a C++ version of the test agent:
>
> http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.1/
>
> I tested it on Linux and Windows but not yet on MacOS.
>
> Thanks,
> Richard.
>
> -----Original Message-----
> From: serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>
> Sent: Freitag, 24. Juli 2020 00:00
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net
> Subject: Re: RFR(S) 8249293: Unsafe stackwalk in VM_GetOrSetLocal::doit_prologue()
>
> Hi Richard,
>
> Thank you for filing the CR and taking care about it!
> The fix itself looks good to me.
> I still need another look at new test.
> Could you, please, convert the agent of new test to C++?
> It will make it a little bit simpler.
>
> Thanks,
> Serguei
>
>
> On 7/20/20 01:15, Reingruber, Richard wrote:
>> Hi,
>>
>> please help review this fix for VM_GetOrSetLocal. It moves the unsafe stackwalk from the vm
>> operation prologue before the safepoint into the doit() method executed at the safepoint.
>>
>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8249293/webrev.0/index.html
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249293
>>
>> According to the JVMTI spec on local variable access it is not required to suspend the target thread
>> T [1]. The operation will simply fail with JVMTI_ERROR_NO_MORE_FRAMES if T is executing
>> bytecodes. It will succeed though if T is blocked because of synchronization or executing some native
>> code.
>>
>> The issue is that in the latter case the stack walk in VM_GetOrSetLocal::doit_prologue() to prepare
>> the access to the local variable is unsafe, because it is done before the safepoint and it races
>> with T returning to execute bytecodes making its stack not walkable. The included test shows that
>> this can crash the VM if T wins the race.
>>
>> Manual testing:
>>
>> - new test test/hotspot/jtreg/serviceability/jvmti/GetLocalVariable/GetLocalWithoutSuspendTest.java
>> - test/hotspot/jtreg/vmTestbase/nsk/jvmti
>> - test/hotspot/jtreg/serviceability/jvmti
>>
>> Nightly regression tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015,
>> Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms
>>
>> Thanks, Richard.
>>
>> [1] https://docs.oracle.com/en/java/javase/14/docs/specs/jvmti.html#local
>
More information about the serviceability-dev
mailing list