[9] RFR (XS): 8036588 : VerifyFieldClosure fails instanceKlass:3133

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri Jul 11 20:51:57 UTC 2014


Sorry, I grepped only opto/ directory.

webrev.02 looks good.

Thanks,
Vladimir

On 7/11/14 1:15 PM, David Chase wrote:
> Whoops -- return_value_is_used is used by other code.
>
> ./cpu/x86/vm/x86_32.ad:      if ((rt == T_FLOAT || rt == T_DOUBLE) && !return_value_is_used()) {
>
>        if ((rt == T_FLOAT || rt == T_DOUBLE) && !return_value_is_used()) {
>          // A C runtime call where the return value is unused.  In SSE2+
>          // mode the result needs to be removed from the FPU stack.  It's
>          // likely that this function call could be removed by the
>          // optimizer if the C function is a pure function.
>          __ ffree(0);
>
> How about that?
> I think I will put that method back and keep the new modified one under the new name; it seems that this depends on the original intent of that code.
>
> New webrev: http://cr.openjdk.java.net/~drchase/8036588/webrev.02
> I'm doing a test run with jprt.
>
> David
>
> On 2014-07-11, at 3:42 PM, David Chase <david.r.chase at oracle.com> wrote:
>
>> Thanks much (and Igor also, and John Rose, who I just now realize I left off the list of reviewers), it is now in the hands of the Meddling Medium.
>>
>> On 2014-07-11, at 1:50 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>>> Looks good.
>>>
>>> Vladimir
>>>
>>> On 7/11/14 9:34 AM, David Chase wrote:
>>>> How's this look?
>>>>
>>>> http://cr.openjdk.java.net/~drchase/8036588/webrev.01/
>>>>
>>>> Retested with jtreg, also (this time) 900 reps of the bug-tickling test with no error.
>>>>
>>>> David
>>>>
>>>> On 2014-07-09, at 7:02 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>>
>>>>> Yes, the return_value_is_used() method is incorrect, it does not take into account that the returned value could be used after deoptimization for reallocated objects.
>>>>>
>>>>> return_oop is checked by deoptimization only when EliminateAllocations is true. Only in such case and when objects are reallocated GC can happen during deoptimization.
>>>>>
>>>>> The fix will also preserve a return value when it is really not used:
>>>>>
>>>>> A a = new A();
>>>>> (void)foo();
>>>>> a.f = 1;
>>>>>
>>>>> or simple
>>>>>
>>>>> (void)foo();
>>>>>
>>>>> Which is fine since deoptimization code is not critical for performance.
>>>>>
>>>>> return_value_is_used() is used only by this code. I would suggest to rename it to returns_pointer() and copy-paste code from CallNode::returns_pointer() (MachCallNode is not based on CallNode). And use it in the check.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/8/14 12:13 PM, David Chase wrote:
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8036588 (closed because only seen in SQE)
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~drchase/8036588/webrev.00/
>>>>>>
>>>>>> cause+fix:
>>>>>> The root cause is use of the wrong liveness information at deoptimization point.
>>>>>> The old code uses the optimizer's notion of "live" -- but deoptimization transfers to the
>>>>>> interpreter and which can (will) manipulate values that are dead to the optimizer.
>>>>>> The trigger is very tricky -- the following things need to happen:
>>>>>>
>>>>>> 1) an object D that will be dead is allocated
>>>>>> 2) a method M is invoked that returns an object F, to only be stored in a field f of D
>>>>>> 3) the optimizer eliminates the allocation of D and the storefield into D.f
>>>>>> 4) deoptimization hits an execution of M; deoptimization reallocates D for the
>>>>>>     interpreter; BUT the reallocation triggers a GC, which would forward F if
>>>>>>     it had been correctly noted as live out of the call to M (but the bug is that it
>>>>>>     was not).
>>>>>> 5) the interpreter evaluates D.f = F (this succeeds)
>>>>>> 6) before the frame with D in it exits, ANOTHER garbage collection occurs (or perhaps
>>>>>>      GC was running concurrently in some way) and attempts to trace/copy through
>>>>>>      D and D.f.
>>>>>> 7) Hilarity ensues.
>>>>>> 8) For extra giggles, this has only ever been observed with -Xmx=32G (or the corresponding
>>>>>>      -XX:MaxRAMFraction= option) plus of course -XX:+DeoptimizeALot.  Also setting
>>>>>>     -XX:DeoptimizeALotInterval=1 increases the failure rate to about 10% of test runs.
>>>>>>     There's some additional missing context, because following this recipe to write a simpler
>>>>>>     test for public consumption did not result in a crashing program.
>>>>>>
>>>>>> fix: Use a simpler test for "pointer is live from M" -- if the return type is an object,
>>>>>>       then it is "live", at least for the interpreter.
>>>>>>
>>>>>> testing:
>>>>>>   jtreg of runtime, gc, compiler
>>>>>>
>>>>>>   got to the point where I could see fails often enough for the two tests known to trigger this,
>>>>>>   and after the fix neither test was observed to fail even once, even with hundreds of repetitions.
>>>>>>
>>>>
>>
>


More information about the hotspot-compiler-dev mailing list