Regarding 8033626: C2: Merging exception state: reexecute=true vs reexecute=false
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Fri Jun 6 23:12:40 UTC 2014
>>> The test checks deoptimization for null check at the beginning of
>>> intrinsic. Additional testing should be done for OOM exception coming
>>> from allocation (during runtime call). Trigger OOM and run with
>>> StressCompiledExceptionHandlers. With this flag code generated by
>>> GraphKit::gen_stub() will be executed and
>>> StubRoutines::forward_exception_entry() will be called. What will happen
>>> in this case?
>> I tested such case manually (instrumented new_instance_C to throw OOM).
>> The exception is forwarded (and pending exception is cleared) to the
>> handler with contains uncommon trap. After deoptimization is done,
>
> So StubRoutines::forward_exception_entry() will call
> SharedRuntime::raw_exception_handler_for_return_address(). Which one
> code succeed?:
>
> nm->is_deopt_pc(return_address) and unpack_with_exception()
> or it calls nm->exception_begin()
>
> I am trying to understand how it jumps to the uncommon trap code. Can
> you show here the sequence of calls/jumps?
It calls nm->exception_begin(). I'll send you complete sequence on Monday.
Best regards,
Vladimir Ivanov
>
> thanks,
> Vladimir
>
>> Object::clone is reexecuted and it goes to JVM_Clone (diagnostic output
>> [1]).
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1]
>> new_instance_C: OOM
>> Exception <a 'java/lang/OutOfMemoryError'> (0x0000000115766790)
>> thrown
>> [/Users/vladimir/ws/jdk/ws/jdk9-dev/hotspot/src/share/vm/opto/runtime.cpp,
>>
>> line 241]
>> for thread 0x00007f8dc881e800
>> 1 [Exception (): a 'java/lang/OutOfMemoryError' in {method}
>> {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in public
>> synchronized 'Test' at 0x0000000111ff8818]
>> Uncommon trap occurred in Test::f (@0x0000000111ff8844) thread=6403
>> reason=unhandled action=none unloaded_class_index=-1
>> DEOPT PACKING thread 0x00007f8dc881e800 Compiled frame
>> (sp=0x0000000111d6b4c0 unextended sp=0x0000000111d6b4c0,
>> fp=0x0000000115766bc0, real_fp=0x0000000111d6b4e0, pc=0x0000000111ff8844)
>> nmethod 1568 1 Test::f (5 bytes)
>>
>> Virtual frames (innermost first):
>> 0 - frame( sp=0x0000000111d6b4c0,
>> unextended_sp=0x0000000111d6b4c0, fp=0x0000000115766bc0,
>> pc=0x0000000111ff8844)
>> Test.f(Test.java:67) - invokevirtual @ bci 1
>> {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in
>> public synchronized 'Test'
>> bci: 1
>> locals:
>> 0 conflict
>> expressions:
>> 0 a 'Test$A' <0x0000000115766bc0>
>>
>> Created vframeArray 0x00007f8dc882a618
>> DEOPT UNPACKING thread 0x00007f8dc881e800 vframeArray 0x00007f8dc882a618
>> mode 2
>> [1 Interpreted Frame]
>> Interpreted frame (sp=0x0000000111d6b488 unextended
>> sp=0x0000000111d6b488, fp=0x0000000111d6b4d0,
>> real_fp=0x0000000111d6b4d0, pc=0x0000000111e66b00)
>> (~deoptimization entry points [0x0000000111e66b00, 0x0000000111e75a20]
>> 61216 bytes
>> )
>> BufferBlob (0x0000000111e60590) used for Interpreter
>>
>> - local [0x0000000000000000]; #0
>> - stack [0x0000000115766bc0]; #0
>> - monitor[0x0000000111d6b490]
>> - bcp [0x00000001566011c9]; @1
>> - locals [0x0000000111d6b4e0]
>> - method [0x00000001566011e8]; static jobject Test.f(jobject)
>> {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in
>> public synchronized 'Test'
>> bci: 1
>> locals:
>> 0 NULL <0x0000000000000000>
>> locals size 1
>> expression size 1
>> {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in public
>> synchronized 'Test'
>> JVM_Clone
>>
>>> Thanks,
>>> Vladimir K
>>>
>>>
>>> On 6/6/14 10:46 AM, Vladimir Ivanov wrote:
>>>> I played with uncommon trap approach and it works fine.
>>>>
>>>> While I'm finishing extensive testing, I'd be grateful for early
>>>> feedback.
>>>>
>>>> Preliminary fix:
>>>> http://cr.openjdk.java.net/~vlivanov/8033626/webrev.00/
>>>>
>>>> The fix is to bail out to interpreter if any exception is generated in
>>>> new_instance_C. There's no need to update exception state. When
>>>> reexecuting the call, native version of Object::clone (JVM_Clone) will
>>>> be invoked.
>>>>
>>>> Controlling exception handling explicitly by passing
>>>> deoptimize_on_exception parameters looks cleaner to me, than having a
>>>> check (is_LibraryCallKit() or JVMS reexecute bit) burden in
>>>> GraphKit::make_slow_call_ex.
>>>>
>>>> I fixed only Object::clone case, since I don't see other places where
>>>> similar IR shape can be produced.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> On 6/3/14 2:27 AM, Vladimir Kozlov wrote:
>>>>> Unfortunately it may not be a whole solution. The uncommon trap may
>>>>> still needs to be recorded as an exceptions handler for compiled
>>>>> code so
>>>>> that when we return from new_instance_C() runtime call with OOM
>>>>> exception it will find it. See GraphKit::gen_stub(),
>>>>> generate_catch_exception() and
>>>>> SharedRuntime::raw_exception_handler_for_return_address()
>>>>>
>>>>> Also do additional tests with StressCompiledExceptionHandlers
>>>>> otherwise
>>>>> we deoptimize immediately on the return from new_instance_C().
>>>>>
>>>>> In short, please investigate all return paths from new_instance_C()
>>>>> when
>>>>> there is pending OOM exception and we have uncommon trap instead of
>>>>> CreateExNode.
>>>>>
>>>>> thanks,
>>>>> Vladimir
>>>>>
>>>>> On 6/2/14 12:21 PM, Vladimir Ivanov wrote:
>>>>>> Yes, uncommon trap should solve the problem.
>>>>>> Thanks for excellent idea!
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> On 5/31/14 6:30 AM, Vladimir Kozlov wrote:
>>>>>>> I thought about this more. I think we should not generate
>>>>>>> exceptions in
>>>>>>> intrinsic when we are going to re-execute the method. Interpreter
>>>>>>> will
>>>>>>> throw exception if needed and intrinsic has wrong bci anyway (bci of
>>>>>>> call site and not allocation). *We should generate plain "intrinsic"
>>>>>>> uncommon trap in such case*.
>>>>>>> There are GraphKit::is_LibraryCallKit() and is_Parse() methods which
>>>>>>> allow to check where the request for exception generation is coming
>>>>>>> from.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 5/29/14 7:00 PM, Vladimir Kozlov wrote:
>>>>>>>> Now I understand the problem. I think we need to keep separate an
>>>>>>>> exception code which requires re-execution. Currently
>>>>>>>> we don't check for should_reexecute() when we merge or search for
>>>>>>>> existing exception code (we only compare type).
>>>>>>>>
>>>>>>>> Vladimir K.
>>>>>>>>
>>>>>>>> On 5/29/14 2:14 AM, Vladimir Ivanov wrote:
>>>>>>>>> The problem is with exception state of two branches [1].
>>>>>>>>> Exception on fast path comes from GraphKit::new_instance during
>>>>>>>>> intrinsic construction in LibraryCallKit::inline_native_clone.
>>>>>>>>> It is
>>>>>>>>> executed under should_reexecute=true. PreserveReexecuteState
>>>>>>>>> doesn't
>>>>>>>>> touch exception state, so
>>>>>>>>> kit.add_exception_states_from(new_jvms) (@
>>>>>>>>> callGenerator.cpp:703) tries to accumulate incompatible exceptions
>>>>>>>>> from
>>>>>>>>> fast & slow paths.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vladimir Ivanov
>>>>>>>>>
>>>>>>>>> [1] FAST PATH (intrinsic):
>>>>>>>>>
>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=10 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>> sp=1
>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>
>>>>>>>>> Exception state:
>>>>>>>>>
>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=9 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>> sp=0
>>>>>>>>> bci=12 reexecute=true method=virtual jobject
>>>>>>>>>
>>>>>>>>> SLOW PATH
>>>>>>>>>
>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=10 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>> sp=1
>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>
>>>>>>>>> Exception state:
>>>>>>>>>
>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=9 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>> sp=0
>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/29/14 2:18 AM, Vladimir Kozlov wrote:
>>>>>>>>>> PredictedIntrinsicGenerator was added to have a general
>>>>>>>>>> predicate.
>>>>>>>>>> For
>>>>>>>>>> example, AES crypto predicate checks klass of an object pointed
>>>>>>>>>> by a
>>>>>>>>>> field. PredictedCallGenerator checks klass of method's holder
>>>>>>>>>> only.
>>>>>>>>>>
>>>>>>>>>> Both cases would have the same problem with reexecute. Using
>>>>>>>>>> PredictedIntrinsicGenerator will not help you.
>>>>>>>>>>
>>>>>>>>>> But you should not have this problem at the merge point, because
>>>>>>>>>> reexecute should be reset back after intrinsic is generated:
>>>>>>>>>>
>>>>>>>>>> { PreserveReexecuteState preexecs(this);
>>>>>>>>>> jvms()->set_should_reexecute(true);
>>>>>>>>>>
>>>>>>>>>> It is strange that it is still true after PreserveReexecuteState
>>>>>>>>>> destructor.
>>>>>>>>>>
>>>>>>>>>> Vladimir K
>>>>>>>>>>
>>>>>>>>>> On 5/28/14 9:37 AM, Vladimir Ivanov wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I'm looking at 8033626 [1]:
>>>>>>>>>>> "assert(ex_map->jvms()->same_calls_as(_exceptions->jvms()))
>>>>>>>>>>> failed:
>>>>>>>>>>> all
>>>>>>>>>>> collected exceptions must come from the same place".
>>>>>>>>>>>
>>>>>>>>>>> 8014447 [2] introduced new code shape for virtual intrinsifiable
>>>>>>>>>>> methods
>>>>>>>>>>> (Object::hashCode & Object::clone): "invokevirtual
>>>>>>>>>>> Object.clone()"
>>>>>>>>>>> can
>>>>>>>>>>> be represented as a PredictedCallGenerator with
>>>>>>>>>>> LibraryIntrinsic on
>>>>>>>>>>> fast
>>>>>>>>>>> path and VirtualCallGenerator on slow path.
>>>>>>>>>>>
>>>>>>>>>>> The problem arises for Object::clone, because for OOM case
>>>>>>>>>>> intrinsified
>>>>>>>>>>> version requires reexecution (reexecute=true), but slow path
>>>>>>>>>>> (virtual
>>>>>>>>>>> dispatch) doesn't (reexecute=false). Further attempt to finish
>>>>>>>>>>> the
>>>>>>>>>>> diamond and merge 2 exception states fires the assertion.
>>>>>>>>>>>
>>>>>>>>>>> So, I have 2 questions:
>>>>>>>>>>> (1) I don't see any way to merge reexecute=true &
>>>>>>>>>>> reexecute=false
>>>>>>>>>>> states, so both branches should have the same reexecute status.
>>>>>>>>>>> Am I
>>>>>>>>>>> missing something?
>>>>>>>>>>>
>>>>>>>>>>> (2) Does PredictedCallGenerator actually support such shape
>>>>>>>>>>> (intrinsic on fast path)? There are 2 call generators:
>>>>>>>>>>> PredictedIntrinsicGenerator & PredictedCallGenerator. What's the
>>>>>>>>>>> difference?
>>>>>>>>>>>
>>>>>>>>>>> I see there's a way to customize predicate in
>>>>>>>>>>> LibraryCallKit::try_to_predicate() for
>>>>>>>>>>> PredictedIntrinsicGenerator,
>>>>>>>>>>> but
>>>>>>>>>>> is it the only reason why PredictedIntrinsicGenerator was
>>>>>>>>>>> introduced?
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>
>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8033626
>>>>>>>>>>>
>>>>>>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8014447
>>>>>>>>>>> "Object.hashCode intrinsic breaks inline caches"
More information about the hotspot-compiler-dev
mailing list