Regarding 8033626: C2: Merging exception state: reexecute=true	vs reexecute=false
    Vladimir Kozlov 
    vladimir.kozlov at oracle.com
       
    Mon Jun  9 16:11:11 UTC 2014
    
    
  
Thank you for this information. Your fix is good then.
Thanks,
Vladimir
On 6/9/14 4:16 AM, Vladimir Ivanov wrote:
> Here's the call sequence (excerpts from compiled method disassembly [1] and copy-paste from debugging session [2]):
>
>    - compiled method calls _new_instance_Java (@ 0x10519d791) (OptoRuntime::new_instance_C)
>
>    -  OptoRuntime::new_instance_C throws OOM exception and _new_instance_Java calls StubRoutines::forward exception
>
>    - StubRoutines::forward exception ==> SharedRuntime::exception_handler_for_return_address ==>
> SharedRuntime::raw_exception_handler_for_return_address, which returns nm->exception_begin() (== 0x10519d860)
>
>     - exception handler in compiled method calls exception blob, which calls OptoRuntime::handle_exception_C ==>
> OptoRuntime::handle_exception_C_helper
>
>     - OptoRuntime::handle_exception_C_helper returns the address of exception handler with uncommon trap in compiled
> method ( == 0x10519d83a)
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> ...
>    0x000000010519d77c: mov    0x70(%r15),%rbx
>    0x000000010519d780: mov    %rbx,%r10
>    0x000000010519d783: add    $0x330,%r10
>    0x000000010519d78a: cmp    0x80(%r15),%r10
>    0x000000010519d791: jae    0x000000010519d809
> ...
>    # _new_instance_Java
>    0x000000010519d809: movabs $0x7c0060240,%rsi  ; {metadata(synchronized 'Test$A')}
>    0x000000010519d813: callq  0x0000000105166260  ; OopMap{rbp=Oop off=184}
>                                                  ;*invokevirtual clone
>                                                  ; - Test::f at 1 (line 67)
>                                                  ;   {runtime_call} _new_istance_Java
> ...
>    # exception handler
>    0x000000010519d83a: mov    $0xffffffa7,%esi
>    0x000000010519d83f: callq  0x00000001051074e0  ; OopMap{rbp=Oop off=228}
>                                                  ;*invokevirtual clone
>                                                  ; - Test::f at 1 (line 67)
>                                                  ;   {runtime_call} UncommonTrapBlob
>    0x000000010519d844: callq  0x0000000101c045f6  ;*invokevirtual clone
>                                                  ; - Test::f at 1 (line 67)
>                                                  ;   {runtime_call} os::breakpoint()
> ...
> [Exception Handler]
> [Stub Code]
>    0x000000010519d860: jmpq   0x00000001051669e0  ;   {no_reloc}
>
>
> [2] http://cr.openjdk.java.net/~vlivanov/8033626/oom_handling.txt
>
> On 6/7/14 3:12 AM, Vladimir Ivanov wrote:
>>
>>  >>> The test checks deoptimization for null check at the beginning of
>>>>> intrinsic. Additional testing should be done for OOM exception coming
>>>>> from allocation (during runtime call). Trigger OOM and run with
>>>>> StressCompiledExceptionHandlers. With this flag code generated by
>>>>> GraphKit::gen_stub() will be executed and
>>>>> StubRoutines::forward_exception_entry() will be called. What will
>>>>> happen
>>>>> in this case?
>>>> I tested such case manually (instrumented new_instance_C to throw OOM).
>>>> The exception is forwarded (and pending exception is cleared) to the
>>>> handler with contains uncommon trap. After deoptimization is done,
>>>
>>> So StubRoutines::forward_exception_entry() will call
>>> SharedRuntime::raw_exception_handler_for_return_address(). Which one
>>> code succeed?:
>>>
>>> nm->is_deopt_pc(return_address) and unpack_with_exception()
>>> or it calls nm->exception_begin()
>>>
>>> I am trying to understand how it jumps to the uncommon trap code. Can
>>> you show here the sequence of calls/jumps?
>> It calls nm->exception_begin(). I'll send you complete sequence on Monday.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>>
>>> thanks,
>>> Vladimir
>>>
>>>> Object::clone is reexecuted and it goes to JVM_Clone (diagnostic output
>>>> [1]).
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1]
>>>> new_instance_C: OOM
>>>> Exception <a 'java/lang/OutOfMemoryError'> (0x0000000115766790)
>>>> thrown
>>>> [/Users/vladimir/ws/jdk/ws/jdk9-dev/hotspot/src/share/vm/opto/runtime.cpp,
>>>>
>>>>
>>>> line 241]
>>>> for thread 0x00007f8dc881e800
>>>> 1 [Exception (): a 'java/lang/OutOfMemoryError' in {method}
>>>> {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in public
>>>> synchronized 'Test' at 0x0000000111ff8818]
>>>> Uncommon trap occurred in Test::f (@0x0000000111ff8844) thread=6403
>>>> reason=unhandled action=none unloaded_class_index=-1
>>>> DEOPT PACKING thread 0x00007f8dc881e800 Compiled frame
>>>> (sp=0x0000000111d6b4c0 unextended sp=0x0000000111d6b4c0,
>>>> fp=0x0000000115766bc0, real_fp=0x0000000111d6b4e0,
>>>> pc=0x0000000111ff8844)
>>>>       nmethod   1568    1             Test::f (5 bytes)
>>>>
>>>>       Virtual frames (innermost first):
>>>>          0 - frame( sp=0x0000000111d6b4c0,
>>>> unextended_sp=0x0000000111d6b4c0, fp=0x0000000115766bc0,
>>>> pc=0x0000000111ff8844)
>>>> Test.f(Test.java:67) - invokevirtual @ bci 1
>>>>      {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in
>>>> public synchronized 'Test'
>>>>      bci:    1
>>>>      locals:
>>>>         0 conflict
>>>>      expressions:
>>>>         0 a 'Test$A' <0x0000000115766bc0>
>>>>
>>>>       Created vframeArray 0x00007f8dc882a618
>>>> DEOPT UNPACKING thread 0x00007f8dc881e800 vframeArray 0x00007f8dc882a618
>>>> mode 2
>>>> [1 Interpreted Frame]
>>>> Interpreted frame (sp=0x0000000111d6b488 unextended
>>>> sp=0x0000000111d6b488, fp=0x0000000111d6b4d0,
>>>> real_fp=0x0000000111d6b4d0, pc=0x0000000111e66b00)
>>>> (~deoptimization entry points  [0x0000000111e66b00, 0x0000000111e75a20]
>>>>   61216 bytes
>>>> )
>>>>       BufferBlob (0x0000000111e60590) used for Interpreter
>>>>
>>>>   - local  [0x0000000000000000]; #0
>>>>   - stack  [0x0000000115766bc0]; #0
>>>>   - monitor[0x0000000111d6b490]
>>>>   - bcp    [0x00000001566011c9]; @1
>>>>   - locals [0x0000000111d6b4e0]
>>>>   - method [0x00000001566011e8]; static jobject Test.f(jobject)
>>>>      {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in
>>>> public synchronized 'Test'
>>>>      bci:    1
>>>>      locals:
>>>>         0 NULL <0x0000000000000000>
>>>> locals size     1
>>>> expression size 1
>>>> {method} {0x00000001566011e8} 'f' '(LTest;)Ljava/lang/Object;' in public
>>>> synchronized 'Test'
>>>> JVM_Clone
>>>>
>>>>> Thanks,
>>>>> Vladimir K
>>>>>
>>>>>
>>>>> On 6/6/14 10:46 AM, Vladimir Ivanov wrote:
>>>>>> I played with uncommon trap approach and it works fine.
>>>>>>
>>>>>> While I'm finishing extensive testing, I'd be grateful for early
>>>>>> feedback.
>>>>>>
>>>>>> Preliminary fix:
>>>>>> http://cr.openjdk.java.net/~vlivanov/8033626/webrev.00/
>>>>>>
>>>>>> The fix is to bail out to interpreter if any exception is generated in
>>>>>> new_instance_C. There's no need to update exception state. When
>>>>>> reexecuting the call, native version of Object::clone (JVM_Clone) will
>>>>>> be invoked.
>>>>>>
>>>>>> Controlling exception handling explicitly by passing
>>>>>> deoptimize_on_exception parameters looks cleaner to me, than having a
>>>>>> check (is_LibraryCallKit() or JVMS reexecute bit) burden in
>>>>>> GraphKit::make_slow_call_ex.
>>>>>>
>>>>>> I fixed only Object::clone case, since I don't see other places where
>>>>>> similar IR shape can be produced.
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> On 6/3/14 2:27 AM, Vladimir Kozlov wrote:
>>>>>>> Unfortunately it may not be a whole solution. The uncommon trap may
>>>>>>> still needs to be recorded as an exceptions handler for compiled
>>>>>>> code so
>>>>>>> that when we return from new_instance_C() runtime call with OOM
>>>>>>> exception it will find it. See GraphKit::gen_stub(),
>>>>>>> generate_catch_exception() and
>>>>>>> SharedRuntime::raw_exception_handler_for_return_address()
>>>>>>>
>>>>>>> Also do additional tests with StressCompiledExceptionHandlers
>>>>>>> otherwise
>>>>>>> we deoptimize immediately on the return from new_instance_C().
>>>>>>>
>>>>>>> In short, please investigate all return paths from new_instance_C()
>>>>>>> when
>>>>>>> there is pending OOM exception and we have uncommon trap instead of
>>>>>>> CreateExNode.
>>>>>>>
>>>>>>> thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 6/2/14 12:21 PM, Vladimir Ivanov wrote:
>>>>>>>> Yes, uncommon trap should solve the problem.
>>>>>>>> Thanks for excellent idea!
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vladimir Ivanov
>>>>>>>>
>>>>>>>> On 5/31/14 6:30 AM, Vladimir Kozlov wrote:
>>>>>>>>> I thought about this more. I think we should not generate
>>>>>>>>> exceptions in
>>>>>>>>> intrinsic when we are going to re-execute the method. Interpreter
>>>>>>>>> will
>>>>>>>>> throw exception if needed and intrinsic has wrong bci anyway
>>>>>>>>> (bci of
>>>>>>>>> call site and not allocation). *We should generate plain
>>>>>>>>> "intrinsic"
>>>>>>>>> uncommon trap in such case*.
>>>>>>>>> There are GraphKit::is_LibraryCallKit() and is_Parse() methods
>>>>>>>>> which
>>>>>>>>> allow to check where the request for exception generation is coming
>>>>>>>>> from.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>> On 5/29/14 7:00 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Now I understand the problem. I think we need to keep separate an
>>>>>>>>>> exception code which requires re-execution. Currently
>>>>>>>>>> we don't check for should_reexecute() when we merge or search for
>>>>>>>>>> existing exception code (we only compare type).
>>>>>>>>>>
>>>>>>>>>> Vladimir K.
>>>>>>>>>>
>>>>>>>>>> On 5/29/14 2:14 AM, Vladimir Ivanov wrote:
>>>>>>>>>>> The problem is with exception state of two branches [1].
>>>>>>>>>>> Exception on fast path comes from GraphKit::new_instance during
>>>>>>>>>>> intrinsic construction in LibraryCallKit::inline_native_clone.
>>>>>>>>>>> It is
>>>>>>>>>>> executed under should_reexecute=true. PreserveReexecuteState
>>>>>>>>>>> doesn't
>>>>>>>>>>> touch exception state, so
>>>>>>>>>>> kit.add_exception_states_from(new_jvms) (@
>>>>>>>>>>> callGenerator.cpp:703) tries to accumulate incompatible
>>>>>>>>>>> exceptions
>>>>>>>>>>> from
>>>>>>>>>>> fast & slow paths.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>
>>>>>>>>>>> [1] FAST PATH (intrinsic):
>>>>>>>>>>>
>>>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=10 mon=14 scalar=14 end=14
>>>>>>>>>>> mondepth=0
>>>>>>>>>>> sp=1
>>>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>>>
>>>>>>>>>>> Exception state:
>>>>>>>>>>>
>>>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=9 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>>>> sp=0
>>>>>>>>>>> bci=12 reexecute=true method=virtual jobject
>>>>>>>>>>>
>>>>>>>>>>> SLOW PATH
>>>>>>>>>>>
>>>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=10 mon=14 scalar=14 end=14
>>>>>>>>>>> mondepth=0
>>>>>>>>>>> sp=1
>>>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>>>
>>>>>>>>>>> Exception state:
>>>>>>>>>>>
>>>>>>>>>>> JVMS depth=1 loc=5 stk=9 arg=9 mon=14 scalar=14 end=14 mondepth=0
>>>>>>>>>>> sp=0
>>>>>>>>>>> bci=12 reexecute=false method=virtual jobject
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/29/14 2:18 AM, Vladimir Kozlov wrote:
>>>>>>>>>>>> PredictedIntrinsicGenerator was added to have a general
>>>>>>>>>>>> predicate.
>>>>>>>>>>>> For
>>>>>>>>>>>> example, AES crypto predicate checks klass of an object pointed
>>>>>>>>>>>> by a
>>>>>>>>>>>> field. PredictedCallGenerator checks klass of method's holder
>>>>>>>>>>>> only.
>>>>>>>>>>>>
>>>>>>>>>>>> Both cases would have the same problem with reexecute. Using
>>>>>>>>>>>> PredictedIntrinsicGenerator will not help you.
>>>>>>>>>>>>
>>>>>>>>>>>> But you should not have this problem at the merge point, because
>>>>>>>>>>>> reexecute should be reset back after intrinsic is generated:
>>>>>>>>>>>>
>>>>>>>>>>>>    { PreserveReexecuteState preexecs(this);
>>>>>>>>>>>>      jvms()->set_should_reexecute(true);
>>>>>>>>>>>>
>>>>>>>>>>>> It is strange that it is still true after PreserveReexecuteState
>>>>>>>>>>>> destructor.
>>>>>>>>>>>>
>>>>>>>>>>>> Vladimir K
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/28/14 9:37 AM, Vladimir Ivanov wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm looking at 8033626 [1]:
>>>>>>>>>>>>> "assert(ex_map->jvms()->same_calls_as(_exceptions->jvms()))
>>>>>>>>>>>>> failed:
>>>>>>>>>>>>> all
>>>>>>>>>>>>> collected exceptions must come from the same place".
>>>>>>>>>>>>>
>>>>>>>>>>>>> 8014447 [2] introduced new code shape for virtual
>>>>>>>>>>>>> intrinsifiable
>>>>>>>>>>>>> methods
>>>>>>>>>>>>> (Object::hashCode & Object::clone): "invokevirtual
>>>>>>>>>>>>> Object.clone()"
>>>>>>>>>>>>> can
>>>>>>>>>>>>> be represented as a PredictedCallGenerator with
>>>>>>>>>>>>> LibraryIntrinsic on
>>>>>>>>>>>>> fast
>>>>>>>>>>>>> path and VirtualCallGenerator on slow path.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem arises for Object::clone, because for OOM case
>>>>>>>>>>>>> intrinsified
>>>>>>>>>>>>> version requires reexecution (reexecute=true), but slow path
>>>>>>>>>>>>> (virtual
>>>>>>>>>>>>> dispatch) doesn't (reexecute=false). Further attempt to finish
>>>>>>>>>>>>> the
>>>>>>>>>>>>> diamond and merge 2 exception states fires the assertion.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, I have 2 questions:
>>>>>>>>>>>>>    (1) I don't see any way to merge reexecute=true &
>>>>>>>>>>>>> reexecute=false
>>>>>>>>>>>>> states, so both branches should have the same reexecute status.
>>>>>>>>>>>>> Am I
>>>>>>>>>>>>> missing something?
>>>>>>>>>>>>>
>>>>>>>>>>>>>    (2) Does PredictedCallGenerator actually support such shape
>>>>>>>>>>>>> (intrinsic on fast path)? There are 2 call generators:
>>>>>>>>>>>>> PredictedIntrinsicGenerator & PredictedCallGenerator. What's
>>>>>>>>>>>>> the
>>>>>>>>>>>>> difference?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I see there's a way to customize predicate in
>>>>>>>>>>>>> LibraryCallKit::try_to_predicate() for
>>>>>>>>>>>>> PredictedIntrinsicGenerator,
>>>>>>>>>>>>> but
>>>>>>>>>>>>> is it the only reason why PredictedIntrinsicGenerator was
>>>>>>>>>>>>> introduced?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Vladimir Ivanov
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8033626
>>>>>>>>>>>>>
>>>>>>>>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8014447
>>>>>>>>>>>>>      "Object.hashCode intrinsic breaks inline caches"
    
    
More information about the hotspot-compiler-dev
mailing list