[9] RFR(S): 8029443: 'assert(klass->is_loader_alive(_is_alive)) failed: must be alive' during VM_CollectForMetadataAllocation
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 25 17:27:41 UTC 2014
Tobias, you also need to check for general metadata and not just
klassptr - is_metadataptr(). Unfortunately they not inherited from
one-another in C2 types.
Thanks,
Vladimir
On 7/25/14 6:54 AM, Tobias Hartmann wrote:
> Mikael, Vladimir, thanks for the review.
>
> The problem is indeed caused by a missing check for a metadata pointer
> in sparc.ad.
>
> Adding 'n->bottom_type()->isa_klassptr()' checks to immP_load() and
> immP_no_oop_cheap() fixes the problem. The Klass pointer is then loaded
> from the constant table (loadConP_load()) and a metadata relocation is
> added by Compile::ConstantTable::emit().
>
> I had a look at the Aurora chessboard and it looks like as if the bug
> recently occured on x86_32 as well. I was not yet able to reproduce it
> but will try again next week.
>
> Thanks,
> Tobias
>
> On 22.07.2014 20:04, Vladimir Kozlov wrote:
>> I agree with Mikael, the case without compressed oops is incorrect.
>> The problem is immP_load() and immP_no_oop_cheap() operands miss the
>> check for metadata pointer, they only check for oop. For class loading
>> loadConP_load() should be used instead of loadConP_no_oop_cheap() and
>> it should check relocation type and do what loadConP_set() does.
>>
>>
>> X64 seems fine. X86_64.ad use $$$emit32$src$$constant; in such case
>> (load_immP31) which is expanded to the same code as load_immP:
>>
>> if ( opnd_array(1)->constant_reloc() != relocInfo::none ) {
>> emit_d32_reloc(cbuf, opnd_array(1)->constant(),
>> opnd_array(1)->constant_reloc(), 0);
>>
>> Thanks,
>> Vladimir
>>
>> On 7/22/14 2:22 AM, Tobias Hartmann wrote:
>>> On 21.07.2014 20:59, Vladimir Kozlov wrote:
>>>> On 7/21/14 1:44 AM, Tobias Hartmann wrote:
>>>>> Vladimir, Coleen, thanks for the reviews!
>>>>>
>>>>> On 18.07.2014 20:09, Vladimir Kozlov wrote:
>>>>>> On 7/18/14 11:02 AM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> On 7/18/14, 11:06 AM, Vladimir Kozlov wrote:
>>>>>>>> On 7/18/14 4:38 AM, Tobias Hartmann wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I spend some more days and was finally able to implement a test
>>>>>>>>> that
>>>>>>>>> deterministically triggers the bug:
>>>>>>>>
>>>>>>>> Why do you need to switch off compressed oops? Do you need to
>>>>>>>> switch
>>>>>>>> off compressed klass pointers too (UseCompressedClassPointers)?
>>>>>>>
>>>>>>> CompressedOops when off turns off CompressedClassPointers.
>>>>>>
>>>>>> You are right, I forgot that. Still the question is why switch off
>>>>>> coop?
>>>>>
>>>>> I'm only able to reproduce the bug without compressed oops. The
>>>>> original
>>>>> bug also only reproduces with -XX:-UseCompressedOops. I tried to
>>>>> figure
>>>>> out why (on Sparc):
>>>>>
>>>>> With compressed oops enabled, Method* metadata referencing
>>>>> 'WorkerClass'
>>>>> is added to 'doWork' in MacroAssembler::set_narrow_klass(..). In
>>>>> CodeBuffer::finalize_oop_references(..) the metadata is processed
>>>>> and an
>>>>> oop to the class loader 'URLClassLoader' is added. This oop leads
>>>>> to the
>>>>> unloading of 'doWork', hence the verification code is never executed.
>>>>>
>>>>> I'm not sure what set_narrow_klass(..) is used for in this case. I
>>>>> assume it stores a 'WorkerClass' Klass* in a register as part of an
>>>>> optimization? Because 'doWork' potentially works on any class.
>>>>> Apparently this optimization is not performed without compressed oops.
>>>>
>>>> I would suggest to compare 'doWork' assembler
>>>> (-XX:CompileCommand=print,TestMethodUnloading::doWork) with coop and
>>>> without it. Usually loaded into register class is used for klass
>>>> compare do guard inlining code. Or to initialize new object.
>>>>
>>>> I don't see loading (constructing) uncompressed (whole) klass pointer
>>>> from constant in sparc.ad. It could be the reason for different
>>>> behavior. It could be loaded from constants section. But constants
>>>> section should have metadata relocation info in such case.
>>>
>>> I did as you suggested and found the following:
>>>
>>> During the profiling phase the class given to 'doWork' always is
>>> 'WorkerClass'. The C2 compiler therefore optimizes the compiled version
>>> to expect a 'WorkerClass'. The branch that instantiates a new object is
>>> guarded by an uncommon trap (class_check). The difference between the
>>> two versions (with and without compressed oops) is the loading of the
>>> 'WorkerClass' Klass to check if the given class is equal:
>>>
>>> With compressed oops:
>>> SET narrowklass: precise klass WorkerClass:
>>> 0x00000001004a0d40:Constant:exact *,R_L1 ! compressed klass ptr
>>> CWBne R_L2,R_L1,B8 ! compressed ptr P=0.000001 C=-1.000000
>>>
>>> Without:
>>> SET precise klass WorkerClass: 0x00000001004aeab0:Constant:exact
>>> *,R_L1 ! non-oop ptr
>>> CXBpne R_L2,R_L1,B8 ! ptr P=0.000001 C=-1.000000
>>>
>>> R_L2: class given as parameter
>>> B8: location of uncommon trap
>>>
>>> In the first case, the Klass is loaded by a 'loadConNKlass' instruction
>>> that calls MacroAssembler::set_narrow_klass(..) which then creates a
>>> metadata_Relocation for the 'WorkerClass'. This metada_Relocation is
>>> processed by CodeBuffer::finalize_oop_references(..) and an oop to
>>> 'WorkerClass' is added. This oop causes the unloading of the method.
>>>
>>> In the second case, the Klass is loaded by a 'loadConP_no_oop_cheap'
>>> instruction that does not create a metadata_Relocation.
>>>
>>> I don't understand why the metadata_Relocation in the first case is
>>> needed? As the test shows it is better to only unload the method if we
>>> hit the uncommon trap because we could still use other (potentially
>>> complex) branches of the method.
>>>
>>> Thanks,
>>> Tobias
>>>
>>>>
>>>> thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Best,
>>>>> Tobias
>>>>>
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8029443/webrev.02/
>>>>>>>>
>>>>>>>> Very nice!
>>>>>>>
>>>>>>> Yes, I agree. Impressive.
>>>>>>>
>>>>>>> The refactoring in nmethod.cpp looks good to me. I have no further
>>>>>>> comments.
>>>>>>> Thanks!
>>>>>>> Coleen
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> @Vladimir: The test shows why we should only clean the ICs but not
>>>>>>>>> unload the nmethod if possible. The method ' doWork'
>>>>>>>>> is still valid after WorkerClass was unloaded and depending on the
>>>>>>>>> complexity of the method we should avoid unloading it.
>>>>>>>>
>>>>>>>> Make sense.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sparc my patch fixes the bug and leads to the nmethod not being
>>>>>>>>> unloaded. The compiled version is therefore used even
>>>>>>>>> after WorkerClass is unloaded.
>>>>>>>>>
>>>>>>>>> On x86 the nmethod is unloaded anyway because of a dead oop.
>>>>>>>>> This is
>>>>>>>>> probably due to a slightly different implementation
>>>>>>>>> of the ICs. I'll have a closer look to see if we can improve that.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Tobias
>>>>>>>>>
>>>>>>>>> On 16.07.2014 10:36, Tobias Hartmann wrote:
>>>>>>>>>> Sorry, forgot to answer this question:
>>>>>>>>>>> Were you able to create a small test case for it that would be
>>>>>>>>>>> useful to add?
>>>>>>>>>> Unfortunately I was not able to create a test. The bug only
>>>>>>>>>> reproduces on a particular system with a > 30 minute run
>>>>>>>>>> of runThese.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Tobias
>>>>>>>>>>
>>>>>>>>>> On 16.07.2014 09:54, Tobias Hartmann wrote:
>>>>>>>>>>> Hi Coleen,
>>>>>>>>>>>
>>>>>>>>>>> thanks for the review.
>>>>>>>>>>>> *+ if (csc->is_call_to_interpreted() &&
>>>>>>>>>>>> stub_contains_dead_metadata(is_alive, csc->destination())) {*
>>>>>>>>>>>> *+ csc->set_to_clean();*
>>>>>>>>>>>> *+ }*
>>>>>>>>>>>>
>>>>>>>>>>>> This appears in each case. Can you fold it and the new
>>>>>>>>>>>> function
>>>>>>>>>>>> into a function like
>>>>>>>>>>>> clean_call_to_interpreted_stub(is_alive, csc)?
>>>>>>>>>>>
>>>>>>>>>>> I folded it into the function
>>>>>>>>>>> clean_call_to_interpreter_stub(..).
>>>>>>>>>>>
>>>>>>>>>>> New webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8029443/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Tobias
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Coleen
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So before the permgen removal embedded method* were oops and
>>>>>>>>>>>>> they
>>>>>>>>>>>>> were processed in relocInfo::oop_type loop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> May be instead of specializing opt_virtual_call_type and
>>>>>>>>>>>>> static_call_type call site you can simple add a loop for
>>>>>>>>>>>>> relocInfo::metadata_type (similar to oop_type loop)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/14/14 4:56 AM, Tobias Hartmann wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> please review the following patch for JDK-8029443.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8029443
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8029443/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Problem*
>>>>>>>>>>>>>> After the tracing/marking phase of GC,
>>>>>>>>>>>>>> nmethod::do_unloading(..)
>>>>>>>>>>>>>> checks
>>>>>>>>>>>>>> if a nmethod can be unloaded because it contains dead
>>>>>>>>>>>>>> oops. If
>>>>>>>>>>>>>> class
>>>>>>>>>>>>>> unloading occurred we additionally clear all ICs where the
>>>>>>>>>>>>>> cached
>>>>>>>>>>>>>> metadata refers to an unloaded klass or method. If the
>>>>>>>>>>>>>> nmethod
>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>> unloaded, nmethod::verify_metadata_loaders(..) finally
>>>>>>>>>>>>>> checks if
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> metadata is alive. The assert in CheckClass::check_class
>>>>>>>>>>>>>> fails
>>>>>>>>>>>>>> because
>>>>>>>>>>>>>> the nmethod contains Method* metadata corresponding to a dead
>>>>>>>>>>>>>> Klass.
>>>>>>>>>>>>>> The Method* belongs to a to-interpreter stub [1] of an
>>>>>>>>>>>>>> optimized
>>>>>>>>>>>>>> compiled IC. Normally we clear those stubs prior to
>>>>>>>>>>>>>> verification to
>>>>>>>>>>>>>> avoid dangling references to Method* [2], but only if the
>>>>>>>>>>>>>> stub
>>>>>>>>>>>>>> is not in
>>>>>>>>>>>>>> use, i.e. if the IC is not in to-interpreted mode. In this
>>>>>>>>>>>>>> case the
>>>>>>>>>>>>>> to-interpreter stub may be executed and hand a stale Method*
>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>> interpreter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Solution
>>>>>>>>>>>>>> *The implementation of nmethod::do_unloading(..) is
>>>>>>>>>>>>>> changed to
>>>>>>>>>>>>>> clean
>>>>>>>>>>>>>> compiled ICs and compiled static calls if they call into a
>>>>>>>>>>>>>> to-interpreter stub that references dead Method* metadata.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The patch was affected by the G1 class unloading changes
>>>>>>>>>>>>>> (JDK-8048248)
>>>>>>>>>>>>>> because the method nmethod::do_unloading_parallel(..) was
>>>>>>>>>>>>>> added. I
>>>>>>>>>>>>>> adapted the implementation as well.
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>> *Failing test (runThese)
>>>>>>>>>>>>>> JPRT
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Tobias
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] see CompiledStaticCall::emit_to_interp_stub(..)
>>>>>>>>>>>>>> [2] see nmethod::verify_metadata_loaders(..),
>>>>>>>>>>>>>> static_stub_reloc()->clear_inline_cache() clears the stub
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>
More information about the hotspot-dev
mailing list