Request for review (#4) 7174978: NPG: Fix bactrace builder for class redefinition
Coleen Phillimore
coleen.phillimore at oracle.com
Tue Jan 15 13:12:58 PST 2013
David,
Thank you for the prompt review!
On 01/14/2013 10:48 PM, David Holmes wrote:
> Hi Coleen,
>
> Thanks for the detailed explanation. Overall approach sounds good to
> me. A few comments ...
>
Thanks!
> Copyrights to 2013 :)
>
It's unclear whether we have to do this. I think there is going to be
some RE process that fixes these files.
> Are you sure you can remove the old merlin API? :D But note that it is
> still referenced in jdk/src/share/javavm/export/jvm.h so a follow up
> cleanup on the JDK side is needed. (Hmmm does that also mean a CCC?)
>
I will file a bug against JDK to clean this up and see what hoops I have
to go through to remove an unused entry. This is the only file where
it appears. It must have had a CCC to remove it in the first place.
> ---
>
> src/share/vm/prims/jvmtiRedefineClasses.cpp
>
> 2411 constantPoolHandle smaller_cp(THREAD,
> 2412 ConstantPool::allocate(loader_data, scratch_cp_length, THREAD));
>
> Can you indent 2412 please so it is more obvious allocate is being
> used to pass a parameter to the handle constructor. (The old code had
> the same flaw).
>
Fixed.
> ---
>
> src/share/vm/oops/constantPool.hpp
>
> ! void add_version(int version) {
> ! if (version == -1) {
> ! _saved._version = version; // keep overflow
> ! } else {
> ! int new_version = version++;
> ! _saved._version = version == INT_MAX ? -1 : version;
> ! }
> ! }
>
> new_version is unused here but I think you meant to do
>
> _saved._version = new_version == INT_MAX ? -1 : version;
>
This was a pre-code-review cleanup... I didn't mean to assign
version++ into new_version. The second line stays the same. I
retested this fix.
> ---
>
> src/share/vm/classfile/javaClasses.cpp
>
> 1350 // Allocate temporary buffer with extra space for formatting
> and line number
> 1351 char* buf = NEW_RESOURCE_ARRAY(char, buf_len + 64);
>
> Can you elaborate on the value 64?
Not my fault? Someone must have calculated that 64 is enough to hold
the potential things printed to the buffer below. I recalculated and 64
is sufficient because source_file_name is added to the buf_len earlier
so all that's left is the line numbers and
"(Unknown Source) (nmethod 0x1234567812345678)"
>
> 1370 // Neither soucename and linenumber
>
> Typo: soucename
>
> And it should be "Neither sourcename nor linenumber"
Fixed.
>
> 1425 // NULL mirror means end of stack trace
> 1426 if (mirror.is_null()) goto handle_cause;
>
> Don't like the goto :) But I'm having trouble following all the
> looping in this section of code. Is a NULL mirror intentional or does
> it indicate an error elsewhere? I'm trying to see why encountering a
> NULL mirror in the inner for loop means the enclosing while loop is
> also done - and if that loop would terminate "naturally" if you just
> did a break from the for loop ?
I don't like the 'goto' either. It's 2013 as you point out earlier ;)
I had a version that extracted this code getting rid of the goto, but
opted for the lesser change. Yes, a NULL mirror means there are no
more stack trace elements in that chunk. The chunk (length 32) is
preallocated, preinitialized to zero, and filled in as method/mirror/bci
elements are found on the stack. When you hit 0 for mirror, that's the
end. 0 for method id doesn't mean it's the end. There's an assert
that we never add a NULL mirror in BacktraceBuilder::push.
This code is only called for exceptions thrown at initialization time.
It's hard to test this code since most exceptions during initialization
in universe_post_init() cause
# Internal Error
(/java/east/u1/cphillim/hg/rt.bt2-clean/src/share/vm/interpreter/linkResolver.cpp:902),
pid=10369, tid=47170653824768
# assert(resolved_method->method_holder()->is_linked()) failed: must be
linked
#
Should be another CR, which I thought we'd already fixed!
>
> 1434 handle_cause:
> 1435 {
> 1436 EXCEPTION_MARK;
> 1437 JavaValue result(T_OBJECT);
>
> This seems to be hiding the "result" variable declared at 1409 at the
> top of the while loop ??
Yes, it does. I renamed this "result" to "cause".
>
> 1682 // Count element in remaining partial chunk. Non-null mirror
> means theres
> 1683 // the stack trace element is saved.
>
> This comment doesn't read correctly.
Rewrote:
// Count element in remaining partial chunk. NULL value for mirror
// marks the end of the stack trace elements that are saved.
>
> 1735 // SystemDictionary::stackTraceElement_klass() will be null for
> pre-1.4 JDKs
> 1736 assert(JDK_Version::is_gte_jdk14x_version(), "should only be
> called in >= 1.4");
>
> Seems obsolete now - chuck it out with the rest of the merlin code?
With pleasure.
Thanks,
Coleen
>
> Cheers,
> David
> ------
>
> On 15/01/2013 12:36 PM, Coleen Phillimore wrote:
>> Summary: Remove Method* from backtrace but save version so redefine
>> classes doesn't give inaccurate line numbers. Removed old Merlin API
>> with duplicate code.
>>
>> Sorry, this is so long but I want to explain this change properly.
>>
>> When exceptions are thrown, the VM saves a backtrace for each which may
>> or may not be printed or saved for later. For speed, the information
>> that the VM saves are arrays 32 of Method* and bci. For permgen
>> elimination we added an array of mirrors (instances of java_lang_Class),
>> so that the class loader owning the Method* pointer doesn't get unloaded
>> and deallocated.
>>
>> The problem with redefine classes is that the method can be redefined
>> and the Method* pointer can be deallocated even if the class loader
>> owning the Method* is kept alive. Printing or touching the backtrace
>> later will crash the VM.
>>
>> This is my 4th attempt at fixing this problem. The first that was
>> reviewed, saved a side structure for each backtrace created so that we
>> can walk the Method* pointers and keep them from being deallocated. This
>> change had an unfortunate side structure to manage, and had also a 6.1%
>> slowdown in javac. The second attempt was to predigest the Method* into
>> the data needed for java_lang_StackTraceElement into method_name,
>> class_name, file_name, and line_number, so that the Method* doesn't need
>> to be saved. This had a 18% slowdown in javac.
>>
>> The 3rd attempt is to look for a way to have GC handle backtraces and
>> mark methods as on_stack but adding a InstanceThrowableKlass and
>> determining which closures needed special actions for the array of
>> Method* pointers is a lot of additional special case code to the garbage
>> collectors. It's unclear how this would work actually, but we might have
>> to revisit this idea for java_lang_invoke_MemberName.
>>
>> The 4th attempt is to save the method_idnum, bci, version_number, and
>> mirror. From the method_idnum, we can get the Method*. The version
>> number is updated for redefine classes. There was an unused field that I
>> repurposed. If the method's version doesn't match the backtrace, we do
>> not return the source_file and line_number from the Method* because it's
>> changed. You get a backtrace of the form:
>>
>> java.lang.RuntimeException: Test exception
>> at RedefineMethodInBacktraceTarget.methodToRedefine(Unknown Source)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>
>>
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>
>>
>> at java.lang.reflect.Method.invoke(Method.java:474)
>> at
>> RedefineMethodInBacktraceApp.getThrowableFromMethodToRedefine(RedefineMethodInBacktraceApp.java:60)
>>
>>
>> at
>> RedefineMethodInBacktraceApp.doMethodInBacktraceTest(RedefineMethodInBacktraceApp.java:44)
>>
>>
>>
>> This fix for the Method* crash does not add to the footprint of the
>> backtraces (idnum is short and bci and version number is packed into an
>> int). It does not degrade performance of javac (or refworkload). The
>> cost is to not support getting a source file and line number from a
>> method that was redefined after being saved in a backtrace. Note that
>> this change does not return the wrong information. A CR can be filed for
>> somebody to someday lift this restriction if someday we don't care about
>> performance or a new solution is found.
>>
>> This might be hard to review as a diff because I refactored duplicated
>> code.
>>
>> The webrev: http://cr.openjdk.java.net/~coleenp/7174978_5/
>> Bug link: http://bugs.sun.com/view_bug.do?bug_id=7174978
>>
>> Tested with NSK quick testlist, runThese jck tests, JPRT, refworkload.
>>
>> Thanks,
>> Coleen
>>
More information about the hotspot-runtime-dev
mailing list