RFR(M): 8210754: print_location is not reliable enough (printing register info)

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Oct 4 10:50:45 UTC 2018



On 10/4/18 6:11 AM, Doerr, Martin wrote:
>
> Hi Coleen,
>
> thank you for reviewing.
>
> > In is_valid_klass() can you check for the vtable before calling class_loader_data() or getting the Symbol?
>
> I like this idea, but it’s by far not as simple as for methods.
>
> We’d have to check for all valid subclasses of Klass:
>
> ObjArrayKlass
>
> TypeArrayKlass
>
> InstanceKlass
>
> InstanceClassLoaderKlass
>
> InstanceMirrorKlass
>
> InstanceRefKlass
>
> In order to determine the vtable ptr the same way as for methods, we’d 
> need to create a default constructed instance of each one which is 
> currently prohibited by “assert(DumpSharedSpaces || UseSharedSpaces, 
> "only for CDS");” in the default constructors.
>
> Should I remove these assertions and determine the vtable pointers of 
> all these subclasses?
>
> Or do you have a better idea?
>

I don't have a better idea and agree with you, that this is cumbersome 
and not worth doing.
>
> > There's also a 'valid' bit on a lot of the metadata that you might want to check in debug mode.
>
> We could use them in debug mode, but my main concern is error handling 
> in product. If we make it more solid in debug mode developers may get 
> the wrong impression that it was rock solid. So I’d appreciate if we 
> could make it solid enough by only using what’s available in product. 
> Do you agree?
>

Yes, that's fine.
>
> > The Symbols might be allocated in metaspace for CDS, or considered to be allocated there by the 
> metaspace range matching code, so this might not be quite right to 
> return false if they're in metaspace.
>
> Right, I need to fix this.
>
> Should I use “Metaspace::contains_non_shared(s)” or better remove the 
> check at all?
>

I think removing the check is probably best.  You've already added a lot 
of checking.
>
> > This comment meant that we only print method but Klass has a vptr too.
>
> I should update this comment.
>

Thank you.

I don't need to see another version if y ou make these changes.
Thanks,
Coleen

> Best regards,
>
> Martin
>
> *From:*coleen.phillimore at oracle.com <coleen.phillimore at oracle.com>
> *Sent:* Mittwoch, 3. Oktober 2018 01:20
> *To:* Doerr, Martin <martin.doerr at sap.com>; Thomas Stüfe 
> <thomas.stuefe at gmail.com>
> *Cc:* hotspot-runtime-dev at openjdk.java.net
> *Subject:* Re: RFR(M): 8210754: print_location is not reliable enough 
> (printing register info)
>
>
> http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.03/src/hotspot/share/oops/klass.cpp.frames.html 
> <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.03/src/hotspot/share/oops/klass.cpp.frames.html>
>
> In is_valid_klass() can you check for the vtable before calling 
> class_loader_data() or getting the Symbol?  There's also a 'valid' bit 
> on a lot of the metadata that you might want to check in debug mode.
>
> http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.03/src/hotspot/share/oops/symbol.cpp.frames.html 
> <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.03/src/hotspot/share/oops/symbol.cpp.frames.html>
>
> The Symbols might be allocated in metaspace for CDS, or considered to 
> be allocated there by the metaspace range matching code, so this might 
> not be quite right to return false if they're in metaspace. I think 
> I've contradicted my earlier comments (sorry about that).  I think you 
> can't have that conditional.
>
> http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.03/src/hotspot/share/runtime/os.cpp.frames.html 
> <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.03/src/hotspot/share/runtime/os.cpp.frames.html>
>
> 1121   // Check if in metaspace and print types that have vptrs (only method now)
> 1123     if (Klass::is_valid((Klass*)addr)) {
>
>
> This comment meant that we only print method but Klass has a vptr too.
>
> I think this looks like a good improvement to error handling. Thank 
> you for taking up my suggestions and merging with the 
> classLoaderDataGraph change.
>
> Coleen
>
> On 10/2/18 8:58 AM, Doerr, Martin wrote:
>
>     Hi Coleen,
>
>     I've created a rebased webrev after 8209645:
>
>     http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.03/
>     <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.03/>
>
>     It contains minor improvements (added missing include and type cast).
>
>     I think I should reuse the code from VirtualSpaceList::contains. Changed.
>
>     I have tested most cases by inserting e.g.
>
>     "if (UseNewCode) __ code()->insts()->emit_int8((unsigned char) 0x37 /* aaa */);"
>
>     at different places. I got the expected output.
>
>     Please take a look.
>
>     Thanks and best regards,
>
>     Martin
>
>     -----Original Message-----
>
>     From: hotspot-runtime-dev<hotspot-runtime-dev-bounces at openjdk.java.net>
>     <mailto:hotspot-runtime-dev-bounces at openjdk.java.net>  On Behalf Of Doerr, Martin
>
>     Sent: Freitag, 28. September 2018 17:27
>
>     To:coleen.phillimore at oracle.com <mailto:coleen.phillimore at oracle.com>; Thomas Stüfe<thomas.stuefe at gmail.com> <mailto:thomas.stuefe at gmail.com>
>
>     Cc:hotspot-runtime-dev at openjdk.java.net
>     <mailto:hotspot-runtime-dev at openjdk.java.net>
>
>     Subject: [CAUTION] RE: RFR(M): 8210754: print_location is not reliable enough (printing register info)
>
>     Hi Coleen,
>
>     thank you for taking time to review my proposal in detail. I have fixed a few things in place and uploaded a new webrev which moves functions into their classes:
>
>     http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.02
>     <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.02>
>
>         Hi, I am sorry for the delay getting to this.
>
>     No problem. I also needed some time to get my build working again. I had to build a gcc 7.3.
>
>         http://cr.openjdk.java.net/~mdoerr/8210754_print_location/webrev.01/src/hotspot/share/classfile/classLoaderData.cpp.udiff.html
>         <http://cr.openjdk.java.net/%7Emdoerr/8210754_print_location/webrev.01/src/hotspot/share/classfile/classLoaderData.cpp.udiff.html>
>
>         Can you assert is_error_reported() and add a comment that this is only to be called during error reporting?  I share David's concern about walking CLDG graph without a lock but I think the only place where we unlink items is limited and the timing window where you'll get into trouble is very limited.
>
>     Done (in place).
>
>           105 bool vmErrorHelper::is_valid_symbol(Symbol* s) {
>
>         This is actually wrong. The Symbol is not in metaspace.  I agree with Thomas's suggestion of testing validity and moving to symbol.hpp/cpp.
>
>     Right, I had broken it in webrev.01. Fixed in place.
>
>           122   return is_readable_range(bytes, bytes + len);
>
>         I kind of wonder if this is going too far.   Why not just ask os::is_readable_pointer?
>
>     Hmm... Should usually be sufficient. But is there a disadvantage of using the range version? Performance shouldn't be a concern when iterating pagewise.
>
>         +    } else if (Method::has_method_vptr((const void*)addr)) {
>
>         This should be changed to is_valid_method().  is_valid_klass() should be written similarly.
>
>     Done (in place).
>
>         Isn't if (Metaspace::contains_non_shared(m))  a sufficient check for the Klass pointer, ie, like the code in Method::is_valid_method()?
>
>     Can't we have Klasses in shared space for CDS? I guess this would break it for CDS.
>
>     I think checking if at least the common Klass part is completely contained in one metaspace region is better. Do you agree?
>
>           141 bool vmErrorHelper::is_valid_oop(oop obj) {
>
>         I expected is_valid_oop to be the top level verification and to load the klass pointer and ask is_valid_klass.  verify_oop does that.  Maybe the CLDG iteration is warranted to see if the klass is still loaded.   I think using just using Universe::heap()->is_in_reserved() should be sufficient and you should avoid using block_start() and other things that might be broken with a crash.
>
>     Removing the block_start part would degrade the functionality. The current implementation tries to find the start address of a Java object when addr points into it.
>
>         I've always wanted the code blob printing code to be in codeBlob.cpp.
>
>     I agree. Done.
>
>         I appreciate the hardening of error reporting.  It's long overdue!
>
>     Thanks.
>
>     I'll try to find some time for testing various cases next week.
>
>     Best regards,
>
>     Martin
>



More information about the hotspot-runtime-dev mailing list