RFR[S] 8005165 Platform-independent C++ vtables for CDS

Ioi Lam ioi.lam at oracle.com
Tue Mar 7 17:05:00 UTC 2017


Hi Coleen & Thomas,

Thanks again for the review.

On 3/7/17 5:48 AM, coleen.phillimore at oracle.com wrote:
> This looks really good!!  Are you going to do a follow up to remove 
> the MC section?
Yes, MC will be removed in JDK-8072061 - Automatically determine optimal 
sizes for the CDS region sizes

Thanks
- Ioi
> thanks!
> Coleen
>
> On 3/6/17 8:25 PM, Ioi Lam wrote:
>> Hi Thomas & Coleen,
>>
>> Thanks for your comments. I have updated the rev:
>>
>> http://cr.openjdk.java.net/~iklam/jdk10/8005165-platform-independent-cds-vtable.v04/
>>
>> [1] Switched back to computing the exact vtable size
>> [2] Move non-trivial functions outside of their class declaration
>>
>> Thanks
>> - Ioi
>>
>> On 3/6/17 8:51 AM, Thomas Stüfe wrote:
>>> Hi Coleen and Ioi,
>>>
>>> I had to port C++ code to platforms with terrible compilers for a 
>>> time in my life, that is why I like code to be as portable as 
>>> possible. That said, you are right in your argumentation, the 
>>> SafeFetch solution is not terribly elegant and Ioi's original way of 
>>> determining the vtable size is cleaner.
>>>
>>> I did some checks on some of our architectures with a test similar 
>>> to Ioi's and on a first glance it seems to work for simple cases 
>>> (single and public inheritance) on ppc (AIX) and Linux ia64. 
>>> Although the vtables seemed to me to contain function descriptors, 
>>> not real pointers to code, so this is something to keep in mind. But 
>>> if the live vtable are copied, the function descriptors they contain 
>>> should point to valid code too, so it should not matter. Just to 
>>> remember to not expect every slot in the array to be a valid code 
>>> pointer.
>>>
>>> So, in short, I remove my objection to Ioi's original solution, as 
>>> far as that matters.
>>>
>>> I still think we rely on a lot here: Contiguous vtable containing 
>>> absolute memory addresses, vtable pointer at start of object and 
>>> vtable entries to be ordered from base->derived class. So I wonder 
>>> how much effort it would be (now or in the future as a separate 
>>> change) to have a fallback where - at loading time - instead of 
>>> copying vtables the vtable pointers in the objects were fixed up to 
>>> point to the new live vtables? I know this would be more expensive 
>>> and potentially defeat the point of shared classes. But maybe not, 
>>> it depends on how many objects are there, no?
>>>
>>> Kind Regards, Thomas
>>>
>>> On Sun, Mar 5, 2017 at 4:17 PM, <coleen.phillimore at oracle.com 
>>> <mailto:coleen.phillimore at oracle.com>> wrote:
>>>
>>>
>>>     Ioi,  Some comments inline (where no comments, insert "ok") :)
>>>
>>>
>>>     On 3/2/17 10:37 PM, Ioi Lam wrote:
>>>
>>>         Hi Coleen,
>>>
>>>         Thanks for the comments. I have updated the webrev. See
>>>         in-line for responses.
>>>
>>>         http://cr.openjdk.java.net/~iklam/jdk10/8005165-platform-independent-cds-vtable.v03/
>>>         <http://cr.openjdk.java.net/%7Eiklam/jdk10/8005165-platform-independent-cds-vtable.v03/>
>>>
>>>
>>>         On 3/2/17 8:48 AM, coleen.phillimore at oracle.com
>>>         <mailto:coleen.phillimore at oracle.com> wrote:
>>>
>>>
>>>             Ioi
>>>             I like the concept of this a lot but have some stylistic
>>>             comments to help people reading this code later.
>>>
>>>             http://cr.openjdk.java.net/~iklam/jdk10/8005165-platform-independent-cds-vtable.v02/src/share/vm/memory/metaspaceShared.cpp.udiff.html
>>>             <http://cr.openjdk.java.net/%7Eiklam/jdk10/8005165-platform-independent-cds-vtable.v02/src/share/vm/memory/metaspaceShared.cpp.udiff.html>
>>>
>>>             s/vtab/vtable/g and s/Vtab/Vtable/ please.  It doesn't
>>>             save many characters, especially in CppVtableInfo/Testers
>>>
>>>         Done.
>>>
>>>             + // Start at slot 1, because slot 0 may be RTTI (on
>>>             Solaris/Sparc)
>>>             + int i;
>>>             + for (i=1; ; i++) {
>>>             Since you're using 'i' later, can you rename it to
>>>             something descriptive.  Or have another variable
>>>             "vtable_length" to use later.   This looks like an old
>>>             style for loop.
>>>
>>>         Done
>>>
>>>             Can the functions for CppVtableInfo be declared outside
>>>             of the class declaration?  They don't need to be inline
>>>             and then the debug code for testing the vtable size can
>>>             be not in the middle of the class declaration.   Then
>>>             you can move the Tester classes to inside the same
>>>             #ifndef PRODUCT block.
>>>
>>>             Can you put #endif // PRODUCT when the ifdef covers
>>>             several lines of code?
>>>
>>>         Done
>>>
>>>             vtab_of could be more descriptive, like cpp_vtable_for().
>>>
>>>         I changed to vtable_of(). Because the class name is already
>>>         CppVtableCloner, repeating the word "cpp" seems repetitive
>>>         to me.
>>>
>>>             Was PrintSharedSpaces was never converted to UL?
>>>
>>>         Right. I've filed
>>>         https://bugs.openjdk.java.net/browse/JDK-8176132
>>>         <https://bugs.openjdk.java.net/browse/JDK-8176132>
>>>         (-XX:+PrintSharedSpaces should be converted to use Unified
>>>         Logging.)
>>>
>>>             + int n = MAX_VTABLE_SIZE;
>>>
>>>             Can you propagate MAX_VTABLE_SIZE to the places where
>>>             it's used.  n isn't descriptive.  This starts out with
>>>             max_vtable_size and then changes the size.  Reusing 'n'
>>>             makes this really hard to follow.  Not having a comment
>>>             that we only allocate enough slots for the vtable makes
>>>             it hard too.
>>>
>>>             + // allocate CppVtableInfo in the MD section + _info =
>>>             (CppVtabInfo*)md_top;
>>>             + _info->set_vtab_size(n); // initially set to
>>>             max_vtable_size
>>>             + + // allocate temporary local instance of the metadata
>>>             type T + T tmp;
>>>             + intptr_t* srcvtab = vtab_of(tmp);
>>>             + intptr_t* dstvtab = _info->vtab();
>>>             +
>>>
>>>         Fixed.
>>>
>>>             Something like that for comments.  dstvtab is the
>>>             destination_vtable in the MD section.
>>>
>>>
>>>         I've dropped the md_ prefix from the functions that deal
>>>         with the vtables, since they shouldn't care whether it's the
>>>         "MD" section or not. Now it looks like this:
>>>
>>>         // Allocate and initialize the C++ vtables, starting from
>>>         top, but do not go past end.
>>>         intptr_t*
>>>         MetaspaceShared::allocate_cpp_vtable_clones(intptr_t* top,
>>>         intptr_t* end) {
>>>           assert(DumpSharedSpaces, "dump-time only");
>>>           // Layout (each slot is a intptr_t):
>>>           //   [number of slots in the first vtable = n1]
>>>           //   [ <n1> slots for the first vtable]
>>>           //   [number of slots in the first second = n2]
>>>           //   [ <n2> slots for the second vtable]
>>>           //   ...
>>>           // The order of the vtables is the same as the
>>>         CPP_VTAB_PATCH_TYPES_DO macro.
>>>           CPP_VTABLE_PATCH_TYPES_DO(ALLOC_CPP_VTABLE_CLONE);
>>>           return top;
>>>         }
>>>
>>>             + for (int i=0; i<n; i++) {
>>>             + const intptr_t bad = intptr_t(0xdeadbeef);
>>>             + intptr_t num = SafeFetchN(&srcvtab[i], bad);
>>>             + if (num == bad
>>>             + // || i > 120 /* uncomment this line to test */
>>>             + ) {
>>>             + _info->set_vtab_size(i-1);
>>>             + break;
>>>             + }
>>>             + dstvtab[i] = num;
>>>             + }
>>>             I dont understand this code.   You get deadbeef for a
>>>             bad value if SafeFetchN gets a fault but why would it
>>>             get a fault at the end of the metadata's vtable? 
>>>              Couldn't it just run onto the next vtable?  I think
>>>             your original way of counting vtable entries might be
>>>             better (sorry I didn't have time to study that thread).
>>>
>>>         I've modified the comments to this. Does it make sense to you?
>>>
>>>             // It is not always safe to call memcpy(), because
>>>         srcvtable might be shorter than
>>>             // MAX_VTABLE_SIZE, and the C++ linker might have placed
>>>         the vtable at the very
>>>             // end of the last page of libjvm.so. Crossing over to
>>>         the next page might
>>>             // cause a page fault.
>>>
>>>         My fear is the JVM would suddenly start crashing because the
>>>         order of .o files have changed on the linker's command line,
>>>         or if you enable some special linker optimization flags.
>>>         It's better safe than sorry.
>>>
>>>
>>>     This wasn't exactly what I was not understanding.   I didn't see
>>>     that you are copying 120 entries from the old vtable and junk
>>>     memory beyond the old vtable, unless you get a segv, in which
>>>     case you copy less.   I don't think you should copy random
>>>     memory into the vtable in the archive.  This doesn't seem
>>>     secure, even with the segv protection.
>>>
>>>     Since we already have assumptions about C++ vtable layout in the
>>>     code and it's mostly specified by various ABIs, and you have the
>>>     assert code, I think I would prefer that you copy only the
>>>     vtable entries into the archive.   I guess Thomas Stuefe had a
>>>     different opinion.  I've read the original thread.  Two points:
>>>
>>>     If new C++ compiler implementations add a discontigous vtable,
>>>     both the SafeFetchN and subclass additional virtual function at
>>>     end implementation will fail.  I don't think C++ implementations
>>>     would do this and a contiguous vtable as first in the instance
>>>     has been standard for years.   If our metadata adds multiple
>>>     inheritance, the same issue would be a problem for both
>>>     implementations, as well as for the implementation we have
>>>     before Ioi's fix.
>>>
>>>     Ioi's subclass adding virtual function method would work for any
>>>     esoteric C++ implementations in my memory, except the vptr for
>>>     the old DECC++ compiler was after the nonstatic data members
>>>     (which would fail with all of our implementations).
>>>
>>>     Since the code is there anyway for debug purposes, we're not
>>>     saving code by implementing SafeFetchN.  The SafeFetchN
>>>     implementation isn't obvious at all what it's doing, and
>>>     requires better comments, especially if you don't know already
>>>     what SafeFetchN does.  It looks really cryptic.  The poisoned
>>>     values also bothered me in that they overload other poisoned
>>>     values in other parts of the jvm.
>>>
>>>     Ioi, could you make all methods of CppVtableCloner out of line?
>>>
>>>     The other changes look good, although I might have more requests
>>>     for comments.
>>>
>>>     Thanks,
>>>     Coleen
>>>
>>>
>>>             Would be nice to have comments here too!!
>>>
>>>             + intptr_t* start = md_top;
>>>
>>>             This doesn't do anything (?)
>>>
>>>
>>>         Fixed. This was left over code.
>>>
>>>
>>>             + MetaspaceShared::zero_cpp_vtable_clones_for_writing();
>>>
>>>             Why not zero the destination vtable in allocate?   Or
>>>             does patching the vtable pointers call virtual
>>>             functions?  You could prevent that so you don't need
>>>             this code.
>>>
>>>         I added this comment:
>>>
>>>           // During patching, some virtual methods may be called, so
>>>         at this point
>>>           // the vtables must contain valid methods (as filled in by
>>>         CppVtableCloner::allocate).
>>>           MetaspaceShared::patch_cpp_vtable_pointers();
>>>
>>>           // The vtable clones contain addresses of the current process.
>>>           // We don't want to write these addresses into the archive.
>>>           MetaspaceShared::zero_cpp_vtable_clones_for_writing();
>>>
>>>             + // Restore the vtable in case we invoke any virtual
>>>             methods.
>>>             + MetaspaceShared::clone_cpp_vtables((intptr_t*)vtbl_list);
>>>             Can this be restore_cpp_vtables since that's what it's
>>>             doing. The first is after the dump and the second call
>>>             is at UseSharedSpaces.   A couple of comments in this
>>>             clone_cpp_vtables --> restore_cpp_vtables would be
>>>             nice.  eg:
>>>
>>>         I prefer to use the word clone. Otherwise when you just say
>>>         "vtable" it's not clear whether you're talking about the
>>>         original one (made by the c++ linker), or the cloned one in
>>>         the CDS archive.
>>>
>>>             + static intptr_t* clone_vtable(const char* name,
>>>             intptr_t* p) {
>>>             + T tmp;   // Allocate temporary dummy metadata object
>>>             to get vtable initialized
>>>             + CppVtabInfo* info = (CppVtabInfo*)p;
>>>             + int n = info->vtab_size();
>>>             + intptr_t* srcvtab = vtab_of(tmp);
>>>             + intptr_t* dstvtab = info->vtab();
>>>             +
>>>             + // We already checked (and, if necessary, adjusted n)
>>>             when the vtables were allocated, so we are
>>>             + // safe to do memcpy.
>>>             + if (PrintSharedSpaces) {
>>>             + tty->print_cr("%s copying %d vtable entries", name, n);
>>>             + }
>>>             + memcpy(dstvtab, srcvtab, sizeof(intptr_t) * n);
>>>             + return dstvtab + n;
>>>             + }
>>>
>>>         Done. I changed the wording
>>>
>>>             T tmp; // Allocate temporary dummy metadata object to
>>>         get to the original vtable.
>>>
>>>         As we are not really "initializing a vtable" here.
>>>
>>>             Same with 'patch'.  It'd be so much faster and easier to
>>>             read this code with more comments please.
>>>
>>>             http://cr.openjdk.java.net/~iklam/jdk10/8005165-platform-independent-cds-vtable.v02/src/share/vm/oops/constantPool.hpp.udiff.html
>>>             <http://cr.openjdk.java.net/%7Eiklam/jdk10/8005165-platform-independent-cds-vtable.v02/src/share/vm/oops/constantPool.hpp.udiff.html>
>>>
>>>             Why are these testers here?
>>>
>>>
>>>         I updated the comment:
>>>
>>>           // Used by CDS. These classes need to access the private
>>>         ConstantPool() constructor.
>>>           template <class T> friend class CppVtableTesterA;
>>>           template <class T> friend class CppVtableTesterB;
>>>           template <class T> friend class CppVtableCloner;
>>>
>>>
>>>         Thanks
>>>         - Ioi
>>>
>>>
>>>                         On 3/1/17 3:25 AM, Ioi Lam wrote:
>>>                         https://bugs.openjdk.java.net/browse/JDK-8005165
>>>                         <https://bugs.openjdk.java.net/browse/JDK-8005165>
>>>                         http://cr.openjdk.java.net/~iklam/jdk10/8005165-platform-independent-cds-vtable.v02/
>>>                         <http://cr.openjdk.java.net/%7Eiklam/jdk10/8005165-platform-independent-cds-vtable.v02/>
>>>
>>>                         Hi,
>>>
>>>                         This is the official review (follow up of
>>>                         the "Determining the size of C++ vtables"
>>>                         thread onhotspot-dev at openjdk.java.net
>>>                         <mailto:onhotspot-dev at openjdk.java.net>).
>>>
>>>                         The new code has the same assumption as the
>>>                         existing code in JDK 10: for a C++ object
>>>                         that contains virtual methods (e.g.,
>>>                         ConstantPool), we assume the first intptr_t
>>>                         slot of the object is a _vptr, which points
>>>                         to a vtable, which consists of no more than
>>>                         150 intptr_t's.
>>>
>>>                         ConstantPool*p -->[ _vptr    ] -------> [
>>>                         vtable slot 0 ]
>>>                                            [ field #0 ]          [
>>>                         vtable slot 1 ]
>>>                                            [ field #1 ]          [
>>>                         vtable slot 2 ]
>>>                                            [ field #2 ]          [
>>>                         vtable slot 3 ]
>>>                                            [ ....     ]          [
>>>                         vtable slot 4]
>>>                                                                  [
>>>                         vtable slot 5 ]
>>>                                                                  [
>>>                         ...           ]
>>>
>>>                         + In the existing code, we were pointing the
>>>                         vtable slots to
>>>                           code that's generated by HotSpot.
>>>
>>>                         + In the new code, we copy the vtable slots
>>>                         from an existing
>>>                           vtable (generated by the C++ linker).
>>>
>>>                         Per Thomas Stüfe's advice, I don't try to
>>>                         determine the size of the vtable (as that
>>>                         would add one more compiler requirement
>>>                         where new virtual methods added by a
>>>                         subclass must be placed at a higher offset
>>>                         in the vtable).
>>>
>>>                         Instead, I have added code in non-product
>>>                         builds to ensure that the vtables are no
>>>                         longer than 150 entries. You can run with
>>>                         "-XX:+PrintSharedSpaces -Xshare:dump" to
>>>                         print out the actual size of the vtables for
>>>                         your particular platform:
>>>
>>>                           ConstantPool has 12 virtual methods
>>>                           InstanceKlass has 113 virtual methods
>>>                           InstanceClassLoaderKlass has 113 virtual
>>>                         methods
>>>                           InstanceMirrorKlass has 113 virtual methods
>>>                           InstanceRefKlass has 113 virtual methods
>>>                           Method has 12 virtual methods
>>>                           ObjArrayKlass has 114 virtual methods
>>>                           TypeArrayKlass has 114 virtual methods
>>>
>>>                         As mentioned in the code comments, if you
>>>                         have an esoteric C++ compiler, the
>>>                         verify_sufficient_size() function will
>>>                         probably fail, but hopefully that would give
>>>                         you some hints for porting this code.
>>>
>>>                         To avoid accidentally touching an unmapped
>>>                         page, the code uses SafeFetchN for copying
>>>                         the vtable contents, and would shrink the
>>>                         vtable to less than 150 entries if
>>>                         necessary. I can't test this for real, but
>>>                         I've added some code to simulate an error:
>>>
>>>                             for (int i=0; i<n; i++) {
>>>                               const intptr_t bad = intptr_t(0xdeadbeef);
>>>                               intptr_t num = SafeFetchN(&srcvtab[i],
>>>                         bad);
>>>                               if (num == bad
>>>                                   // || i > 120 /* uncomment this
>>>                         line to test */
>>>                                   ) {
>>>                                 _info->set_vtab_size(i-1);
>>>                                 break;
>>>                               }
>>>                               dstvtab[i] = num;
>>>                             }
>>>
>>>                         Results:
>>>
>>>                         + Removed 850 lines of CPU-dependent code
>>>
>>>                         + CDS image is about 50K smaller
>>>
>>>                         + Previously Metadata objects must live in
>>>                         the read-write section in the CDS
>>>                           archive, because their _vptr was updated
>>>                         at run time. Now _vptr is no longer
>>>                           updated, so ConstantPool can be moved to
>>>                         the read-only section (see JDK-8171392).
>>>
>>>                         Thanks
>>>                         - Ioi
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>



More information about the hotspot-dev mailing list