The magic of self-patching vtable

Fri Dec 14 14:10:01 PST 2012

On 12/14/2012 5:04 PM, Ioi Lam wrote:
> Moving this discussion to hotspot-dev per John's suggestion.
>
> Actually, strictly from the point of view of CDS, it would be OK for 
> objects in the RO region to have vtables. This is because the _vptr is 
> fixed during dump time, and never modified during runtime.
>
> Only the contents of the vtables (pointed to by _vptr) are updated at 
> CDS image load time.

The way it works today is that the _vptr in the object is replaced, so 
you'd have to change that.

Coleen

>
> - Ioi
>
>
> On 12/14/2012 01:31 PM, John Rose wrote:
>> This is really interesting.
>>
>> I'll bet there could be a shell script that could detect and flag 
>> unintentional vtables on RO classes, at least on Linux or Solaris. 
>> That would be enough to block JPRT submissions.
>>
>> -- John
>>
>> On Dec 14, 2012, at 12:56 PM, Coleen Phillimore 
>> <coleen.phillimore at oracle.com> wrote:
>>
>>> On 12/14/2012 3:46 PM, Ioi Lam wrote:
>>>> Hi Coleen,
>>>>
>>>> Yes, I could find lots of comments for 'how' the self-patching vtable 
>>>> works. I just couldn't find a justification for 'why' it needs to 
>>>> be done this way.
>>>>
>>>> The _vptr of all the Metadata objects in the CDS image are already 
>>>> fixed during dump time. They are never changed at run-time. I.e., 
>>>> immediately after the CDS image is loaded, we have this invariant:
>>>>
>>>> Universe::boolArrayKlassObj()->_vptr == cds_TypeArrayKlass_vtable
>>>> Universe::byteArrayKlassObj()->_vptr == cds_TypeArrayKlass_vtable
>>>> Universe::charArrayKlassObj()->_vptr == cds_TypeArrayKlass_vtable
>>>> ...
>>>>
>>>> where
>>>>
>>>> cds_TypeArrayKlass_vtable points to somewhere slightly above the 
>>>> "md" region of the mapped CDS image.
>>>>
>>>> If we can find out the size of the vtable for TypeArrayKlass, using 
>>>> the method I proposed below, why can't we simply do this as part of 
>>>> CDS image loading?
>>>>
>>>> {
>>>>     TypeArrayKlass o;
>>>>     void* real_vptr = *(void**)(&o);  // == o._vptr
>>>>     memcpy(cds_TypeArrayKlass_vtable, real_vptr,
>>>>            sizeof(void*) * num_vtable_slots_TypeArrayKlass);
>>>> }
>>> Yeah, that would be good except I can't figure out how that works, 
>>> but I'll take your word for it rather than trying to work it out 
>>> right now.
>>>
>>> You'd do this restoration in the function restore_unshareable_info() 
>>> instead of loading (lazily) though.
>>>
>>> Coleen
>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 12/14/2012 05:03 AM, Coleen Phillimore wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> There are comments about self-patching vtables in 
>>>>> memory/universe.cpp and memory/metaspaceShared.cpp and the 
>>>>> target-dependent metaspaceShared_<cpu>.cpp code. You have 
>>>>> summarized how/why it works quite nicely. The main reason for 
>>>>> vtable patching is that the .text is in different places for the 
>>>>> executables which load the shared archive, so we have this code to 
>>>>> fix it up in the miscellaneous code section.
>>>>>
>>>>> The other thing you might have left off, which people should be 
>>>>> very aware of, is that some metadata is shared read-only, such as 
>>>>> the classes ConstMethod, Array<alltypes>, and Symbols. Adding 
>>>>> virtual functions to these will result in them having a 
>>>>> self-patching vtable, and they will no longer be read-only. Do not 
>>>>> add virtual functions to these types! I hope there are enough 
>>>>> comments warning of this. I cannot think of a programmatic way to 
>>>>> disallow this.
>>>>>
>>>>> I have another comment for your [5]-[6] below. The cpu-dependent 
>>>>> code is there to create the self-patching entries per platform. We 
>>>>> can't tell how long the vtable is, so we picked 200 for historical 
>>>>> reasons. But we still need the cpu-dependent code because we 
>>>>> still need vtable patching because of point [2], and it's done with 
>>>>> the macro assembler. If the macro assembler were made to be a 
>>>>> high-level macro assembler, we wouldn't need cpu-dependent code. 
>>>>> Knowing the size of the vtable in advance would save the 
>>>>> hard-coded constant, but that's not really a big benefit.
>>>>>
>>>>> An aside on why 200 was picked. In the pre-permgen-removal world, 
>>>>> the metadata in the shared space were Java objects and inherited 
>>>>> from oopDesc. Any new virtual function added for GC in oopDesc 
>>>>> would regularly overflow the reserved vtable size. This 
>>>>> isn't the case anymore since the base class for these types is 
>>>>> Metadata and there aren't very many virtual functions there. The 
>>>>> type Klass has a lot more, but still way under 200. A sanity 
>>>>> check in debug mode might be useful.
>>>>>
>>>>> thanks,
>>>>> Coleen
>>>>>
>>>>> On 12/14/2012 1:17 AM, Ioi Lam wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am reading the CDS code and came upon the rather intriguing 
>>>>>> concept of "self patching vtables" -- i.e., patch_klass_vtables() 
>>>>>> and friends in hotspot/src/share/vm/memory/metaspaceShared.cpp.
>>>>>>
>>>>>> I couldn't find any comments in the source code for the reasons 
>>>>>> for such a complicated mechanism. Here's my understanding of it. 
>>>>>> Please let me know if this is correct:
>>>>>>
>>>>>> [1] Objects of the Metadata types (such as Klass and ConstantPool)
>>>>>>     have vtables. In GCC this is the field <Type>::_vptr, i.e., the
>>>>>>     first word in the object.
>>>>>>
>>>>>> [2] Addresses of the vtables and the methods may be different across
>>>>>>     JVM runs, if libjvm.so is loaded at a different base address.
>>>>>>
>>>>>> [3] Although all of the Metadata objects are mapped R/W in the CDS
>>>>>>     image, at load time, we don't want to rewrite _vptr of each
>>>>>>     Metadata object (to maximize sharing).
>>>>>>
>>>>>> [4] Therefore, we redirect _vptr to our own vtables at CDS image dump
>>>>>>     time. Then, we patch our own vtables at run time.
>>>>>>
>>>>>> [5] The problem with [4] is that with most C++ compilers (all?), there
>>>>>>     is no easy way to tell the size of the vtable of a given type.
>>>>>>
>>>>>> [6] We cannot safely copy more than the size of the real vtable,
>>>>>>     because the real vtable may be at the end of the code section;
>>>>>>     reading past its end would cause the VM to crash.
>>>>>>
>>>>>> As a result, the current design of the 'self patching vtable' is to
>>>>>> create a vtable that's "big enough" (currently with 200 method slots).
>>>>>> Each slot points to a generated stub that knows the C++ type and
>>>>>> virtual method index of the invoked method. When the stub is invoked,
>>>>>> it will look up the real method and patch the vtable accordingly.
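
The stub mechanism described in the paragraph above can be modelled in
portable C++ with a plain table of function pointers. This is only a
sketch of the idea, not the HotSpot code: the real stubs are generated
with the macro assembler and patch compiler-generated vtables. All names
here (Slot, kNumSlots, real_method_N, lookup_real_method, patched_table,
stub, init_table, call_slot) are hypothetical, and the table has 4 slots
rather than 200 to keep the example short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: a hand-rolled table of function pointers stands in
// for the vtable stored in the CDS image.
using Slot = int (*)();

constexpr std::size_t kNumSlots = 4;  // the thread mentions 200 in the real code

// "Real" methods, whose addresses are only known in the running process.
static int real_method_0() { return 10; }
static int real_method_1() { return 20; }
static int real_method_2() { return 30; }
static int real_method_3() { return 40; }

// Stand-in for looking up the real method for a given virtual method index.
static Slot lookup_real_method(std::size_t index) {
  static const Slot real_table[kNumSlots] = {
      real_method_0, real_method_1, real_method_2, real_method_3};
  assert(index < kNumSlots);
  return real_table[index];
}

static Slot patched_table[kNumSlots];  // the table that objects would point at
static int patch_count = 0;            // how many slots have been patched so far

// One stub per slot. Each stub knows its own index, resolves the real
// target on first use, overwrites its own slot, then forwards the call.
template <std::size_t Index>
static int stub() {
  Slot real = lookup_real_method(Index);
  patched_table[Index] = real;
  ++patch_count;
  return real();
}

// Fill every slot with its stub, as would be done when the image is dumped.
static void init_table() {
  patched_table[0] = stub<0>;
  patched_table[1] = stub<1>;
  patched_table[2] = stub<2>;
  patched_table[3] = stub<3>;
  patch_count = 0;
}

// A "virtual call": always goes through the table.
static int call_slot(std::size_t index) {
  assert(index < kNumSlots);
  return patched_table[index]();
}
```

After init_table(), the first call_slot(2) goes through stub<2>, which
patches slot 2; later calls to the same slot reach real_method_2 directly.
Slots that are never called are never patched, which is why the table
does not need to know the true vtable length.
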
>>>>>>
>>>>>> -----
>>>>>>
>>>>>> If I am right that the whole reason for the self patching vtables 
>>>>>> is getting the length of the vtable, wouldn't it be much easier 
>>>>>> if we do:
>>>>>>
>>>>>> class InstanceKlassVTableLengthFinder: public InstanceKlass {
>>>>>> public:
>>>>>>     virtual void ___end_of_vtable_marker();
>>>>>> };
>>>>>>
>>>>>> and then just search for ___end_of_vtable_marker() from the _vptr?
>>>>>>
>>>>>> That way we can get rid of all the CPU dependent code related to 
>>>>>> self-patching vtables.
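
One way to try the length-finding idea above without needing the address
of the marker method is sketched below. It is a variation on the
proposal, not the proposal itself: two tester subclasses each append a
different virtual method, and the first slot where their vtables differ
gives the number of slots inherited from the base class. The sketch
assumes that the vtable pointer is the first word of the object, that new
virtual methods are appended after the inherited ones, and that the base
class has no virtual destructor. None of this is guaranteed by the C++
standard; it reflects how GCC and Clang lay out vtables on common
platforms. FakeKlass, TesterA, TesterB, vptr_of and
fake_klass_vtable_length are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a Metadata type. It deliberately has no virtual
// destructor: each tester's implicit destructor would override it and make
// the two tables differ before the end of the inherited slots.
struct FakeKlass {
  virtual int a() { return 1; }
  virtual int b() { return 2; }
  virtual int c() { return 3; }
};

// Two testers that each append one new, distinct virtual method.
struct TesterA : FakeKlass {
  virtual int last_virtual_method() { return 1; }
};
struct TesterB : FakeKlass {
  virtual void* last_virtual_method() { return nullptr; }
};

// Read the hidden vtable pointer, assumed to be the first word of the object.
template <class T>
static void** vptr_of(const T& obj) {
  void** vptr;
  std::memcpy(&vptr, static_cast<const void*>(&obj), sizeof(vptr));
  return vptr;
}

// Count the leading slots that both testers share. Those are the slots
// inherited from FakeKlass; the first differing slot is the appended method.
static std::size_t fake_klass_vtable_length() {
  TesterA a;
  TesterB b;
  void** va = vptr_of(a);
  void** vb = vptr_of(b);
  std::size_t n = 0;
  while (va[n] == vb[n]) {
    ++n;
    assert(n < 200);  // sanity bound, mirroring the constant from the thread
  }
  return n;
}
```

Under the assumptions above, fake_klass_vtable_length() reports 3 for
FakeKlass. The scan never reads past either tester's table, because both
tables are known to contain at least one more slot than the base class.
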
>>>>>>
>>>>>> - Ioi
>