RFR: JDK-8222545: Safe klass asserts

Roman Kennke rkennke at redhat.com
Thu Apr 18 17:15:55 UTC 2019



Am 18.04.19 um 18:47 schrieb Per Liden:
> 
> On 04/18/2019 06:21 PM, Roman Kennke wrote:
>>
>>
>> Am 18.04.19 um 17:34 schrieb Stefan Karlsson:
>>> On 2019-04-18 17:29, Roman Kennke wrote:
>>>>
>>>>
>>>> Am 18.04.19 um 16:34 schrieb Stefan Karlsson:
>>>>> On 2019-04-18 15:13, Roman Kennke wrote:
>>>>>>>>>> To add a little more detail, I could move the change up into 
>>>>>>>>>> is_objArray(), but I don't want to expose it to any non-assert 
>>>>>>>>>> paths. Therefore I could do 2 different impls there, guarded 
>>>>>>>>>> by #ifdef ASSERT but I don't think it's a good idea to behave 
>>>>>>>>>> differently under ASSERT, that kindof defeats the point of 
>>>>>>>>>> assert, right?
>>>>>>>>>>
>>>>>>>>>> What do you think ?
>>>>>>>>>
>>>>>>>>> I don't follow your argument. Under asserts you need to access 
>>>>>>>>> the klass pointer "safely" but otherwise you do not. So there 
>>>>>>>>> are two behaviours related to accessing the klass pointer 
>>>>>>>>> anyway. I'd rather see that encapsulated in the accessor.
>>>>>>>>>
>>>>>>>>> I assume it's not just asserts but any debug only code that 
>>>>>>>>> wants to access the klass pointer.
>>>>>>>>
>>>>>>>> In general, for any runtime calls into oopDesc::klass() the 
>>>>>>>> access should be safe. The acrobatics is only necessary for 
>>>>>>>> *GC-internal* 
>>>>>>>
>>>>>>> This is the part I don't quite understand, and goes back to my 
>>>>>>> initial question. Why are you doing these operations on 
>>>>>>> from-space objects? I'm thinking you should be in a position in 
>>>>>>> the GC to make sure this can never happen. If you need to do that 
>>>>>>> in the GC (which is fine), then the GC could apply a "resolve" 
>>>>>>> function to get the to-space object, and call size() (or 
>>>>>>> whatever) on that object. This shouldn't have to leak out of the 
>>>>>>> GC, right?
>>>>>>
>>>>>> It is a problem when we are about to evacuate an object. Then we 
>>>>>> need to know its size in order to allocate and copy an appropriate 
>>>>>> chunk. The problem is that this part is racy: two threads (e.g. 
>>>>>> two Java threads via barrier, or one Java thread vs one GC thread) 
>>>>>> might compete over this: both would create a copy of the object, 
>>>>>> but ultimately only one would succeed (by CASing the fwd pointer). 
>>>>>> Therefore, getting hold of the object size is racy, by design, and 
>>>>>> this requires to resolve the _klass. Now, we can do that ahead of 
>>>>>> time, and call oopDesc::size_given_klass() and all would be good, 
>>>>>> except that size_given_klass() asserts that the object is indeed 
>>>>>> of the given klass, and hence fetches _klass again, which, at this 
>>>>>> point, is racy. Solving this inside the GC would require to 
>>>>>> basically copy all the machinery to get hold of object size into 
>>>>>> the GC. Are you asking me to do that?
>>>>>
>>>>> Other GCs store forwarding pointers in the mark word. See 
>>>>> oopDesc::forward_to and friends. Could you do the same and get rid 
>>>>> of this problem?
>>>>>
>>>>
>>>> No. Other GCs store the fwd pointer there, but only during a pause, 
>>>> and while possibly stashing the mark word somewhere else in the 
>>>> meantime.
>>>>
>>>> We need to do it outside of GC pauses, plus we need a way (bit) to 
>>>> indicate what it actually is (fwd pointer or Klass*). The mark word 
>>>> is already badly overloaded and also accessed much more often and in 
>>>> critical paths (e.g. locking), while the Klass* is basically 
>>>> immutable, and has the lowest 3 bits free (when running with 
>>>> -UseCompressedClassPointers, which would be enforced by Shenandoah). 
>>>> Using the Klass* slot is therefore the simplest and most efficient 
>>>> place to keep the fwd pointer. Attempting to use the mark word would 
>>>> require much more barriers and cause more overhead to manage it.
>>>
>>> Are you sure? Remember, the object is in from-space and no thread is 
>>> allowed to change it, except the threads that are copying out of the 
>>> from-space.
>>
>> Right. But we still need one bit in it to differentiate between fwd 
>> ptr and regular mark word. And installing the fwd pointer concurrently 
>> with 
> 
> We already have such bits, and oopDesc::forward_to_atomic() will set 
> them for you. Aren't they enough?
> 
>> locking seems more of a horror story than dealing with an otherwise 
>> immutable Klass*.
> 
> With a to-space invariant, I can't see how this can happen concurrently 
> with locking.

Let's say we have one thread (1) which wants to evacuate an object, and 
another which does some locking operation on it (2). (1) would 
optimistically copy the object, then attempts to install the forwarding 
pointer into header, while (2) modifies the header. Which means 1. oops, 
the header changed, need to CAS again, 2. How to ensure consistency in 
the to-space object after evac ? I suppose this can be solved by a 
reasonable protocol, but even just thinking about it gives me a 
headache, while the same on an immutable Klass* or external brooks 
pointer is trivial.

Also, I don't understand why we even need to bikeshed this. All I am 
proposing is:
1. A sanity assert in klass() and friends which hurts nobody. We could 
actually make this a little nicer and check klass->is_klass(), and add 
the GC-hook CollectedHeap::is_klass() there, similar to what we already 
do in oopDesc::is_oop().
2. Making the asserts in path from size_given_klass() reliable in light 
of concurrent modification. Instead of dropping those asserts, I would 
actually like to strengthen them and not only assert obj->is_XYZArray() 
but even assert obj->klass() == klass, with the appropriate precautions 
needed for GC/Shenandoah.
3. Avoid calling klass() twice in typeArrayOop::size() (JDK-8222537) 
which is actually a (minor) performance improvement.

Is this too much to ask?

Removing an otherwise good assert, and/or not allowing an assert in 
klass() seems to hurt our debugging capabilities. This doesn't make 
sense to me.

Roman



More information about the hotspot-gc-dev mailing list