RFR(XS): JDK-8129440 G1 crash during concurrent root region scan.

Thu Jul 9 08:29:51 UTC 2015

On 06.07.2015 at 15:03 Volker Simonis wrote:
> On Tue, Jun 23, 2015 at 8:45 AM, Siebenborn, Axel
> <axel.siebenborn at sap.com> wrote:
>>
>>
>> Hi Volker,
>>
>> thanks for your comment.Indeed, this was my first idea for a fix. However,
>> there are more than 50 callers of  load_heap_oop, plus some generated by
>> preprocessor defines like this one:
>>
>> #define DO_OOP_WORK_IMPL(closureName) \
>>   template <class T> inline void closureName##Closure::do_oop_work(T* p) {
>> \
>>     T heap_oop = oopDesc::load_heap_oop(p);               \
>>     if (!oopDesc::is_null(heap_oop)) {                    \
>>       oop obj = oopDesc::decode_heap_oop_not_null(heap_oop);       \
>>       do_oop(obj);                                        \
>>     }                                                     \
>>   }
>>
>> We would have to decide for each of the callers, if there might be a
>> concurrency problem or not.
>>
>> Do you think, it is worth the effort and the risk of missing one?
>>
>
> OK, you're probably right.
>
> If you haven’t seen any performance regressions caused by this fix I'm
> fine with your solution.
>
> Unfortunately I can not push this shared-code change so can we please
> get a sponsor from within Oracle for this change?
>
> Thanks,
> Volker

We did not see any performance regressions.
On Itanium compilers add release and acquire semantics to stores and loads of volatile variables. The additional memory barriers really could have a performance impact. But even on Itanium we don’t see any performance regression in our benchmarks.
On other platforms, optimizations prevented by the volatile qualifier are optimizations that we really don't want.    

Thanks,
Axel

>>
>> Regards,
>>
>> Axel
>>
>> On 22.06.2015 at 18:17 Volker Simonis wrote:
>>> Hi Axel,
>>>
>>> the change looks good, but shouldn't we better provide two versions of
>>> load_heap_oop()
>>>
>>> inline oop       oopDesc::load_heap_oop(oop* p)
>>> inline oop       oopDesc::load_heap_oop(volatile oop* p)
>>>
>>> such that the callers can decide on their own which versions they require?
>>>
>>> Or do the callers of load_heap_oop() not always know if there exist
>>> concurrent mutator threads?
>>>
>>> Regards,
>>> Volker
>>>
>>>
>>> On Mon, Jun 22, 2015 at 5:33 PM, Siebenborn, Axel
>>> <axel.siebenborn at sap.com> wrote:
>>>> Hi,
>>>> we have seen crashes with G1 during concurrent root region scan.
>>>> It turned out, that the reason for the crashes was reloading an oop after
>>>> a null check.
>>>> The compiler may legally do this, as the pointer is not declared as
>>>> volatile.
>>>> The code runs concurrently to mutator threads.
>>>>
>>>> template <class T>
>>>> inline void G1RootRegionScanClosure::do_oop_nv(T* p) {
>>>>   T heap_oop = oopDesc::load_heap_oop(p);
>>>> // 1. Load oop
>>>>   if (!oopDesc::is_null(heap_oop)) {
>>>>     oop obj = oopDesc::decode_heap_oop_not_null(heap_oop);    // 2.
>>>> Compiler decides to re-load the oop
>>>>     HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj);
>>>>     _cm->grayRoot(obj, obj->size(), _worker_id, hr);
>>>>   }
>>>> }
>>>>
>>>>
>>>> We have seen the problem on AIX with the xlC compiler. However, we have
>>>> seen similar optimizations on other platforms and other platforms.
>>>>
>>>> As this code pattern is used quite often, I would suggest a global fix in
>>>> oopDesc::load_heap_oop(p).
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8129440
>>>> Webrev: http://cr.openjdk.java.net/~asiebenborn/8129440/webrev.00/
>>>>
>>>> Thanks,
>>>> Axel
>>>>