RFR(XS): JDK-8129440 G1 crash during concurrent root region scan.

Siebenborn, Axel axel.siebenborn at sap.com
Fri Jul 10 07:38:57 UTC 2015



On 09.07.2015 at 18:50 Jon Masamitsu wrote:
>
> On 7/9/2015 1:29 AM, Siebenborn, Axel wrote:
>> On 06.07.2015 at 15:03 Volker Simonis wrote:
>>> On Tue, Jun 23, 2015 at 8:45 AM, Siebenborn, Axel
>>> <axel.siebenborn at sap.com> wrote:
>>>>
>>>> Hi Volker,
>>>>
>>>> thanks for your comment.Indeed, this was my first idea for a fix. However,
>>>> there are more than 50 callers of  load_heap_oop, plus some generated by
>>>> preprocessor defines like this one:
>>>>
>>>> #define DO_OOP_WORK_IMPL(closureName) \
>>>>    template <class T> inline void closureName##Closure::do_oop_work(T* p) {
>>>> \
>>>>      T heap_oop = oopDesc::load_heap_oop(p);               \
>>>>      if (!oopDesc::is_null(heap_oop)) {                    \
>>>>        oop obj = oopDesc::decode_heap_oop_not_null(heap_oop);       \
>>>>        do_oop(obj);                                        \
>>>>      }                                                     \
>>>>    }
>>>>
>>>> We would have to decide for each of the callers, if there might be a
>>>> concurrency problem or not.
>>>>
>>>> Do you think, it is worth the effort and the risk of missing one?
>>> OK, you're probably right.
>>>
>>> If you haven’t seen any performance regressions caused by this fix I'm
>>> fine with your solution.
>>>
>>> Unfortunately I can not push this shared-code change so can we please
>>> get a sponsor from within Oracle for this change?
>>>
>>> Thanks,
>>> Volker 
>> We did not see any performance regressions.
>> On Itanium compilers add release and acquire semantics to stores and loads of volatile variables. The additional memory barriers really could have a performance impact. But even on Itanium we don’t see any performance regression in our benchmarks.
>> On other platforms, optimizations prevented by the volatile qualifier are optimizations that we really don't want. 
>
> Did you do performance measurements on GC pause times?
>
> Jon
I checked the GC pause times in standard benchmarks for ParrallelGC, CMS and G1. I don't see regressions here.

Axel

>>  
>> Thanks,
>> Axel
>>>> Regards,
>>>>
>>>> Axel
>>>>
>>>> On 22.06.2015 at 18:17 Volker Simonis wrote:
>>>>> Hi Axel,
>>>>>
>>>>> the change looks good, but shouldn't we better provide two versions of
>>>>> load_heap_oop()
>>>>>
>>>>> inline oop       oopDesc::load_heap_oop(oop* p)
>>>>> inline oop       oopDesc::load_heap_oop(volatile oop* p)
>>>>>
>>>>> such that the callers can decide on their own which versions they require?
>>>>>
>>>>> Or do the callers of load_heap_oop() not always know if there exist
>>>>> concurrent mutator threads?
>>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>>
>>>>> On Mon, Jun 22, 2015 at 5:33 PM, Siebenborn, Axel
>>>>> <axel.siebenborn at sap.com> wrote:
>>>>>> Hi,
>>>>>> we have seen crashes with G1 during concurrent root region scan.
>>>>>> It turned out, that the reason for the crashes was reloading an oop after
>>>>>> a null check.
>>>>>> The compiler may legally do this, as the pointer is not declared as
>>>>>> volatile.
>>>>>> The code runs concurrently to mutator threads.
>>>>>>
>>>>>> template <class T>
>>>>>> inline void G1RootRegionScanClosure::do_oop_nv(T* p) {
>>>>>>    T heap_oop = oopDesc::load_heap_oop(p);
>>>>>> // 1. Load oop
>>>>>>    if (!oopDesc::is_null(heap_oop)) {
>>>>>>      oop obj = oopDesc::decode_heap_oop_not_null(heap_oop);    // 2.
>>>>>> Compiler decides to re-load the oop
>>>>>>      HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj);
>>>>>>      _cm->grayRoot(obj, obj->size(), _worker_id, hr);
>>>>>>    }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> We have seen the problem on AIX with the xlC compiler. However, we have
>>>>>> seen similar optimizations on other platforms and other platforms.
>>>>>>
>>>>>> As this code pattern is used quite often, I would suggest a global fix in
>>>>>> oopDesc::load_heap_oop(p).
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8129440
>>>>>> Webrev: http://cr.openjdk.java.net/~asiebenborn/8129440/webrev.00/
>>>>>>
>>>>>> Thanks,
>>>>>> Axel


More information about the hotspot-gc-dev mailing list