Request for Review (s) - 8149343: assert(rp->num_q() == no_of_gc_workers) failed: sanity
Jon Masamitsu
jon.masamitsu at oracle.com
Wed Mar 16 22:42:06 UTC 2016
The change to re-initialize _next_id passed testing so is now
the proposed fix.
Delta from original fix:
http://cr.openjdk.java.net/~jmasa/8149343/webrev_delta.00_01/
Full fix version 01
http://cr.openjdk.java.net/~jmasa/8149343/webrev.01/
Thanks.
Jon
On 03/16/2016 08:51 AM, Jon Masamitsu wrote:
>
>
> On 03/16/2016 03:13 AM, Thomas Schatzl wrote:
>> Hi Jon,
>>
>> On Tue, 2016-03-15 at 15:19 -0700, Jon Masamitsu wrote:
>>> Thomas,
>>>
>>> Thanks for having a look.
>>>
>>> On 3/15/2016 5:12 AM, Thomas Schatzl wrote:
>>>> Hi Jon,
>>>>
>>>> On Thu, 2016-03-10 at 11:53 -0800, Jon Masamitsu wrote:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8149343
>>>>>
>>>>> The error here was that the number of active workers was
>>>>> not always set correctly for parallel Reference processing.
>>>>> The first fix was to set the number of active workers in the
>>>>> ReferenceProcessor in the task constructor. Once thatDiff
>>>>> was fixed a subsequent assertion failure occurred.
>>>>>
>>>>> # Internal Error (/export/jmasa/java/jdk9-rt
>>>>> -8149343/src/share/vm/gc/shared/referenceProcessor.cpp:884),
>>>>> pid=18444, tid=18500
>>>>> # assert(id < _max_num_q) failed: Id is out-of-bounds id 23 and
>>>>> max
>>>>> id 23)
>>>>>
>>>>> This was fixed by the change
>>>>>
>>>>> - if (++_next_id == _num_q) {
>>>>> + assert(!_discovery_is_mt, "Round robin should only be used
>>>>> in
>>>>> serial discovery");
>>>>> + if (++_next_id >= _num_q) {
>>>>>
>>>>> See the CR for an example log which showed _num_q changing
>>>>> values between different phases of a collection and where the
>>>>> value of _next_id was greater than _num_q.
>>>> I assume that the cause for the _next_id being greater than _num_q
>>>> is that in a previous GC the number of threads has been larger than
>>>> in the current.
>>> Yes. The log in the CR shows the case where _num_q was 23, then 2
>>> and then 23 again.
>> Then I understood it correctly :)
>>
>>>> While this change seems to work, I would kind of prefer that
>>>> _next_id is simply reset in some initialization. The greater or
>>>> equal seems surprising here.
>>> Early on I started to look for a place where I could add the
>>> initialization and decided not to. The _next_id is part of the
>>> reference processor and reference processor created and reused so
>>> there didn't seem to be a dependable place to put the initialization.
>>> One place that might work is to set it in the
>>> set_active_mt_degree()
>>>
>>> void set_active_mt_degree(uint v) { _num_q = v; }
>>>
>>> Try it?
>> I think this would be an appropriate location.
>
> As I said to Kim, I've implemented it and am testing it now. First round
> of testing on linux and G1 passsed but I want to test with the other
> GC's and on solaris sparcv9.
>
> Thanks.
>
> Jon
>
>>
>> As mentioned I would be okay with just a comment explaining the
>> situation like Kim, so if this is urgent and you are really busy,
>> don't.
>>
>> Thanks,
>> Thomas
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20160316/be32e120/attachment.htm>
More information about the hotspot-gc-dev
mailing list