RFR: 8153745: Avoid spawning G1ParPreserveCMReferentsTask when there is no work to be done

Tue Apr 12 18:47:25 UTC 2016

On 4/12/2016 4:58 AM, Stefan Johansson wrote:
>
> On 2016-04-11 19:58, Jon Masamitsu wrote:
>>
>>
>> On 04/11/2016 08:36 AM, Stefan Johansson wrote:
>>> Thanks for looking at this Thomas,
>>>
>>> On 2016-04-08 13:45, Thomas Schatzl wrote:
>>>> Hi,
>>>>
>>>> On Fri, 2016-04-08 at 13:10 +0200, Stefan Johansson wrote:
>>>>> Hi,
>>>>>
>>>>> Please review this enhancement to avoid spawning tasks when there is
>>>>> no
>>>>> work to do:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153745
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.00/
>>>>>
>>>>> Summary:
>>>>> During post_evacuate_collection_set the call to preserve_cm_referents
>>>>> ensures that no referents discovered by the concurrent reference
>>>>> processor are lost. This is done unconditionally and sometimes a lot
>>>>> of
>>>>> tasks are spawned just to find that there is no work to be done. This
>>>>> enhancement avoids the unnecessary work by first checking that the
>>>>> cm-ref-processor has at least some discovered reference to take care
>>>>> of.
>>>>    - I would prefer if the referenceprocessor would track whether 
>>>> there
>>>> are any references in it by itself by setting a flag while adding a
>>>> reference. I would think that this would be noise in gathering
>>>> references anyway.
>>>> While I do not think it is very problematic, consider machines having
>>>> thousands of threads (which is reality right now).
>>>>
>>>> Otherwise at least, if there is no way that any references could have
>>>> been enqueued because concurrent marking is not running, add an early
>>>> exit.
>>> I'm leaning towards the second alternative. I've tried both and 
>>> doing the book-keeping isn't to bad but it will affect all GCs since 
>>> the reference processor is shared and it is only G1 that cares about 
>>> this. I've instead updated the patch with an extra check that makes 
>>> sure that we only look through the lists when a concurrent cycle is 
>>> under way.
>>>>    - the code should not add timing information in case there has been
>>>> no work. It will automatically show "skipped" in that case. I.e. 
>>>> please
>>>> move the timing stuff inside the if-block.
>>>>
>>>> There is a difference between taking "0.0" time and not having 
>>>> executed
>>>> that is sometimes interesting to know.
>>> I did discuss this with Bengt before sending out the patch and for 
>>> the debug level we currently don't have any logic for "skipped" 
>>> phases. If turning on logging on trace-level the output will now 
>>> look like this:
>>> [1,241s][info ][gc,phases     ] GC(9)   Other: 1,0ms
>>> [1,241s][debug][gc,phases     ] GC(9)     Choose CSet: 0,0ms
>>> [1,241s][debug][gc,phases     ] GC(9)     Preserve CM Refs: 0,0ms
>>> [1,241s][trace][gc,phases     ] GC(9)       Parallel Preserve CM 
>>> Refs (ms): skipped
>>> [1,241s][debug][gc,phases     ] GC(9)     Reference Processing: 0,4ms
>>> [1,241s][debug][gc,phases     ] GC(9)     Reference Enqueuing: 0,0ms
>>>
>>> I think this is ok, but we might want to look at adding logic for 
>>> "skipping" debug phases as well.
>>>
>>> New webrevs:
>>> Full: http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.01/
>>> Inc: http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.00-01/
>>
>> Stefan,
>>
>> This is a ignorable comment because it's about an enhancement.
>>
>> I think the next thing that we will want is to set the number of GC 
>> threads based
>> on the number of references that need to be processed.
>>
>> 4468 if (concurrent_mark()->cmThread()->during_cycle() &&
>> 4469 ref_processor_cm()->has_discovered_references()) {
>> 4470     double preserve_cm_referents_start = os::elapsedTime();
>> 4471     uint no_of_gc_workers = workers()->active_workers();
>> 4472     G1ParPreserveCMReferentsTask keep_cm_referents(this,
>> 4473 per_thread_states,
>> 4474 no_of_gc_workers,
>> 4475 _task_queues);
>> 4476     workers()->run_task(&keep_cm_referents);
>> 4477     preserve_cm_referents_time = os::elapsedTime() - 
>> preserve_cm_referents_start;
>> 4478   }
>>
>>
>> For that enhancement I think we will want from the reference 
>> processor the
>> maximum references of any kind that needs to be processed.  When I say
>> of any kind, I mean that the reference processing is done by type so 
>> we want
>> to choose that number of GC workers based on the largest number that
>> they will have to process.  Instead of has_discovered_references() I
>> would choose to implement max_discovered_references().  Then the
>> testat 4469  would be
>>
>> if (ref_processor_cm()->max_discovered_references() > 0 ) {
>>
>> and  later we could do something like
>>
>> GC_workers_by_max_of_work = max_discovered_references() /
>>     carefully-chosen-scale-factor
>>
>> Again totally ignorable.
>>
> Hi Jon,
>
> We have had similar discussion here in the office and there are 
> further improvements that can be done, but I'm not sure they are as 
> straight forward as first anticipated. Even if we know the number of 
> lists that have references, we also need to know what queues to really 
> take advantage of the information. Currently the task preserving the 
> referents are picking lists based on their id, and scanning through 
> all possible lists.
>
> Feel free to file a follow up RFE to keep track of your suggestions.

Sounds like you have this in hand.  Thanks.

Jon
>
> Thanks,
> Stefan
>
>> Jon
>>
>>>
>>> Thanks,
>>> Stefan
>>>>
>>>> Thanks,
>>>>    Thomas
>>>>
>>>
>>
>