RFR: 8153745: Avoid spawning G1ParPreserveCMReferentsTask when there is no work to be done

Tue Apr 12 11:58:25 UTC 2016

On 2016-04-11 19:58, Jon Masamitsu wrote:
>
>
> On 04/11/2016 08:36 AM, Stefan Johansson wrote:
>> Thanks for looking at this Thomas,
>>
>> On 2016-04-08 13:45, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On Fri, 2016-04-08 at 13:10 +0200, Stefan Johansson wrote:
>>>> Hi,
>>>>
>>>> Please review this enhancement to avoid spawning tasks when there is
>>>> no
>>>> work to do:
>>>> https://bugs.openjdk.java.net/browse/JDK-8153745
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.00/
>>>>
>>>> Summary:
>>>> During post_evacuate_collection_set the call to preserve_cm_referents
>>>> ensures that no referents discovered by the concurrent reference
>>>> processor are lost. This is done unconditionally and sometimes a lot
>>>> of
>>>> tasks are spawned just to find that there is no work to be done. This
>>>> enhancement avoids the unnecessary work by first checking that the
>>>> cm-ref-processor has at least some discovered reference to take care
>>>> of.
>>>    - I would prefer if the referenceprocessor would track whether there
>>> are any references in it by itself by setting a flag while adding a
>>> reference. I would think that this would be noise in gathering
>>> references anyway.
>>> While I do not think it is very problematic, consider machines having
>>> thousands of threads (which is reality right now).
>>>
>>> Otherwise at least, if there is no way that any references could have
>>> been enqueued because concurrent marking is not running, add an early
>>> exit.
>> I'm leaning towards the second alternative. I've tried both and doing 
>> the book-keeping isn't to bad but it will affect all GCs since the 
>> reference processor is shared and it is only G1 that cares about 
>> this. I've instead updated the patch with an extra check that makes 
>> sure that we only look through the lists when a concurrent cycle is 
>> under way.
>>>    - the code should not add timing information in case there has been
>>> no work. It will automatically show "skipped" in that case. I.e. please
>>> move the timing stuff inside the if-block.
>>>
>>> There is a difference between taking "0.0" time and not having executed
>>> that is sometimes interesting to know.
>> I did discuss this with Bengt before sending out the patch and for 
>> the debug level we currently don't have any logic for "skipped" 
>> phases. If turning on logging on trace-level the output will now look 
>> like this:
>> [1,241s][info ][gc,phases     ] GC(9)   Other: 1,0ms
>> [1,241s][debug][gc,phases     ] GC(9)     Choose CSet: 0,0ms
>> [1,241s][debug][gc,phases     ] GC(9)     Preserve CM Refs: 0,0ms
>> [1,241s][trace][gc,phases     ] GC(9)       Parallel Preserve CM Refs 
>> (ms): skipped
>> [1,241s][debug][gc,phases     ] GC(9)     Reference Processing: 0,4ms
>> [1,241s][debug][gc,phases     ] GC(9)     Reference Enqueuing: 0,0ms
>>
>> I think this is ok, but we might want to look at adding logic for 
>> "skipping" debug phases as well.
>>
>> New webrevs:
>> Full: http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.01/
>> Inc: http://cr.openjdk.java.net/~sjohanss/8153745/hotspot.00-01/
>
> Stefan,
>
> This is a ignorable comment because it's about an enhancement.
>
> I think the next thing that we will want is to set the number of GC 
> threads based
> on the number of references that need to be processed.
>
> 4468 if (concurrent_mark()->cmThread()->during_cycle() &&
> 4469 ref_processor_cm()->has_discovered_references()) {
> 4470     double preserve_cm_referents_start = os::elapsedTime();
> 4471     uint no_of_gc_workers = workers()->active_workers();
> 4472     G1ParPreserveCMReferentsTask keep_cm_referents(this,
> 4473 per_thread_states,
> 4474 no_of_gc_workers,
> 4475 _task_queues);
> 4476     workers()->run_task(&keep_cm_referents);
> 4477     preserve_cm_referents_time = os::elapsedTime() - 
> preserve_cm_referents_start;
> 4478   }
>
>
> For that enhancement I think we will want from the reference processor 
> the
> maximum references of any kind that needs to be processed.  When I say
> of any kind, I mean that the reference processing is done by type so 
> we want
> to choose that number of GC workers based on the largest number that
> they will have to process.  Instead of has_discovered_references() I
> would choose to implement max_discovered_references().  Then the
> testat 4469  would be
>
> if (ref_processor_cm()->max_discovered_references() > 0 ) {
>
> and  later we could do something like
>
> GC_workers_by_max_of_work = max_discovered_references() /
>     carefully-chosen-scale-factor
>
> Again totally ignorable.
>
Hi Jon,

We have had similar discussion here in the office and there are further 
improvements that can be done, but I'm not sure they are as straight 
forward as first anticipated. Even if we know the number of lists that 
have references, we also need to know what queues to really take 
advantage of the information. Currently the task preserving the 
referents are picking lists based on their id, and scanning through all 
possible lists.

Feel free to file a follow up RFE to keep track of your suggestions.

Thanks,
Stefan

> Jon
>
>>
>> Thanks,
>> Stefan
>>>
>>> Thanks,
>>>    Thomas
>>>
>>
>