RFR: JDK-8134953: Make the GC ID available in a central place

Bengt Rutisson bengt.rutisson at oracle.com
Thu Sep 10 12:37:22 UTC 2015


Hi Jon,

Thanks for looking at this!

On 2015-09-10 01:00, Jon Masamitsu wrote:
> Bengt,
>
> When a CMS concurrent collection hands off to a STW foreground 
> collection,
> (done in acquire_control_and_collect()), I expected a new GCId would be
> used.  Did I miss it?   That STW collection does not go through 
> do_collection().

The call to acquire_control_and_collect() originates from 
GenCollectedHeap::collect_generation() and there is a new GCIdMark in 
there that will create a new GCId for the STW collection. That's the 
"extra" GCIdMark that Per asked about in his review.

*But* I wanted to verify that it worked properly since you asked about 
it and I realized that there is another bug.

The GCId that is set up for the concurrent cycle, the one set up in 
ConcurrentMarkSweepThread::run(), will is still active over the complete 
STW foreground collection. That's fine in my model since the new 
GCIdMark in GenCollectedHeap::collect_generation() will cache that GCId. 
But the ConcurrentMarkSweep thread is not completely idle even though 
control has been left to the foreground collection in 
acquire_control_and_collect(). So, there is some logging done (by fore 
example CMSPhaseAccounting) that is done at the same time as the 
foreground collection. This logging will now use the foreground GCId 
instead of the CMS GCId.

I haven't had time to dig in to the details of that just yet. But this 
is an unintended change of the logging, so I would like to fix it. 
Hopefully I'll have an updated webrev tomorrow.

Thanks,
Bengt


>
> Jon
>
> On 9/9/2015 7:40 AM, Bengt Rutisson wrote:
>>
>> Hi again,
>>
>> I found an issue with how G1 young collections kick off concurrent 
>> marks. There is a short window where we might have two active GC IDs 
>> at the same time. I've fixed this and updated the webrevs. In case 
>> anyone had already started looking at the webrevs you need to do a 
>> refresh now. The fix is in 
>> G1CollectedHeap::do_collection_pause_at_safepoint().
>>
>> Thanks,
>> Bengt
>>
>> On 09/09/15 13:38, Bengt Rutisson wrote:
>>>
>>> Hi Per,
>>>
>>> Thanks for looking at this!
>>>
>>> On 2015-09-08 15:23, Per Liden wrote:
>>>> Hi Bengt,
>>>>
>>>> On 2015-09-08 13:35, Bengt Rutisson wrote:
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> This is mostly a GC related patch, but it adds a field to the Thread
>>>>> class, so I'm sending this out on the broader hotspot-dev list.
>>>>>
>>>>> Could I have a couple of reviews for this patch?
>>>>>
>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8134953
>>>>>
>>>>
>>>> Looks good. I think this is a nice simplification, especially for 
>>>> G1's concurrent parts. Just two comments:
>>>>
>>>> genCollectedHeap.cpp:
>>>> ---------------------
>>>> - GenCollectedHeap::do_collection() has two GCIdMarks, in different 
>>>> scopes. Do we really want that second mark?
>>>
>>>
>>> We potentially do two GCs in GenCollectedHeap::do_collection(). 
>>> First a young GC and then potentially a full GC. These two should 
>>> have different GC IDs. So, yes, we need two GCIdMarks in this 
>>> method. The scoping could be better, but I think that is a 
>>> refactoring that should be done separately from my current patch. 
>>> I'll talk to Jesper about it since he has been cleaning up this code 
>>> lately.
>>>
>>>>
>>>>
>>>> gcId.cpp:
>>>> --------
>>>> - I think you might have left currentNamedthread() in there. You 
>>>> probably just want to use Thread::current() instead.
>>>
>>> Yes, good catch. Thanks.
>>>
>>> However, after thinking some more about the implications of moving 
>>> the GC ID out from the GCTracer I figured it would be possible to 
>>> just store the previous GC ID in the GCIdMark and then restore it in 
>>> the destructor of that class. That way we don't have to store 
>>> anything in the Thread class but can still have both a concurrent GC 
>>> ID and a STW GC ID active at the same time. This also removes the 
>>> need to copy the GC ID to the worker tasks.
>>>
>>> Here's an updated webrev with a solution that does not add anything 
>>> to the Thread class:
>>>
>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.02/
>>>
>>> And here's a diff compared to the last version:
>>>
>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01-02.diff/
>>>
>>> Unfotunately the webrev tool had some bad luck when creating the 
>>> diff webrev so at least the g1CollectedHeap.cpp seem to contain the 
>>> complete changes rather than just the diff compared to the 01 
>>> version. The rest of the diff looks correct as far as I can tell.
>>>
>>> Thanks,
>>> Bengt
>>>>
>>>>
>>>> cheers,
>>>> /Per
>>>>
>>>>
>>>>> Currently the GC ID, that is used for event tracing and logging, is
>>>>> stored in the GCTracer object. That has caused some minor problems 
>>>>> since
>>>>> there are multiple GCTracers active at the same time. The correct 
>>>>> GC IDs
>>>>> need to be passed around in many places.
>>>>>
>>>>> For some upcoming GC logging changes I would like to have a more
>>>>> consistent way of finding the currently active GC ID. I've played 
>>>>> around
>>>>> with a couple of different solutions, but eventually I found that the
>>>>> simplest solution is to store the current GC ID in the thread. 
>>>>> That way
>>>>> there is a single way to look it up in all places. It is also fairly
>>>>> straight forward to set it up.
>>>>>
>>>>> I've reworked the GCId class a bit to support this and I've 
>>>>> introduced a
>>>>> GCIdMark class that is a scoped object that helps to set up and tear
>>>>> down a current GC ID. By moving the GC ID out from the GCTracer 
>>>>> class I
>>>>> got rid of a few minor issues as well. One is that we no longer 
>>>>> need to
>>>>> keep track of the G1 "aborted concurrent GC ID". That was necessary
>>>>> before since we did logging *after* we had told the corresponding
>>>>> GCTracer that the concurrent cycle was aborted and it had thrown 
>>>>> its GC
>>>>> ID away. Now the GC ID can stay in scope until we have completed all
>>>>> logging.
>>>>>
>>>>> Also the HeapDumpBeforeFullGC and PrintClassHistogramBeforeFullGC 
>>>>> used
>>>>> to have to create a new GC ID since their logging was done before 
>>>>> we had
>>>>> the correct GCTracer set up. Now the GC ID can be available early 
>>>>> enough
>>>>> for this logging to be using the same (and correct) GC ID as the 
>>>>> rest of
>>>>> the GC. Same for HeapDumpAfterFullGC and 
>>>>> PrintClassHistogramAfterFullGC.
>>>>>
>>>>> I've added an uint to the Thread class to keep track of the currently
>>>>> active GC ID. In the current code there are actually only 
>>>>> NamedThreads
>>>>> that need the GC ID. So, I'm not sure where the best place is for 
>>>>> this
>>>>> field is. But it seems like the Thread class contains most of the 
>>>>> data,
>>>>> even data that is only used by some subclasses, so I opted for 
>>>>> putting
>>>>> the field in Thread as opposed to in NamedThread. This opens up for
>>>>> possible future extensions when Java threads may do part of the GC 
>>>>> work.
>>>>> However, I'm open to moving the field down to NamedThread if that is
>>>>> seen as better.
>>>>>
>>>>> In the GCTracer class there were many asserts that were checking that
>>>>> the GC ID was properly set up. Mostly these assert verify that 
>>>>> start/end
>>>>> is called correctly and that the other methods are called inside 
>>>>> of the
>>>>> start/end scope. Now that the GC ID is moved out of the GCTracer 
>>>>> class
>>>>> these asserts had to be dealt with. I figured there were three 
>>>>> ways to
>>>>> handle it; just remove them, replace them with check that the GC 
>>>>> ID from
>>>>> the current thread is correct, or implement a new status field to 
>>>>> keep
>>>>> track of GC start/end state. Personally I'm not sure these asserts 
>>>>> are
>>>>> valuable enough to spend time on, so I went for the first approach:
>>>>> removing them. I don't think making them use the thread GC ID will be
>>>>> appropriate. But if anyone feels strongly about these asserts I can
>>>>> implement a separate start/end state.
>>>>>
>>>>> There are a few "Tasks" in the GC code that are executed by worker
>>>>> threads. To make the worker threads use the correct GC ID I make sure
>>>>> that the Task constructors copy the GC ID from the initiating thread
>>>>> into a local variable. When the task is executed in its worker 
>>>>> thread it
>>>>> sets up the GC ID based on the local variable.
>>>>>
>>>>> The proposed change does not alter any logging (except for the 
>>>>> small bug
>>>>> fix for
>>>>> HeapDumpBefore(After)FullGC/PrintClassHistogramBefore(After)FullGC. This 
>>>>>
>>>>> means that no existing tests need to be updated. In particular the
>>>>> test/gc/logging/TestGCId.java test passes even after these changes.
>>>>>
>>>>>
>>>>> A big thanks to Per, who pre-reviewed these changes and came with 
>>>>> some
>>>>> really good feedback.
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>
>>
>



More information about the hotspot-dev mailing list