RFR: JDK-8134953: Make the GC ID available in a central place

Bengt Rutisson bengt.rutisson at oracle.com
Tue Sep 29 15:57:48 UTC 2015

On 2015-09-29 17:59, Jon Masamitsu wrote:
> On 9/29/15 1:09 AM, Bengt Rutisson wrote:
>> Hi Jon,
>> On 2015-09-28 19:19, Jon Masamitsu wrote:
>>> On 09/28/2015 06:21 AM, Bengt Rutisson wrote:
>>>> Hi Jon and Per,
>>>> I've been trying to get the version proposed in webrev.02 to work 
>>>> but I ran into a show stopper for this approach. The G1 concurrent 
>>>> mark thread is at times executing (and logging) at the same time as 
>>>> a young or mixed GC is executing. There is no good way of handling 
>>>> this with a common place to store the GC ID. Instead I've decided 
>>>> to go back to storing the current GC ID in the thread. That way the 
>>>> G1 concurrent marking and the young/mixed GCs will have their own 
>>>> GC IDs.
>>>> Here's an updated webrev:
>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.03/
>>> Bengt,
>>> The _gc_id is needed in a JavaThread because a JavaThread
>>> might do work for G1 (concurrent refinement for example) and
>>> needs a GC ID?
>> The Java threads currently don't log anything when they do GC work. 
>> But in the future it is likely that they will.
>>> But a WatcherThread would never need one,
>>> right?
>> Right. Initially I only added the _gc_id field to NamedThread. But 
>> Per thought it would be better to have it in Thread. Both for making 
>> it possible for future improvements (such as logging GC work from 
>> JavaThreads) and because it seems like we normally keep storage in 
>> the Thread class even for things that are not used by all subclasses. 
>> For example TLABs and allocated_bytes() are only relevant for 
>> JavaThreads but have their storage in Thread.
> Yes, Thread could be improved but I'm not sure I can stand adding GC 
> data where
> it's not needed.   Unless GC logging from JavaThread is something 
> that's going to
> happen soon, I'd rather add it when it's needed rather than right 
> now.   I'd personally
> rather duplicate the code in JavaThread if it is needed rather than 
> have WatcherThreads
> have a _gc_id.

I tend to agree. I'll discuss this with Per tomorrow. If he agrees that 
we can move the _gc_id down to NamedThread, are you ok with me pushing 
the fix?


> Jon
>> Thanks,
>> Bengt
>>> Jon
>>>> This is the same as webrev.01 that both of you already looked at. 
>>>> I've made two minor adjustments:
>>>> - I removed the currentNamedthread() function in gcId.cpp that Per 
>>>> pointed out in his review of webrev.01. Instead I use 
>>>> Thread::current() directly.
>>>> - I made GCIdMark a StackObj.
>>>> Otherwise the patch is the same as in webrev.01.
>>>> Thanks,
>>>> Bengt
>>>> On 2015-09-10 14:37, Bengt Rutisson wrote:
>>>>> Hi Jon,
>>>>> Thanks for looking at this!
>>>>> On 2015-09-10 01:00, Jon Masamitsu wrote:
>>>>>> Bengt,
>>>>>> When a CMS concurrent collection hands off to a STW foreground 
>>>>>> collection,
>>>>>> (done in acquire_control_and_collect()), I expected a new GCId 
>>>>>> would be
>>>>>> used.  Did I miss it?   That STW collection does not go through 
>>>>>> do_collection().
>>>>> The call to acquire_control_and_collect() originates from 
>>>>> GenCollectedHeap::collect_generation() and there is a new GCIdMark 
>>>>> in there that will create a new GCId for the STW collection. 
>>>>> That's the "extra" GCIdMark that Per asked about in his review.
>>>>> *But* I wanted to verify that it worked properly since you asked 
>>>>> about it and I realized that there is another bug.
>>>>> The GCId that is set up for the concurrent cycle, the one set up 
>>>>> in ConcurrentMarkSweepThread::run(), will is still active over the 
>>>>> complete STW foreground collection. That's fine in my model since 
>>>>> the new GCIdMark in GenCollectedHeap::collect_generation() will 
>>>>> cache that GCId. But the ConcurrentMarkSweep thread is not 
>>>>> completely idle even though control has been left to the 
>>>>> foreground collection in acquire_control_and_collect(). So, there 
>>>>> is some logging done (by fore example CMSPhaseAccounting) that is 
>>>>> done at the same time as the foreground collection. This logging 
>>>>> will now use the foreground GCId instead of the CMS GCId.
>>>>> I haven't had time to dig in to the details of that just yet. But 
>>>>> this is an unintended change of the logging, so I would like to 
>>>>> fix it. Hopefully I'll have an updated webrev tomorrow.
>>>>> Thanks,
>>>>> Bengt
>>>>>> Jon
>>>>>> On 9/9/2015 7:40 AM, Bengt Rutisson wrote:
>>>>>>> Hi again,
>>>>>>> I found an issue with how G1 young collections kick off 
>>>>>>> concurrent marks. There is a short window where we might have 
>>>>>>> two active GC IDs at the same time. I've fixed this and updated 
>>>>>>> the webrevs. In case anyone had already started looking at the 
>>>>>>> webrevs you need to do a refresh now. The fix is in 
>>>>>>> G1CollectedHeap::do_collection_pause_at_safepoint().
>>>>>>> Thanks,
>>>>>>> Bengt
>>>>>>> On 09/09/15 13:38, Bengt Rutisson wrote:
>>>>>>>> Hi Per,
>>>>>>>> Thanks for looking at this!
>>>>>>>> On 2015-09-08 15:23, Per Liden wrote:
>>>>>>>>> Hi Bengt,
>>>>>>>>> On 2015-09-08 13:35, Bengt Rutisson wrote:
>>>>>>>>>> Hi everyone,
>>>>>>>>>> This is mostly a GC related patch, but it adds a field to the 
>>>>>>>>>> Thread
>>>>>>>>>> class, so I'm sending this out on the broader hotspot-dev list.
>>>>>>>>>> Could I have a couple of reviews for this patch?
>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01/
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8134953
>>>>>>>>> Looks good. I think this is a nice simplification, especially 
>>>>>>>>> for G1's concurrent parts. Just two comments:
>>>>>>>>> genCollectedHeap.cpp:
>>>>>>>>> ---------------------
>>>>>>>>> - GenCollectedHeap::do_collection() has two GCIdMarks, in 
>>>>>>>>> different scopes. Do we really want that second mark?
>>>>>>>> We potentially do two GCs in GenCollectedHeap::do_collection(). 
>>>>>>>> First a young GC and then potentially a full GC. These two 
>>>>>>>> should have different GC IDs. So, yes, we need two GCIdMarks in 
>>>>>>>> this method. The scoping could be better, but I think that is a 
>>>>>>>> refactoring that should be done separately from my current 
>>>>>>>> patch. I'll talk to Jesper about it since he has been cleaning 
>>>>>>>> up this code lately.
>>>>>>>>> gcId.cpp:
>>>>>>>>> --------
>>>>>>>>> - I think you might have left currentNamedthread() in there. 
>>>>>>>>> You probably just want to use Thread::current() instead.
>>>>>>>> Yes, good catch. Thanks.
>>>>>>>> However, after thinking some more about the implications of 
>>>>>>>> moving the GC ID out from the GCTracer I figured it would be 
>>>>>>>> possible to just store the previous GC ID in the GCIdMark and 
>>>>>>>> then restore it in the destructor of that class. That way we 
>>>>>>>> don't have to store anything in the Thread class but can still 
>>>>>>>> have both a concurrent GC ID and a STW GC ID active at the same 
>>>>>>>> time. This also removes the need to copy the GC ID to the 
>>>>>>>> worker tasks.
>>>>>>>> Here's an updated webrev with a solution that does not add 
>>>>>>>> anything to the Thread class:
>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.02/
>>>>>>>> And here's a diff compared to the last version:
>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01-02.diff/
>>>>>>>> Unfotunately the webrev tool had some bad luck when creating 
>>>>>>>> the diff webrev so at least the g1CollectedHeap.cpp seem to 
>>>>>>>> contain the complete changes rather than just the diff compared 
>>>>>>>> to the 01 version. The rest of the diff looks correct as far as 
>>>>>>>> I can tell.
>>>>>>>> Thanks,
>>>>>>>> Bengt
>>>>>>>>> cheers,
>>>>>>>>> /Per
>>>>>>>>>> Currently the GC ID, that is used for event tracing and 
>>>>>>>>>> logging, is
>>>>>>>>>> stored in the GCTracer object. That has caused some minor 
>>>>>>>>>> problems since
>>>>>>>>>> there are multiple GCTracers active at the same time. The 
>>>>>>>>>> correct GC IDs
>>>>>>>>>> need to be passed around in many places.
>>>>>>>>>> For some upcoming GC logging changes I would like to have a more
>>>>>>>>>> consistent way of finding the currently active GC ID. I've 
>>>>>>>>>> played around
>>>>>>>>>> with a couple of different solutions, but eventually I found 
>>>>>>>>>> that the
>>>>>>>>>> simplest solution is to store the current GC ID in the 
>>>>>>>>>> thread. That way
>>>>>>>>>> there is a single way to look it up in all places. It is also 
>>>>>>>>>> fairly
>>>>>>>>>> straight forward to set it up.
>>>>>>>>>> I've reworked the GCId class a bit to support this and I've 
>>>>>>>>>> introduced a
>>>>>>>>>> GCIdMark class that is a scoped object that helps to set up 
>>>>>>>>>> and tear
>>>>>>>>>> down a current GC ID. By moving the GC ID out from the 
>>>>>>>>>> GCTracer class I
>>>>>>>>>> got rid of a few minor issues as well. One is that we no 
>>>>>>>>>> longer need to
>>>>>>>>>> keep track of the G1 "aborted concurrent GC ID". That was 
>>>>>>>>>> necessary
>>>>>>>>>> before since we did logging *after* we had told the 
>>>>>>>>>> corresponding
>>>>>>>>>> GCTracer that the concurrent cycle was aborted and it had 
>>>>>>>>>> thrown its GC
>>>>>>>>>> ID away. Now the GC ID can stay in scope until we have 
>>>>>>>>>> completed all
>>>>>>>>>> logging.
>>>>>>>>>> Also the HeapDumpBeforeFullGC and 
>>>>>>>>>> PrintClassHistogramBeforeFullGC used
>>>>>>>>>> to have to create a new GC ID since their logging was done 
>>>>>>>>>> before we had
>>>>>>>>>> the correct GCTracer set up. Now the GC ID can be available 
>>>>>>>>>> early enough
>>>>>>>>>> for this logging to be using the same (and correct) GC ID as 
>>>>>>>>>> the rest of
>>>>>>>>>> the GC. Same for HeapDumpAfterFullGC and 
>>>>>>>>>> PrintClassHistogramAfterFullGC.
>>>>>>>>>> I've added an uint to the Thread class to keep track of the 
>>>>>>>>>> currently
>>>>>>>>>> active GC ID. In the current code there are actually only 
>>>>>>>>>> NamedThreads
>>>>>>>>>> that need the GC ID. So, I'm not sure where the best place is 
>>>>>>>>>> for this
>>>>>>>>>> field is. But it seems like the Thread class contains most of 
>>>>>>>>>> the data,
>>>>>>>>>> even data that is only used by some subclasses, so I opted 
>>>>>>>>>> for putting
>>>>>>>>>> the field in Thread as opposed to in NamedThread. This opens 
>>>>>>>>>> up for
>>>>>>>>>> possible future extensions when Java threads may do part of 
>>>>>>>>>> the GC work.
>>>>>>>>>> However, I'm open to moving the field down to NamedThread if 
>>>>>>>>>> that is
>>>>>>>>>> seen as better.
>>>>>>>>>> In the GCTracer class there were many asserts that were 
>>>>>>>>>> checking that
>>>>>>>>>> the GC ID was properly set up. Mostly these assert verify 
>>>>>>>>>> that start/end
>>>>>>>>>> is called correctly and that the other methods are called 
>>>>>>>>>> inside of the
>>>>>>>>>> start/end scope. Now that the GC ID is moved out of the 
>>>>>>>>>> GCTracer class
>>>>>>>>>> these asserts had to be dealt with. I figured there were 
>>>>>>>>>> three ways to
>>>>>>>>>> handle it; just remove them, replace them with check that the 
>>>>>>>>>> GC ID from
>>>>>>>>>> the current thread is correct, or implement a new status 
>>>>>>>>>> field to keep
>>>>>>>>>> track of GC start/end state. Personally I'm not sure these 
>>>>>>>>>> asserts are
>>>>>>>>>> valuable enough to spend time on, so I went for the first 
>>>>>>>>>> approach:
>>>>>>>>>> removing them. I don't think making them use the thread GC ID 
>>>>>>>>>> will be
>>>>>>>>>> appropriate. But if anyone feels strongly about these asserts 
>>>>>>>>>> I can
>>>>>>>>>> implement a separate start/end state.
>>>>>>>>>> There are a few "Tasks" in the GC code that are executed by 
>>>>>>>>>> worker
>>>>>>>>>> threads. To make the worker threads use the correct GC ID I 
>>>>>>>>>> make sure
>>>>>>>>>> that the Task constructors copy the GC ID from the initiating 
>>>>>>>>>> thread
>>>>>>>>>> into a local variable. When the task is executed in its 
>>>>>>>>>> worker thread it
>>>>>>>>>> sets up the GC ID based on the local variable.
>>>>>>>>>> The proposed change does not alter any logging (except for 
>>>>>>>>>> the small bug
>>>>>>>>>> fix for
>>>>>>>>>> HeapDumpBefore(After)FullGC/PrintClassHistogramBefore(After)FullGC. 
>>>>>>>>>> This
>>>>>>>>>> means that no existing tests need to be updated. In 
>>>>>>>>>> particular the
>>>>>>>>>> test/gc/logging/TestGCId.java test passes even after these 
>>>>>>>>>> changes.
>>>>>>>>>> A big thanks to Per, who pre-reviewed these changes and came 
>>>>>>>>>> with some
>>>>>>>>>> really good feedback.
>>>>>>>>>> Thanks,
>>>>>>>>>> Bengt

More information about the hotspot-dev mailing list