RFR: JDK-8134953: Make the GC ID available in a central place

Jon Masamitsu jon.masamitsu at oracle.com
Wed Sep 30 16:52:24 UTC 2015



On 9/30/2015 12:20 AM, Bengt Rutisson wrote:
>
>
> On 2015-09-29 18:08, Jon Masamitsu wrote:
>>
>>
>> On 9/29/15 8:57 AM, Bengt Rutisson wrote:
>>>
>>>
>>> On 2015-09-29 17:59, Jon Masamitsu wrote:
>>>>
>>>>
>>>> On 9/29/15 1:09 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi Jon,
>>>>>
>>>>> On 2015-09-28 19:19, Jon Masamitsu wrote:
>>>>>>
>>>>>>
>>>>>> On 09/28/2015 06:21 AM, Bengt Rutisson wrote:
>>>>>>>
>>>>>>> Hi Jon and Per,
>>>>>>>
>>>>>>> I've been trying to get the version proposed in webrev.02 to 
>>>>>>> work but I ran into a show stopper for this approach. The G1 
>>>>>>> concurrent mark thread is at times executing (and logging) at 
>>>>>>> the same time as a young or mixed GC is executing. There is no 
>>>>>>> good way of handling this with a common place to store the GC 
>>>>>>> ID. Instead I've decided to go back to storing the current GC ID 
>>>>>>> in the thread. That way the G1 concurrent marking and the 
>>>>>>> young/mixed GCs will have their own GC IDs.
>>>>>>>
>>>>>>> Here's an updated webrev:
>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.03/
>>>>>> Bengt,
>>>>>>
>>>>>> The _gc_id is needed in a JavaThread because a JavaThread
>>>>>> might do work for G1 (concurrent refinement for example) and
>>>>>> needs a GC ID?
>>>>>
>>>>> The Java threads currently don't log anything when they do GC 
>>>>> work. But in the future it is likely that they will.
>>>>>
>>>>>> But a WatcherThread would never need one,
>>>>>> right?
>>>>>
>>>>> Right. Initially I only added the _gc_id field to NamedThread. But 
>>>>> Per thought it would be better to have it in Thread. Both for 
>>>>> making it possible for future improvements (such as logging GC 
>>>>> work from JavaThreads) and because it seems like we normally keep 
>>>>> storage in the Thread class even for things that are not used by 
>>>>> all subclasses. For example TLABs and allocated_bytes() are only 
>>>>> relevant for JavaThreads but have their storage in Thread.
>>>>
>>>> Yes, Thread could be improved but I'm not sure I can stand adding 
>>>> GC data where
>>>> it's not needed.   Unless GC logging from JavaThread is something 
>>>> that's going to
>>>> happen soon, I'd rather add it when it's needed rather than right 
>>>> now.   I'd personally
>>>> rather duplicate the code in JavaThread if it is needed rather than 
>>>> have WatcherThreads
>>>> have a _gc_id.
>>>
>>> I tend to agree. I'll discuss this with Per tomorrow. If he agrees 
>>> that we can move the _gc_id down to NamedThread, are you ok with me 
>>> pushing the fix?
>>
>> Yes,  I'm good with the fix then.
>
> Thanks Jon!
>
> I talked to Per and moved the _gc_id field down to NamedThread. 
> Pushing it now.
>
>>
>> Would it be appropriate for me to file a CR to change NamedThread to 
>> something
>> like GCThread?  Or ???
>
>
> I don't think GCThread is the right name since VMThread is a 
> NamedThread and it can schedule tasks that are not GC related. I agree 
> that NamedThread is not a great name. But I actually think that the 
> naming in the Thread hierarchy is just part of the problem. The real 
> problem is that the division of responsibility is not really correctly 
> partitioned among the classes.
>
> If we should file a RFE I think it should be to clean up the Thread 
> hierarchy, including finding good names for the classes.

That's fine.  I created

https://bugs.openjdk.java.net/browse/JDK-8138655

Jon

>
> Bengt
>
>
>>
>> Jon
>>
>>>
>>> Thanks,
>>> Bengt
>>>
>>>>
>>>> Jon
>>>>
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>>>
>>>>>>> This is the same as webrev.01 that both of you already looked 
>>>>>>> at. I've made two minor adjustments:
>>>>>>>
>>>>>>> - I removed the currentNamedthread() function in gcId.cpp that 
>>>>>>> Per pointed out in his review of webrev.01. Instead I use 
>>>>>>> Thread::current() directly.
>>>>>>>
>>>>>>> - I made GCIdMark a StackObj.
>>>>>>>
>>>>>>> Otherwise the patch is the same as in webrev.01.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Bengt
>>>>>>>
>>>>>>> On 2015-09-10 14:37, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi Jon,
>>>>>>>>
>>>>>>>> Thanks for looking at this!
>>>>>>>>
>>>>>>>> On 2015-09-10 01:00, Jon Masamitsu wrote:
>>>>>>>>> Bengt,
>>>>>>>>>
>>>>>>>>> When a CMS concurrent collection hands off to a STW foreground 
>>>>>>>>> collection,
>>>>>>>>> (done in acquire_control_and_collect()), I expected a new GCId 
>>>>>>>>> would be
>>>>>>>>> used.  Did I miss it?   That STW collection does not go 
>>>>>>>>> through do_collection().
>>>>>>>>
>>>>>>>> The call to acquire_control_and_collect() originates from 
>>>>>>>> GenCollectedHeap::collect_generation() and there is a new 
>>>>>>>> GCIdMark in there that will create a new GCId for the STW 
>>>>>>>> collection. That's the "extra" GCIdMark that Per asked about in 
>>>>>>>> his review.
>>>>>>>>
>>>>>>>> *But* I wanted to verify that it worked properly since you 
>>>>>>>> asked about it and I realized that there is another bug.
>>>>>>>>
>>>>>>>> The GCId that is set up for the concurrent cycle, the one set 
>>>>>>>> up in ConcurrentMarkSweepThread::run(), will is still active 
>>>>>>>> over the complete STW foreground collection. That's fine in my 
>>>>>>>> model since the new GCIdMark in 
>>>>>>>> GenCollectedHeap::collect_generation() will cache that GCId. 
>>>>>>>> But the ConcurrentMarkSweep thread is not completely idle even 
>>>>>>>> though control has been left to the foreground collection in 
>>>>>>>> acquire_control_and_collect(). So, there is some logging done 
>>>>>>>> (by fore example CMSPhaseAccounting) that is done at the same 
>>>>>>>> time as the foreground collection. This logging will now use 
>>>>>>>> the foreground GCId instead of the CMS GCId.
>>>>>>>>
>>>>>>>> I haven't had time to dig in to the details of that just yet. 
>>>>>>>> But this is an unintended change of the logging, so I would 
>>>>>>>> like to fix it. Hopefully I'll have an updated webrev tomorrow.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Jon
>>>>>>>>>
>>>>>>>>> On 9/9/2015 7:40 AM, Bengt Rutisson wrote:
>>>>>>>>>>
>>>>>>>>>> Hi again,
>>>>>>>>>>
>>>>>>>>>> I found an issue with how G1 young collections kick off 
>>>>>>>>>> concurrent marks. There is a short window where we might have 
>>>>>>>>>> two active GC IDs at the same time. I've fixed this and 
>>>>>>>>>> updated the webrevs. In case anyone had already started 
>>>>>>>>>> looking at the webrevs you need to do a refresh now. The fix 
>>>>>>>>>> is in G1CollectedHeap::do_collection_pause_at_safepoint().
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Bengt
>>>>>>>>>>
>>>>>>>>>> On 09/09/15 13:38, Bengt Rutisson wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Per,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for looking at this!
>>>>>>>>>>>
>>>>>>>>>>> On 2015-09-08 15:23, Per Liden wrote:
>>>>>>>>>>>> Hi Bengt,
>>>>>>>>>>>>
>>>>>>>>>>>> On 2015-09-08 13:35, Bengt Rutisson wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is mostly a GC related patch, but it adds a field to 
>>>>>>>>>>>>> the Thread
>>>>>>>>>>>>> class, so I'm sending this out on the broader hotspot-dev 
>>>>>>>>>>>>> list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could I have a couple of reviews for this patch?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01/
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8134953
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Looks good. I think this is a nice simplification, 
>>>>>>>>>>>> especially for G1's concurrent parts. Just two comments:
>>>>>>>>>>>>
>>>>>>>>>>>> genCollectedHeap.cpp:
>>>>>>>>>>>> ---------------------
>>>>>>>>>>>> - GenCollectedHeap::do_collection() has two GCIdMarks, in 
>>>>>>>>>>>> different scopes. Do we really want that second mark?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> We potentially do two GCs in 
>>>>>>>>>>> GenCollectedHeap::do_collection(). First a young GC and then 
>>>>>>>>>>> potentially a full GC. These two should have different GC 
>>>>>>>>>>> IDs. So, yes, we need two GCIdMarks in this method. The 
>>>>>>>>>>> scoping could be better, but I think that is a refactoring 
>>>>>>>>>>> that should be done separately from my current patch. I'll 
>>>>>>>>>>> talk to Jesper about it since he has been cleaning up this 
>>>>>>>>>>> code lately.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> gcId.cpp:
>>>>>>>>>>>> --------
>>>>>>>>>>>> - I think you might have left currentNamedthread() in 
>>>>>>>>>>>> there. You probably just want to use Thread::current() 
>>>>>>>>>>>> instead.
>>>>>>>>>>>
>>>>>>>>>>> Yes, good catch. Thanks.
>>>>>>>>>>>
>>>>>>>>>>> However, after thinking some more about the implications of 
>>>>>>>>>>> moving the GC ID out from the GCTracer I figured it would be 
>>>>>>>>>>> possible to just store the previous GC ID in the GCIdMark 
>>>>>>>>>>> and then restore it in the destructor of that class. That 
>>>>>>>>>>> way we don't have to store anything in the Thread class but 
>>>>>>>>>>> can still have both a concurrent GC ID and a STW GC ID 
>>>>>>>>>>> active at the same time. This also removes the need to copy 
>>>>>>>>>>> the GC ID to the worker tasks.
>>>>>>>>>>>
>>>>>>>>>>> Here's an updated webrev with a solution that does not add 
>>>>>>>>>>> anything to the Thread class:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> And here's a diff compared to the last version:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01-02.diff/
>>>>>>>>>>>
>>>>>>>>>>> Unfotunately the webrev tool had some bad luck when creating 
>>>>>>>>>>> the diff webrev so at least the g1CollectedHeap.cpp seem to 
>>>>>>>>>>> contain the complete changes rather than just the diff 
>>>>>>>>>>> compared to the 01 version. The rest of the diff looks 
>>>>>>>>>>> correct as far as I can tell.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bengt
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> cheers,
>>>>>>>>>>>> /Per
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Currently the GC ID, that is used for event tracing and 
>>>>>>>>>>>>> logging, is
>>>>>>>>>>>>> stored in the GCTracer object. That has caused some minor 
>>>>>>>>>>>>> problems since
>>>>>>>>>>>>> there are multiple GCTracers active at the same time. The 
>>>>>>>>>>>>> correct GC IDs
>>>>>>>>>>>>> need to be passed around in many places.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For some upcoming GC logging changes I would like to have 
>>>>>>>>>>>>> a more
>>>>>>>>>>>>> consistent way of finding the currently active GC ID. I've 
>>>>>>>>>>>>> played around
>>>>>>>>>>>>> with a couple of different solutions, but eventually I 
>>>>>>>>>>>>> found that the
>>>>>>>>>>>>> simplest solution is to store the current GC ID in the 
>>>>>>>>>>>>> thread. That way
>>>>>>>>>>>>> there is a single way to look it up in all places. It is 
>>>>>>>>>>>>> also fairly
>>>>>>>>>>>>> straight forward to set it up.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've reworked the GCId class a bit to support this and 
>>>>>>>>>>>>> I've introduced a
>>>>>>>>>>>>> GCIdMark class that is a scoped object that helps to set 
>>>>>>>>>>>>> up and tear
>>>>>>>>>>>>> down a current GC ID. By moving the GC ID out from the 
>>>>>>>>>>>>> GCTracer class I
>>>>>>>>>>>>> got rid of a few minor issues as well. One is that we no 
>>>>>>>>>>>>> longer need to
>>>>>>>>>>>>> keep track of the G1 "aborted concurrent GC ID". That was 
>>>>>>>>>>>>> necessary
>>>>>>>>>>>>> before since we did logging *after* we had told the 
>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>> GCTracer that the concurrent cycle was aborted and it had 
>>>>>>>>>>>>> thrown its GC
>>>>>>>>>>>>> ID away. Now the GC ID can stay in scope until we have 
>>>>>>>>>>>>> completed all
>>>>>>>>>>>>> logging.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also the HeapDumpBeforeFullGC and 
>>>>>>>>>>>>> PrintClassHistogramBeforeFullGC used
>>>>>>>>>>>>> to have to create a new GC ID since their logging was done 
>>>>>>>>>>>>> before we had
>>>>>>>>>>>>> the correct GCTracer set up. Now the GC ID can be 
>>>>>>>>>>>>> available early enough
>>>>>>>>>>>>> for this logging to be using the same (and correct) GC ID 
>>>>>>>>>>>>> as the rest of
>>>>>>>>>>>>> the GC. Same for HeapDumpAfterFullGC and 
>>>>>>>>>>>>> PrintClassHistogramAfterFullGC.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've added an uint to the Thread class to keep track of 
>>>>>>>>>>>>> the currently
>>>>>>>>>>>>> active GC ID. In the current code there are actually only 
>>>>>>>>>>>>> NamedThreads
>>>>>>>>>>>>> that need the GC ID. So, I'm not sure where the best place 
>>>>>>>>>>>>> is for this
>>>>>>>>>>>>> field is. But it seems like the Thread class contains most 
>>>>>>>>>>>>> of the data,
>>>>>>>>>>>>> even data that is only used by some subclasses, so I opted 
>>>>>>>>>>>>> for putting
>>>>>>>>>>>>> the field in Thread as opposed to in NamedThread. This 
>>>>>>>>>>>>> opens up for
>>>>>>>>>>>>> possible future extensions when Java threads may do part 
>>>>>>>>>>>>> of the GC work.
>>>>>>>>>>>>> However, I'm open to moving the field down to NamedThread 
>>>>>>>>>>>>> if that is
>>>>>>>>>>>>> seen as better.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the GCTracer class there were many asserts that were 
>>>>>>>>>>>>> checking that
>>>>>>>>>>>>> the GC ID was properly set up. Mostly these assert verify 
>>>>>>>>>>>>> that start/end
>>>>>>>>>>>>> is called correctly and that the other methods are called 
>>>>>>>>>>>>> inside of the
>>>>>>>>>>>>> start/end scope. Now that the GC ID is moved out of the 
>>>>>>>>>>>>> GCTracer class
>>>>>>>>>>>>> these asserts had to be dealt with. I figured there were 
>>>>>>>>>>>>> three ways to
>>>>>>>>>>>>> handle it; just remove them, replace them with check that 
>>>>>>>>>>>>> the GC ID from
>>>>>>>>>>>>> the current thread is correct, or implement a new status 
>>>>>>>>>>>>> field to keep
>>>>>>>>>>>>> track of GC start/end state. Personally I'm not sure these 
>>>>>>>>>>>>> asserts are
>>>>>>>>>>>>> valuable enough to spend time on, so I went for the first 
>>>>>>>>>>>>> approach:
>>>>>>>>>>>>> removing them. I don't think making them use the thread GC 
>>>>>>>>>>>>> ID will be
>>>>>>>>>>>>> appropriate. But if anyone feels strongly about these 
>>>>>>>>>>>>> asserts I can
>>>>>>>>>>>>> implement a separate start/end state.
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are a few "Tasks" in the GC code that are executed 
>>>>>>>>>>>>> by worker
>>>>>>>>>>>>> threads. To make the worker threads use the correct GC ID 
>>>>>>>>>>>>> I make sure
>>>>>>>>>>>>> that the Task constructors copy the GC ID from the 
>>>>>>>>>>>>> initiating thread
>>>>>>>>>>>>> into a local variable. When the task is executed in its 
>>>>>>>>>>>>> worker thread it
>>>>>>>>>>>>> sets up the GC ID based on the local variable.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The proposed change does not alter any logging (except for 
>>>>>>>>>>>>> the small bug
>>>>>>>>>>>>> fix for
>>>>>>>>>>>>> HeapDumpBefore(After)FullGC/PrintClassHistogramBefore(After)FullGC. 
>>>>>>>>>>>>> This
>>>>>>>>>>>>> means that no existing tests need to be updated. In 
>>>>>>>>>>>>> particular the
>>>>>>>>>>>>> test/gc/logging/TestGCId.java test passes even after these 
>>>>>>>>>>>>> changes.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> A big thanks to Per, who pre-reviewed these changes and 
>>>>>>>>>>>>> came with some
>>>>>>>>>>>>> really good feedback.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Bengt
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-dev mailing list