RFR: JDK-8134953: Make the GC ID available in a central place
Bengt Rutisson
bengt.rutisson at oracle.com
Thu Oct 1 05:47:55 UTC 2015
On 30/09/15 18:52, Jon Masamitsu wrote:
>
>
> On 9/30/2015 12:20 AM, Bengt Rutisson wrote:
>>
>>
>> On 2015-09-29 18:08, Jon Masamitsu wrote:
>>>
>>>
>>> On 9/29/15 8:57 AM, Bengt Rutisson wrote:
>>>>
>>>>
>>>> On 2015-09-29 17:59, Jon Masamitsu wrote:
>>>>>
>>>>>
>>>>> On 9/29/15 1:09 AM, Bengt Rutisson wrote:
>>>>>>
>>>>>> Hi Jon,
>>>>>>
>>>>>> On 2015-09-28 19:19, Jon Masamitsu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 09/28/2015 06:21 AM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi Jon and Per,
>>>>>>>>
>>>>>>>> I've been trying to get the version proposed in webrev.02 to
>>>>>>>> work but I ran into a show stopper for this approach. The G1
>>>>>>>> concurrent mark thread is at times executing (and logging) at
>>>>>>>> the same time as a young or mixed GC is executing. There is no
>>>>>>>> good way of handling this with a common place to store the GC
>>>>>>>> ID. Instead I've decided to go back to storing the current GC
>>>>>>>> ID in the thread. That way the G1 concurrent marking and the
>>>>>>>> young/mixed GCs will have their own GC IDs.
>>>>>>>>
>>>>>>>> Here's an updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.03/
>>>>>>> Bengt,
>>>>>>>
>>>>>>> The _gc_id is needed in a JavaThread because a JavaThread
>>>>>>> might do work for G1 (concurrent refinement for example) and
>>>>>>> needs a GC ID?
>>>>>>
>>>>>> The Java threads currently don't log anything when they do GC
>>>>>> work. But in the future it is likely that they will.
>>>>>>
>>>>>>> But a WatcherThread would never need one,
>>>>>>> right?
>>>>>>
>>>>>> Right. Initially I only added the _gc_id field to NamedThread.
>>>>>> But Per thought it would be better to have it in Thread. Both for
>>>>>> making it possible for future improvements (such as logging GC
>>>>>> work from JavaThreads) and because it seems like we normally keep
>>>>>> storage in the Thread class even for things that are not used by
>>>>>> all subclasses. For example TLABs and allocated_bytes() are only
>>>>>> relevant for JavaThreads but have their storage in Thread.
>>>>>
>>>>> Yes, Thread could be improved but I'm not sure I can stand adding
>>>>> GC data where
>>>>> it's not needed. Unless GC logging from JavaThread is something
>>>>> that's going to
>>>>> happen soon, I'd rather add it when it's needed rather than right
>>>>> now. I'd personally
>>>>> rather duplicate the code in JavaThread if it is needed rather
>>>>> than have WatcherThreads
>>>>> have a _gc_id.
>>>>
>>>> I tend to agree. I'll discuss this with Per tomorrow. If he agrees
>>>> that we can move the _gc_id down to NamedThread, are you ok with me
>>>> pushing the fix?
>>>
>>> Yes, I'm good with the fix then.
>>
>> Thanks Jon!
>>
>> I talked to Per and moved the _gc_id field down to NamedThread.
>> Pushing it now.
>>
>>>
>>> Would it be appropriate for me to file a CR to change NamedThread to
>>> something
>>> like GCThread? Or ???
>>
>>
>> I don't think GCThread is the right name since VMThread is a
>> NamedThread and it can schedule tasks that are not GC related. I
>> agree that NamedThread is not a great name. But I actually think that
>> the naming in the Thread hierarchy is just part of the problem. The
>> real problem is that the division of responsibility is not really
>> correctly partitioned among the classes.
>>
>> If we should file a RFE I think it should be to clean up the Thread
>> hierarchy, including finding good names for the classes.
>
> That's fine. I created
>
> https://bugs.openjdk.java.net/browse/JDK-8138655
Thanks, Jon. Good description in the bug.
Bengt
>
> Jon
>
>>
>> Bengt
>>
>>
>>>
>>> Jon
>>>
>>>>
>>>> Thanks,
>>>> Bengt
>>>>
>>>>>
>>>>> Jon
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Bengt
>>>>>>
>>>>>>>
>>>>>>> Jon
>>>>>>>
>>>>>>>>
>>>>>>>> This is the same as webrev.01 that both of you already looked
>>>>>>>> at. I've made two minor adjustments:
>>>>>>>>
>>>>>>>> - I removed the currentNamedthread() function in gcId.cpp that
>>>>>>>> Per pointed out in his review of webrev.01. Instead I use
>>>>>>>> Thread::current() directly.
>>>>>>>>
>>>>>>>> - I made GCIdMark a StackObj.
>>>>>>>>
>>>>>>>> Otherwise the patch is the same as in webrev.01.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>> On 2015-09-10 14:37, Bengt Rutisson wrote:
>>>>>>>>>
>>>>>>>>> Hi Jon,
>>>>>>>>>
>>>>>>>>> Thanks for looking at this!
>>>>>>>>>
>>>>>>>>> On 2015-09-10 01:00, Jon Masamitsu wrote:
>>>>>>>>>> Bengt,
>>>>>>>>>>
>>>>>>>>>> When a CMS concurrent collection hands off to a STW
>>>>>>>>>> foreground collection,
>>>>>>>>>> (done in acquire_control_and_collect()), I expected a new
>>>>>>>>>> GCId would be
>>>>>>>>>> used. Did I miss it? That STW collection does not go
>>>>>>>>>> through do_collection().
>>>>>>>>>
>>>>>>>>> The call to acquire_control_and_collect() originates from
>>>>>>>>> GenCollectedHeap::collect_generation() and there is a new
>>>>>>>>> GCIdMark in there that will create a new GCId for the STW
>>>>>>>>> collection. That's the "extra" GCIdMark that Per asked about
>>>>>>>>> in his review.
>>>>>>>>>
>>>>>>>>> *But* I wanted to verify that it worked properly since you
>>>>>>>>> asked about it and I realized that there is another bug.
>>>>>>>>>
>>>>>>>>> The GCId that is set up for the concurrent cycle, the one set
>>>>>>>>> up in ConcurrentMarkSweepThread::run(), will is still active
>>>>>>>>> over the complete STW foreground collection. That's fine in my
>>>>>>>>> model since the new GCIdMark in
>>>>>>>>> GenCollectedHeap::collect_generation() will cache that GCId.
>>>>>>>>> But the ConcurrentMarkSweep thread is not completely idle even
>>>>>>>>> though control has been left to the foreground collection in
>>>>>>>>> acquire_control_and_collect(). So, there is some logging done
>>>>>>>>> (by fore example CMSPhaseAccounting) that is done at the same
>>>>>>>>> time as the foreground collection. This logging will now use
>>>>>>>>> the foreground GCId instead of the CMS GCId.
>>>>>>>>>
>>>>>>>>> I haven't had time to dig in to the details of that just yet.
>>>>>>>>> But this is an unintended change of the logging, so I would
>>>>>>>>> like to fix it. Hopefully I'll have an updated webrev tomorrow.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Bengt
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jon
>>>>>>>>>>
>>>>>>>>>> On 9/9/2015 7:40 AM, Bengt Rutisson wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi again,
>>>>>>>>>>>
>>>>>>>>>>> I found an issue with how G1 young collections kick off
>>>>>>>>>>> concurrent marks. There is a short window where we might
>>>>>>>>>>> have two active GC IDs at the same time. I've fixed this and
>>>>>>>>>>> updated the webrevs. In case anyone had already started
>>>>>>>>>>> looking at the webrevs you need to do a refresh now. The fix
>>>>>>>>>>> is in G1CollectedHeap::do_collection_pause_at_safepoint().
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bengt
>>>>>>>>>>>
>>>>>>>>>>> On 09/09/15 13:38, Bengt Rutisson wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Per,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for looking at this!
>>>>>>>>>>>>
>>>>>>>>>>>> On 2015-09-08 15:23, Per Liden wrote:
>>>>>>>>>>>>> Hi Bengt,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2015-09-08 13:35, Bengt Rutisson wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is mostly a GC related patch, but it adds a field to
>>>>>>>>>>>>>> the Thread
>>>>>>>>>>>>>> class, so I'm sending this out on the broader hotspot-dev
>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could I have a couple of reviews for this patch?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01/
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8134953
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looks good. I think this is a nice simplification,
>>>>>>>>>>>>> especially for G1's concurrent parts. Just two comments:
>>>>>>>>>>>>>
>>>>>>>>>>>>> genCollectedHeap.cpp:
>>>>>>>>>>>>> ---------------------
>>>>>>>>>>>>> - GenCollectedHeap::do_collection() has two GCIdMarks, in
>>>>>>>>>>>>> different scopes. Do we really want that second mark?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> We potentially do two GCs in
>>>>>>>>>>>> GenCollectedHeap::do_collection(). First a young GC and
>>>>>>>>>>>> then potentially a full GC. These two should have different
>>>>>>>>>>>> GC IDs. So, yes, we need two GCIdMarks in this method. The
>>>>>>>>>>>> scoping could be better, but I think that is a refactoring
>>>>>>>>>>>> that should be done separately from my current patch. I'll
>>>>>>>>>>>> talk to Jesper about it since he has been cleaning up this
>>>>>>>>>>>> code lately.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> gcId.cpp:
>>>>>>>>>>>>> --------
>>>>>>>>>>>>> - I think you might have left currentNamedthread() in
>>>>>>>>>>>>> there. You probably just want to use Thread::current()
>>>>>>>>>>>>> instead.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, good catch. Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> However, after thinking some more about the implications of
>>>>>>>>>>>> moving the GC ID out from the GCTracer I figured it would
>>>>>>>>>>>> be possible to just store the previous GC ID in the
>>>>>>>>>>>> GCIdMark and then restore it in the destructor of that
>>>>>>>>>>>> class. That way we don't have to store anything in the
>>>>>>>>>>>> Thread class but can still have both a concurrent GC ID and
>>>>>>>>>>>> a STW GC ID active at the same time. This also removes the
>>>>>>>>>>>> need to copy the GC ID to the worker tasks.
>>>>>>>>>>>>
>>>>>>>>>>>> Here's an updated webrev with a solution that does not add
>>>>>>>>>>>> anything to the Thread class:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> And here's a diff compared to the last version:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01-02.diff/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Unfotunately the webrev tool had some bad luck when
>>>>>>>>>>>> creating the diff webrev so at least the
>>>>>>>>>>>> g1CollectedHeap.cpp seem to contain the complete changes
>>>>>>>>>>>> rather than just the diff compared to the 01 version. The
>>>>>>>>>>>> rest of the diff looks correct as far as I can tell.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Bengt
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> cheers,
>>>>>>>>>>>>> /Per
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently the GC ID, that is used for event tracing and
>>>>>>>>>>>>>> logging, is
>>>>>>>>>>>>>> stored in the GCTracer object. That has caused some minor
>>>>>>>>>>>>>> problems since
>>>>>>>>>>>>>> there are multiple GCTracers active at the same time. The
>>>>>>>>>>>>>> correct GC IDs
>>>>>>>>>>>>>> need to be passed around in many places.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For some upcoming GC logging changes I would like to have
>>>>>>>>>>>>>> a more
>>>>>>>>>>>>>> consistent way of finding the currently active GC ID.
>>>>>>>>>>>>>> I've played around
>>>>>>>>>>>>>> with a couple of different solutions, but eventually I
>>>>>>>>>>>>>> found that the
>>>>>>>>>>>>>> simplest solution is to store the current GC ID in the
>>>>>>>>>>>>>> thread. That way
>>>>>>>>>>>>>> there is a single way to look it up in all places. It is
>>>>>>>>>>>>>> also fairly
>>>>>>>>>>>>>> straight forward to set it up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've reworked the GCId class a bit to support this and
>>>>>>>>>>>>>> I've introduced a
>>>>>>>>>>>>>> GCIdMark class that is a scoped object that helps to set
>>>>>>>>>>>>>> up and tear
>>>>>>>>>>>>>> down a current GC ID. By moving the GC ID out from the
>>>>>>>>>>>>>> GCTracer class I
>>>>>>>>>>>>>> got rid of a few minor issues as well. One is that we no
>>>>>>>>>>>>>> longer need to
>>>>>>>>>>>>>> keep track of the G1 "aborted concurrent GC ID". That was
>>>>>>>>>>>>>> necessary
>>>>>>>>>>>>>> before since we did logging *after* we had told the
>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>> GCTracer that the concurrent cycle was aborted and it had
>>>>>>>>>>>>>> thrown its GC
>>>>>>>>>>>>>> ID away. Now the GC ID can stay in scope until we have
>>>>>>>>>>>>>> completed all
>>>>>>>>>>>>>> logging.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also the HeapDumpBeforeFullGC and
>>>>>>>>>>>>>> PrintClassHistogramBeforeFullGC used
>>>>>>>>>>>>>> to have to create a new GC ID since their logging was
>>>>>>>>>>>>>> done before we had
>>>>>>>>>>>>>> the correct GCTracer set up. Now the GC ID can be
>>>>>>>>>>>>>> available early enough
>>>>>>>>>>>>>> for this logging to be using the same (and correct) GC ID
>>>>>>>>>>>>>> as the rest of
>>>>>>>>>>>>>> the GC. Same for HeapDumpAfterFullGC and
>>>>>>>>>>>>>> PrintClassHistogramAfterFullGC.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've added an uint to the Thread class to keep track of
>>>>>>>>>>>>>> the currently
>>>>>>>>>>>>>> active GC ID. In the current code there are actually only
>>>>>>>>>>>>>> NamedThreads
>>>>>>>>>>>>>> that need the GC ID. So, I'm not sure where the best
>>>>>>>>>>>>>> place is for this
>>>>>>>>>>>>>> field is. But it seems like the Thread class contains
>>>>>>>>>>>>>> most of the data,
>>>>>>>>>>>>>> even data that is only used by some subclasses, so I
>>>>>>>>>>>>>> opted for putting
>>>>>>>>>>>>>> the field in Thread as opposed to in NamedThread. This
>>>>>>>>>>>>>> opens up for
>>>>>>>>>>>>>> possible future extensions when Java threads may do part
>>>>>>>>>>>>>> of the GC work.
>>>>>>>>>>>>>> However, I'm open to moving the field down to NamedThread
>>>>>>>>>>>>>> if that is
>>>>>>>>>>>>>> seen as better.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the GCTracer class there were many asserts that were
>>>>>>>>>>>>>> checking that
>>>>>>>>>>>>>> the GC ID was properly set up. Mostly these assert verify
>>>>>>>>>>>>>> that start/end
>>>>>>>>>>>>>> is called correctly and that the other methods are called
>>>>>>>>>>>>>> inside of the
>>>>>>>>>>>>>> start/end scope. Now that the GC ID is moved out of the
>>>>>>>>>>>>>> GCTracer class
>>>>>>>>>>>>>> these asserts had to be dealt with. I figured there were
>>>>>>>>>>>>>> three ways to
>>>>>>>>>>>>>> handle it; just remove them, replace them with check that
>>>>>>>>>>>>>> the GC ID from
>>>>>>>>>>>>>> the current thread is correct, or implement a new status
>>>>>>>>>>>>>> field to keep
>>>>>>>>>>>>>> track of GC start/end state. Personally I'm not sure
>>>>>>>>>>>>>> these asserts are
>>>>>>>>>>>>>> valuable enough to spend time on, so I went for the first
>>>>>>>>>>>>>> approach:
>>>>>>>>>>>>>> removing them. I don't think making them use the thread
>>>>>>>>>>>>>> GC ID will be
>>>>>>>>>>>>>> appropriate. But if anyone feels strongly about these
>>>>>>>>>>>>>> asserts I can
>>>>>>>>>>>>>> implement a separate start/end state.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are a few "Tasks" in the GC code that are executed
>>>>>>>>>>>>>> by worker
>>>>>>>>>>>>>> threads. To make the worker threads use the correct GC ID
>>>>>>>>>>>>>> I make sure
>>>>>>>>>>>>>> that the Task constructors copy the GC ID from the
>>>>>>>>>>>>>> initiating thread
>>>>>>>>>>>>>> into a local variable. When the task is executed in its
>>>>>>>>>>>>>> worker thread it
>>>>>>>>>>>>>> sets up the GC ID based on the local variable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The proposed change does not alter any logging (except
>>>>>>>>>>>>>> for the small bug
>>>>>>>>>>>>>> fix for
>>>>>>>>>>>>>> HeapDumpBefore(After)FullGC/PrintClassHistogramBefore(After)FullGC.
>>>>>>>>>>>>>> This
>>>>>>>>>>>>>> means that no existing tests need to be updated. In
>>>>>>>>>>>>>> particular the
>>>>>>>>>>>>>> test/gc/logging/TestGCId.java test passes even after
>>>>>>>>>>>>>> these changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A big thanks to Per, who pre-reviewed these changes and
>>>>>>>>>>>>>> came with some
>>>>>>>>>>>>>> really good feedback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Bengt
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list