RFR: JDK-8134953: Make the GC ID available in a central place
Jon Masamitsu
jon.masamitsu at oracle.com
Tue Sep 29 16:08:05 UTC 2015
On 9/29/15 8:57 AM, Bengt Rutisson wrote:
>
>
> On 2015-09-29 17:59, Jon Masamitsu wrote:
>>
>>
>> On 9/29/15 1:09 AM, Bengt Rutisson wrote:
>>>
>>> Hi Jon,
>>>
>>> On 2015-09-28 19:19, Jon Masamitsu wrote:
>>>>
>>>>
>>>> On 09/28/2015 06:21 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi Jon and Per,
>>>>>
>>>>> I've been trying to get the version proposed in webrev.02 to work
>>>>> but I ran into a show stopper for this approach. The G1 concurrent
>>>>> mark thread is at times executing (and logging) at the same time
>>>>> as a young or mixed GC is executing. There is no good way of
>>>>> handling this with a common place to store the GC ID. Instead I've
>>>>> decided to go back to storing the current GC ID in the thread.
>>>>> That way the G1 concurrent marking and the young/mixed GCs will
>>>>> have their own GC IDs.
>>>>>
>>>>> Here's an updated webrev:
>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.03/
>>>> Bengt,
>>>>
>>>> The _gc_id is needed in a JavaThread because a JavaThread
>>>> might do work for G1 (concurrent refinement for example) and
>>>> needs a GC ID?
>>>
>>> The Java threads currently don't log anything when they do GC work.
>>> But in the future it is likely that they will.
>>>
>>>> But a WatcherThread would never need one,
>>>> right?
>>>
>>> Right. Initially I only added the _gc_id field to NamedThread. But
>>> Per thought it would be better to have it in Thread. Both for making
>>> it possible for future improvements (such as logging GC work from
>>> JavaThreads) and because it seems like we normally keep storage in
>>> the Thread class even for things that are not used by all
>>> subclasses. For example TLABs and allocated_bytes() are only
>>> relevant for JavaThreads but have their storage in Thread.
>>
>> Yes, Thread could be improved but I'm not sure I can stand adding GC
>> data where
>> it's not needed. Unless GC logging from JavaThread is something
>> that's going to
>> happen soon, I'd rather add it when it's needed rather than right
>> now. I'd personally
>> rather duplicate the code in JavaThread if it is needed rather than
>> have WatcherThreads
>> have a _gc_id.
>
> I tend to agree. I'll discuss this with Per tomorrow. If he agrees
> that we can move the _gc_id down to NamedThread, are you ok with me
> pushing the fix?
Yes, I'm good with the fix then.
Would it be appropriate for me to file a CR to change NamedThread to
something
like GCThread? Or ???
Jon
>
> Thanks,
> Bengt
>
>>
>> Jon
>>
>>>
>>> Thanks,
>>> Bengt
>>>
>>>>
>>>> Jon
>>>>
>>>>>
>>>>> This is the same as webrev.01 that both of you already looked at.
>>>>> I've made two minor adjustments:
>>>>>
>>>>> - I removed the currentNamedthread() function in gcId.cpp that Per
>>>>> pointed out in his review of webrev.01. Instead I use
>>>>> Thread::current() directly.
>>>>>
>>>>> - I made GCIdMark a StackObj.
>>>>>
>>>>> Otherwise the patch is the same as in webrev.01.
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>> On 2015-09-10 14:37, Bengt Rutisson wrote:
>>>>>>
>>>>>> Hi Jon,
>>>>>>
>>>>>> Thanks for looking at this!
>>>>>>
>>>>>> On 2015-09-10 01:00, Jon Masamitsu wrote:
>>>>>>> Bengt,
>>>>>>>
>>>>>>> When a CMS concurrent collection hands off to a STW foreground
>>>>>>> collection,
>>>>>>> (done in acquire_control_and_collect()), I expected a new GCId
>>>>>>> would be
>>>>>>> used. Did I miss it? That STW collection does not go through
>>>>>>> do_collection().
>>>>>>
>>>>>> The call to acquire_control_and_collect() originates from
>>>>>> GenCollectedHeap::collect_generation() and there is a new
>>>>>> GCIdMark in there that will create a new GCId for the STW
>>>>>> collection. That's the "extra" GCIdMark that Per asked about in
>>>>>> his review.
>>>>>>
>>>>>> *But* I wanted to verify that it worked properly since you asked
>>>>>> about it and I realized that there is another bug.
>>>>>>
>>>>>> The GCId that is set up for the concurrent cycle, the one set up
>>>>>> in ConcurrentMarkSweepThread::run(), will is still active over
>>>>>> the complete STW foreground collection. That's fine in my model
>>>>>> since the new GCIdMark in GenCollectedHeap::collect_generation()
>>>>>> will cache that GCId. But the ConcurrentMarkSweep thread is not
>>>>>> completely idle even though control has been left to the
>>>>>> foreground collection in acquire_control_and_collect(). So, there
>>>>>> is some logging done (by fore example CMSPhaseAccounting) that is
>>>>>> done at the same time as the foreground collection. This logging
>>>>>> will now use the foreground GCId instead of the CMS GCId.
>>>>>>
>>>>>> I haven't had time to dig in to the details of that just yet. But
>>>>>> this is an unintended change of the logging, so I would like to
>>>>>> fix it. Hopefully I'll have an updated webrev tomorrow.
>>>>>>
>>>>>> Thanks,
>>>>>> Bengt
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Jon
>>>>>>>
>>>>>>> On 9/9/2015 7:40 AM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> I found an issue with how G1 young collections kick off
>>>>>>>> concurrent marks. There is a short window where we might have
>>>>>>>> two active GC IDs at the same time. I've fixed this and updated
>>>>>>>> the webrevs. In case anyone had already started looking at the
>>>>>>>> webrevs you need to do a refresh now. The fix is in
>>>>>>>> G1CollectedHeap::do_collection_pause_at_safepoint().
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>> On 09/09/15 13:38, Bengt Rutisson wrote:
>>>>>>>>>
>>>>>>>>> Hi Per,
>>>>>>>>>
>>>>>>>>> Thanks for looking at this!
>>>>>>>>>
>>>>>>>>> On 2015-09-08 15:23, Per Liden wrote:
>>>>>>>>>> Hi Bengt,
>>>>>>>>>>
>>>>>>>>>> On 2015-09-08 13:35, Bengt Rutisson wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>> This is mostly a GC related patch, but it adds a field to
>>>>>>>>>>> the Thread
>>>>>>>>>>> class, so I'm sending this out on the broader hotspot-dev list.
>>>>>>>>>>>
>>>>>>>>>>> Could I have a couple of reviews for this patch?
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01/
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8134953
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looks good. I think this is a nice simplification, especially
>>>>>>>>>> for G1's concurrent parts. Just two comments:
>>>>>>>>>>
>>>>>>>>>> genCollectedHeap.cpp:
>>>>>>>>>> ---------------------
>>>>>>>>>> - GenCollectedHeap::do_collection() has two GCIdMarks, in
>>>>>>>>>> different scopes. Do we really want that second mark?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We potentially do two GCs in
>>>>>>>>> GenCollectedHeap::do_collection(). First a young GC and then
>>>>>>>>> potentially a full GC. These two should have different GC IDs.
>>>>>>>>> So, yes, we need two GCIdMarks in this method. The scoping
>>>>>>>>> could be better, but I think that is a refactoring that should
>>>>>>>>> be done separately from my current patch. I'll talk to Jesper
>>>>>>>>> about it since he has been cleaning up this code lately.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> gcId.cpp:
>>>>>>>>>> --------
>>>>>>>>>> - I think you might have left currentNamedthread() in there.
>>>>>>>>>> You probably just want to use Thread::current() instead.
>>>>>>>>>
>>>>>>>>> Yes, good catch. Thanks.
>>>>>>>>>
>>>>>>>>> However, after thinking some more about the implications of
>>>>>>>>> moving the GC ID out from the GCTracer I figured it would be
>>>>>>>>> possible to just store the previous GC ID in the GCIdMark and
>>>>>>>>> then restore it in the destructor of that class. That way we
>>>>>>>>> don't have to store anything in the Thread class but can still
>>>>>>>>> have both a concurrent GC ID and a STW GC ID active at the
>>>>>>>>> same time. This also removes the need to copy the GC ID to the
>>>>>>>>> worker tasks.
>>>>>>>>>
>>>>>>>>> Here's an updated webrev with a solution that does not add
>>>>>>>>> anything to the Thread class:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.02/
>>>>>>>>>
>>>>>>>>> And here's a diff compared to the last version:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8134953/webrev.01-02.diff/
>>>>>>>>>
>>>>>>>>> Unfotunately the webrev tool had some bad luck when creating
>>>>>>>>> the diff webrev so at least the g1CollectedHeap.cpp seem to
>>>>>>>>> contain the complete changes rather than just the diff
>>>>>>>>> compared to the 01 version. The rest of the diff looks correct
>>>>>>>>> as far as I can tell.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Bengt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> cheers,
>>>>>>>>>> /Per
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Currently the GC ID, that is used for event tracing and
>>>>>>>>>>> logging, is
>>>>>>>>>>> stored in the GCTracer object. That has caused some minor
>>>>>>>>>>> problems since
>>>>>>>>>>> there are multiple GCTracers active at the same time. The
>>>>>>>>>>> correct GC IDs
>>>>>>>>>>> need to be passed around in many places.
>>>>>>>>>>>
>>>>>>>>>>> For some upcoming GC logging changes I would like to have a
>>>>>>>>>>> more
>>>>>>>>>>> consistent way of finding the currently active GC ID. I've
>>>>>>>>>>> played around
>>>>>>>>>>> with a couple of different solutions, but eventually I found
>>>>>>>>>>> that the
>>>>>>>>>>> simplest solution is to store the current GC ID in the
>>>>>>>>>>> thread. That way
>>>>>>>>>>> there is a single way to look it up in all places. It is
>>>>>>>>>>> also fairly
>>>>>>>>>>> straight forward to set it up.
>>>>>>>>>>>
>>>>>>>>>>> I've reworked the GCId class a bit to support this and I've
>>>>>>>>>>> introduced a
>>>>>>>>>>> GCIdMark class that is a scoped object that helps to set up
>>>>>>>>>>> and tear
>>>>>>>>>>> down a current GC ID. By moving the GC ID out from the
>>>>>>>>>>> GCTracer class I
>>>>>>>>>>> got rid of a few minor issues as well. One is that we no
>>>>>>>>>>> longer need to
>>>>>>>>>>> keep track of the G1 "aborted concurrent GC ID". That was
>>>>>>>>>>> necessary
>>>>>>>>>>> before since we did logging *after* we had told the
>>>>>>>>>>> corresponding
>>>>>>>>>>> GCTracer that the concurrent cycle was aborted and it had
>>>>>>>>>>> thrown its GC
>>>>>>>>>>> ID away. Now the GC ID can stay in scope until we have
>>>>>>>>>>> completed all
>>>>>>>>>>> logging.
>>>>>>>>>>>
>>>>>>>>>>> Also the HeapDumpBeforeFullGC and
>>>>>>>>>>> PrintClassHistogramBeforeFullGC used
>>>>>>>>>>> to have to create a new GC ID since their logging was done
>>>>>>>>>>> before we had
>>>>>>>>>>> the correct GCTracer set up. Now the GC ID can be available
>>>>>>>>>>> early enough
>>>>>>>>>>> for this logging to be using the same (and correct) GC ID as
>>>>>>>>>>> the rest of
>>>>>>>>>>> the GC. Same for HeapDumpAfterFullGC and
>>>>>>>>>>> PrintClassHistogramAfterFullGC.
>>>>>>>>>>>
>>>>>>>>>>> I've added an uint to the Thread class to keep track of the
>>>>>>>>>>> currently
>>>>>>>>>>> active GC ID. In the current code there are actually only
>>>>>>>>>>> NamedThreads
>>>>>>>>>>> that need the GC ID. So, I'm not sure where the best place
>>>>>>>>>>> is for this
>>>>>>>>>>> field is. But it seems like the Thread class contains most
>>>>>>>>>>> of the data,
>>>>>>>>>>> even data that is only used by some subclasses, so I opted
>>>>>>>>>>> for putting
>>>>>>>>>>> the field in Thread as opposed to in NamedThread. This opens
>>>>>>>>>>> up for
>>>>>>>>>>> possible future extensions when Java threads may do part of
>>>>>>>>>>> the GC work.
>>>>>>>>>>> However, I'm open to moving the field down to NamedThread if
>>>>>>>>>>> that is
>>>>>>>>>>> seen as better.
>>>>>>>>>>>
>>>>>>>>>>> In the GCTracer class there were many asserts that were
>>>>>>>>>>> checking that
>>>>>>>>>>> the GC ID was properly set up. Mostly these assert verify
>>>>>>>>>>> that start/end
>>>>>>>>>>> is called correctly and that the other methods are called
>>>>>>>>>>> inside of the
>>>>>>>>>>> start/end scope. Now that the GC ID is moved out of the
>>>>>>>>>>> GCTracer class
>>>>>>>>>>> these asserts had to be dealt with. I figured there were
>>>>>>>>>>> three ways to
>>>>>>>>>>> handle it; just remove them, replace them with check that
>>>>>>>>>>> the GC ID from
>>>>>>>>>>> the current thread is correct, or implement a new status
>>>>>>>>>>> field to keep
>>>>>>>>>>> track of GC start/end state. Personally I'm not sure these
>>>>>>>>>>> asserts are
>>>>>>>>>>> valuable enough to spend time on, so I went for the first
>>>>>>>>>>> approach:
>>>>>>>>>>> removing them. I don't think making them use the thread GC
>>>>>>>>>>> ID will be
>>>>>>>>>>> appropriate. But if anyone feels strongly about these
>>>>>>>>>>> asserts I can
>>>>>>>>>>> implement a separate start/end state.
>>>>>>>>>>>
>>>>>>>>>>> There are a few "Tasks" in the GC code that are executed by
>>>>>>>>>>> worker
>>>>>>>>>>> threads. To make the worker threads use the correct GC ID I
>>>>>>>>>>> make sure
>>>>>>>>>>> that the Task constructors copy the GC ID from the
>>>>>>>>>>> initiating thread
>>>>>>>>>>> into a local variable. When the task is executed in its
>>>>>>>>>>> worker thread it
>>>>>>>>>>> sets up the GC ID based on the local variable.
>>>>>>>>>>>
>>>>>>>>>>> The proposed change does not alter any logging (except for
>>>>>>>>>>> the small bug
>>>>>>>>>>> fix for
>>>>>>>>>>> HeapDumpBefore(After)FullGC/PrintClassHistogramBefore(After)FullGC.
>>>>>>>>>>> This
>>>>>>>>>>> means that no existing tests need to be updated. In
>>>>>>>>>>> particular the
>>>>>>>>>>> test/gc/logging/TestGCId.java test passes even after these
>>>>>>>>>>> changes.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> A big thanks to Per, who pre-reviewed these changes and came
>>>>>>>>>>> with some
>>>>>>>>>>> really good feedback.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bengt
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list