RFR: 8008916 G1: Evacuation failed tracing event

Jesper Wilhelmsson jesper.wilhelmsson at oracle.com
Wed Apr 10 20:37:29 UTC 2013


Hi,

An updated webrev based on the feedback from John an Bengt is available here:
http://cr.openjdk.java.net/~jwilhelm/8008916/webrev/

Thanks for reviewing!
/Jesper


John Cuthbertson skrev 9/4/13 2:50 AM:
> Hi Jesper,
>
> A quick question before going on with the actual review....
>
> Do you need the native thread ID in the event or just the worker IDs?
>
> As you have seen tying something to specific worker native IDs across multiple
> gang tasks is not easy. If you actually need the native thread IDs then I'm not
> sure what the best approach is. The most obvious is to have some sequential
> number in each worker that is fixed for the life of that worker (i.e. the life
> the JVM) and use that value to index into the evacuation_failed_info_array.
> Fortunately, I think we have that - it's the WorkerThread::_id field:
>
>>   for (uint worker = 0; worker < total_workers(); worker += 1) {
>>     GangWorker* new_worker = allocate_worker(worker);
>
> and:
>
>> GangWorker* WorkGang::allocate_worker(uint which) {
>>   GangWorker* new_worker = new GangWorker(this, which);
>>   return new_worker;
>> }
>> GangWorker::GangWorker(AbstractWorkGang* gang, uint id) {
>>   _gang = gang;
>>   set_id(id);
>>   set_name("Gang worker#%d (%s)", id, gang->name());
>> }
>
> So instead of associating the entries in the evacuation_failed_info_array with
> worker_id/queue_num, they are associated with
> Thread::current()->as_Worker_thread()->id(). Note if ParallelGCThreads=0 then
> the worker threads are not created but you are using ParallelGCThreads as the
> index for the VM thread - that's OK (but is a break with convention). You
> comment it (and why).
>
> Now on to the actual review:
>
> g1CollectedHeap.hpp:
>
> Line 888:
> I think you can make this an array of EvacFailureInfo instances rather than
> pointers. I think it's safer to do this. Once you establish the mapping between
> worker thread and EvacFailureInfo instance - you don't want it to change.
>
> Line 930:
> The change is unnecessary - but it's OK to leave it.
>
> Line 1817:
> Again since the mapping between worker thread and evac failed info instance is
> fixed, there's no need to embed a pointer to the object into the PSS but it's OK
> to leave it but I would initialize in the constructor, i.e. have something like
> the following in the PSS constructor:
>
> // The value of native_id is used to index into the evacuation_failed_info array
> Thread* me = Thread::current();
> uint native_id = 0;
> if (me->is_Worker_thread()) {
>    native_id = me->as_Worker_thread()->id();
> } else {
>    assert(me->is_VMThread(), "who else?");
>    native_id = ParallelGCThreads;
> }
> _evac_failed_info = _g1h->evacuation_failed_instance_for(native_id);
>
> Alternatively store the native worker ID as well as queue_num in the PSS - you
> can use that to access the instance.
>
> Line 1863:
> This line is unnecessary. You no longer need the routine.
>
> Line 1951:
> This line may be unnecessary. You could return the _native_id stored in the PSS
> and fetch the evac_failed_info object for that _native_id from the array. I'll
> leave it up to you.
>
> g1CollectedHeap.cpp
>
> Line 1970 - 1976
> I really don't like that you  allocate ParallelGCThreads+1 entries - I would
> prefer to use n_queues but I can see why you need to - if parallel reference
> processing is not enabled. The convention is that if ParallelGCThreads==0,
> n_queues will be 1 and we use queue[0] and worker_id 0. I want lots of comments
> here since you're breaking with convention.
>
> Fold the initialization of the evac_failed_info array into one of the other
> loops or fold all three loops into one:
>
>> 1954   for (int i = 0; i < n_queues; i++) {
>> 1955     iter_arr[i] = new HeapRegionRemSetIterator();
>> 1956   }
>
>> 1962   for (int i = 0; i < n_queues; i++) {
>> 1963     RefToScanQueue* q = new RefToScanQueue();
>> 1964     q->initialize();
>> 1965     _task_queues->register_queue(i, q);
>> 1966   }
>> 1971   for (uint n = 0; n < ParallelGCThreads; n++) {
>> 1972     _evacuation_failed_info_array[n] = NULL;
>> 1973   }
>
> I think you also want to allocate/reset/initialize the evac_failure_info
> objects. You could also bind to the worker thread:
>
> if (G1CollectedHeap::use_parallel_gc_threads()) {
>    assert(workers()->total_workers == n_queues(), "check this code otherwise");
>    for (uint k = 0; k < workers()->total_workers()l k += 1) {
>      GangWorker *worker = workers()->gang_worker(k);
>      uint id = worker->id();
>      _evac_failed_info[id].bind_to_thread(worker);
>    }
> }
>
> // The following is unconditional.
>   _evac_failed_info[ParallelGCThreads].bind_to_thread(VMThread::vm_thread());
>
> But binding as the first thing in G1ParTask::work() would be OK too.
>
> Line 3813:
> You don't need  to do this here. It can be done in G1ParTask::work() or during
> initialization. Again lot's of comments about why the evac_failure_info instance
> for the VMThread is not encoded at index 0.
>
> 4051-4056:
> Use the use_parallel_gc_threads idiom:
>
> if (G1CollectedHeap::use_parallel_gc_threads()) {
>    // Since there are multiple parallel tasks where an evacuation failure could
> occur
>    // and (in the future) each task may have been executed by a different number
> of workers,
>    // we need to check them all.
>    for (uint i = 0; i < workers()->total_workers(); i += 1) {
>      ....
>    }
> }
> // Now check the VM thread as it could have seen an evacuation failure during
> serial
> // reference processing
> ...
>
> Line 4480:
> This line may not be necessary.
>
> Lines 4507-4533:
> Routine may not be necessary. The loop in particular:
>
>> 4521     while (_evacuation_failed_info == NULL) {
>> 4522       EvacuationFailedInfo* current =
>> _g1h->_evacuation_failed_info_array[n];
>> 4523       assert(current != NULL, "All info objects should have been
>> allocated already");
>> 4524       if (current->thread() == Thread::current()->osthread()) {
>> 4525         _evacuation_failed_info = current;
>> 4526       } else {
>> 4527         n++;
>> 4528         assert(n <= ParallelGCThreads, "There shouldn't be more threads
>> than ParallelGThreads + the main GC thread");
>> 4529       }
>> 4530     }
>
> is a big concern.This is an O(n) lookup for each of n threads, for each of the
> distinct parallel tasks (3).
>
> The other files look OK.
>
> JohnC
>
> On 4/6/2013 11:54 PM, Jesper Wilhelmsson wrote:
>> Bengt, all,
>>
>> A new webrev is now available:
>>
>> http://cr.openjdk.java.net/~jwilhelm/8008916/webrev/
>>
>> The main difference is that now we have an array in G1CollectedHeap that keeps
>> track the EvacuationFailedInfo-objects between the different parallel parts of
>> G1 so that each thread only sends a single EvacuationFailed event in the end.
>>
>> To achieve this some logic was added to find the correct info object in the
>> array since the threads does not have the same queue number each time.
>>
>> Thanks,
>> /Jesper
>>
>>
>> Bengt Rutisson skrev 28/3/13 10:20 PM:
>>>
>>> Jesper,
>>>
>>> This looks much better!
>>>
>>> Have you tested it to see that you get a maximum of one event per GC thread and
>>> GC? It looks to me like you risk getting multiple events sent for each GC thread
>>> every GC.
>>>
>>> You send the events in the G1ParScanThreadState destructor. I am not that
>>> familiar with G1ParScanThreadState but it looks to me like we create new
>>> G1ParScanThreadState instances for the different phases of a GC. Here are the
>>> places where we create new instances of G1ParScanThreadState:
>>>
>>> G1CollectedHeap::process_discovered_references()
>>> G1ParPreserveCMReferentsTask::work()
>>> G1ParTask::work()
>>> G1STWRefProcTaskProxy::work()
>>>
>>> So, it looks to me like we risk getting 4 events sent per thread and GC.
>>>
>>>
>>> Some minor comments:
>>>
>>> In G1CollectedHeap::handle_evacuation_failure_par() you pass the ef_info on to
>>> G1CollectedHeap::handle_evacuation_failure_common(). But the ef_info is now
>>> thread local so you don't have to take the lock to use it. One way to simplify
>>> this would be to do "ef_info->register_copy_failure(old->size());" in
>>> G1CollectedHeap::handle_evacuation_failure_par() just before "if
>>> (_evac_failure_closure != cl)" instead. Then you don't have to pass it on to
>>> handle_evacuation_failure_common().
>>>
>>> (As a side note, I don't think setting _evacuation_failed to true has to be
>>> protected by the lock either. It is a shared variable but all writes to it from
>>> different threads will be to set it to true, so I think racing on it is benign.
>>> But you have it in the same place as before, which might be good.)
>>>
>>> In G1ParCopyClosure::copy_to_survivor_space() you are now picking out two thing
>>> from the _par_scan_state and passing them on to
>>> G1CollectedHeap::handle_evacuation_failure_par(). Have you considered passing a
>>> reference to _par_scan_state instead and let handle_evacuation_failure_par()
>>> extract the data it needs?
>>>
>>> If you move the implementation of the destructor of G1ParScanThreadState in to
>>> the cpp file you don't have to add the include for gcTrace.hpp to
>>> g1CollectedHeap.hpp.
>>>
>>> Why did you decide to remove set_evactuation_failed()?
>>>
>>> Finally, you have several quite unrelated spelling corrections in this change.
>>> This makes it a bit hard to review. I'm not sure I think it is worth the extra
>>> review time to get those fixed.
>>>
>>> Bengt
>>>
>>>
>>> On 3/28/13 7:50 PM, Jesper Wilhelmsson wrote:
>>>> Hi,
>>>>
>>>> A new version of the evacuation failed event for G1. Now, each thread sends an
>>>> event with the info it has collected during the collection in the same way as
>>>> the other collectors do with promotion failed events.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8008916/webrev.2/
>>>>
>>>> Thanks,
>>>> /Jesper
>>>
>



More information about the hotspot-gc-dev mailing list