[Bug] Load imbalance of Tasks among GCTaskThreads during the Young GC

Tony S suokunstar at gmail.com
Mon Nov 21 17:15:47 UTC 2016


Hi Thomas,

Thank you for your reply.

On Mon, Nov 21, 2016 at 3:22 AM, Thomas Schatzl
<thomas.schatzl at oracle.com> wrote:
> Hi,
>
> On Fri, 2016-11-18 at 22:54 -0600, Tony S wrote:
>> Hi,
>>
>> [Description]
>> src/share/vm/gc_implementation/parallelScavenge/psScavenge.cpp
>>
>> PSPromotionManager enqueues various tasks (ThreadRootsTask /
>> StealTask / ScavengeRootsTask / OldToYoungRootsTask) into
>> GCTaskQueue.
>> GCTaskThread asked GCTaskManager to call get_task() to obtain tasks
>> from the GCTaskQueue. However, I found the tasks are not distributed
>> evenly among the GCTaskThreads. Some GCTaskThreads get more tasks
>> than others.
>
> This imbalance could have several reasons:
>
> - startup of some tasks is delayed, e.g. due to scheduling threads(s) X
> gets activated a significant amount of time later than other threads.
>

I am not sure. I tried to bind all the GCTaskThread to different cores
to avoid these threads interfered with each other (like below).

GCTaskManager::initialize() {
...
for (uint t = 0; t < workers(); t += 1) {
set_thread(t, GCTaskThread::create(this, t, processor_assignment[t]));
}
...
}

However, the issue still exists there.


> - the work unit per task is so small, completing some of these tasks is
> so quick that a thread immediately competes for more work with the
> other tasks. Due to scheduling that task might somehow get favorable
> treatment (e.g. OldToYoungRootsTask) by the scheduler (like: it is
> already running and has not used up its time slice).
>

The time execution time for some task is small. That is right. For
example, the below tasks only need hundreds to thousands of
nanoseconds.

 OldToYoungRootsTask time:3961  GCTaskThread ID:13
 ScavengeRootsTask time:1492  GCTaskThread ID:14
 ThreadRootsTask time:995  GCTaskThread ID:14

I do not think it is due to the scheduler. I changed the linux
scheduler to FIFO instead of CFS. Here are the results. Under FIFO,
the OldToYoungRootsTasks still are not distributed evenly. See,
GCTaskThread 0,3,6 execute them.

  OldToYoungRootsTask time:26748 GCTaskThread ID:6
  OldToYoungRootsTask time:9327 GCTaskThread ID:3
  OldToYoungRootsTask time:20358 GCTaskThread ID:0
  OldToYoungRootsTask time:3413 GCTaskThread ID:6
  OldToYoungRootsTask time:609 GCTaskThread ID:0
  OldToYoungRootsTask time:3354 GCTaskThread ID:3
  OldToYoungRootsTask time:947 GCTaskThread ID:6
  OldToYoungRootsTask time:2007 GCTaskThread ID:0
  OldToYoungRootsTask time:861 GCTaskThread ID:6
  OldToYoungRootsTask time:3364 GCTaskThread ID:3
  OldToYoungRootsTask time:1493 GCTaskThread ID:0
  OldToYoungRootsTask time:740 GCTaskThread ID:6
  OldToYoungRootsTask time:349 GCTaskThread ID:3
  OldToYoungRootsTask time:73 GCTaskThread ID:0


> - there is no 1:1 correspondence between threads and tasks (e.g.
> ScavengeRootsTask), as the work split is done across tasks and not
> threads for these. So they can't be evenly distributed (with larger
> thread counts).

Yes. For ScavengeRootsTask, there is no 1:1 correspondence. But how
about OldToYoungRootsTask and ThreadRootsTask? Their number are either
equal or greater than the number of threads, still not distributed
evenly.

Another potential reason might be the lock on the GCTaskQueue, but I
am not sure right now.

>
>>  Here are my trace log:
>>
>>
>> [Tracing Log]
>>
>>   OldToYoungRootsTask   GCTaskThread ID:4
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:10
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:10
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   OldToYoungRootsTask   GCTaskThread ID:10
>>   OldToYoungRootsTask   GCTaskThread ID:4
>>   OldToYoungRootsTask   GCTaskThread ID:10
>>   ScavengeRootsTask      GCTaskThread ID:4
>>   ScavengeRootsTask      GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:10
>>   OldToYoungRootsTask   GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:10
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ThreadRootsTask           GCTaskThread ID:3
>>   ScavengeRootsTask      GCTaskThread ID:3
>>   ScavengeRootsTask       GCTaskThread ID:3
>>   ScavengeRootsTask       GCTaskThread ID:3
>>   ThreadRootsTask            GCTaskThread ID:10
>>   ScavengeRootsTask        GCTaskThread ID:3
>>   ScavengeRootsTask        GCTaskThread ID:3
>>   ScavengeRootsTask        GCTaskThread ID:3
>>   ThreadRootsTask            GCTaskThread ID:4
>>   ScavengeRootsTask        GCTaskThread ID:10
>>   StealTask                         GCTaskThread ID:6
>>   StealTask                         GCTaskThread ID:9
>>   StealTask                         GCTaskThread ID:3
>>   StealTask                         GCTaskThread ID:14
>>   StealTask                         GCTaskThread ID:7
>>   StealTask                         GCTaskThread ID:5
>>   StealTask                         GCTaskThread ID:8
>>   StealTask                         GCTaskThread ID:0
>>   StealTask                         GCTaskThread ID:12
>>   StealTask                         GCTaskThread ID:13
>>   StealTask                         GCTaskThread ID:2
>>   StealTask                         GCTaskThread ID:4
>>   StealTask                         GCTaskThread ID:11
>>   StealTask                         GCTaskThread ID:1
>>   StealTask                         GCTaskThread ID:10
>>
>>
>> I use dacapo benchmark, running lusearch on JDK 1.8 on ubuntu 16.10.
>> I have totally 15 GCTaskThreads on the above test. We can see that
>
> Afair (many of) the DaCapo benchmarks are a bit on the small side for
> modern machines. Anyway, due to various reasons you might even get
> better gc pauses with less threads.
>
>> only StealTasks are evenly executed on different GCTaskThread. The
>> other tasks are executed on few GCTaskThreads, not evenly
>> distributed.
>>
>> When I increase the number of mutator threads and thus have more
>> tasks (ThreadRootsTask / StealTask / ScavengeRootsTask /
>> OldToYoungRootsTask), however, those tasks are still not distributed
>> evenly among the GCTaskThreads, which causes the load imbalance and
>> thus prolongs the Young GC time.
>
> The trace above does not include start/end times of the threads and
> tasks so it is hard to diagnose any actual problem, its source, or if
> there is actually one. I.e. StealTasks of different threads might take
> a different amount of time, so they may actually finish at
> approximately the same time anyway? Did you measure how long the
> threads are actually waiting for work (or completion of other threads)?
>

I add the start and end timestamp in the JDK to measure the execution
time of different task, here are the results(using ubuntu 16.10,
scheduler: default CFS, machine: dell T430, benchmark dapaco
lusearch):

  OldToYoungRootsTask time:19213  GCTaskThread ID:10
  OldToYoungRootsTask time:5680  GCTaskThread ID:13
  OldToYoungRootsTask time:13489  GCTaskThread ID:14
  OldToYoungRootsTask time:944  GCTaskThread ID:14
  OldToYoungRootsTask time:821  GCTaskThread ID:14
  OldToYoungRootsTask time:6227  GCTaskThread ID:13
  OldToYoungRootsTask time:768  GCTaskThread ID:14
  OldToYoungRootsTask time:2729  GCTaskThread ID:13
  OldToYoungRootsTask time:745  GCTaskThread ID:13
  OldToYoungRootsTask time:4095  GCTaskThread ID:14
  OldToYoungRootsTask time:774  GCTaskThread ID:13
  OldToYoungRootsTask time:828  GCTaskThread ID:14
  OldToYoungRootsTask time:384  GCTaskThread ID:14
  ScavengeRootsTask time:1248  GCTaskThread ID:14
  OldToYoungRootsTask time:3961  GCTaskThread ID:13
  ScavengeRootsTask time:1492  GCTaskThread ID:14
  ThreadRootsTask time:995  GCTaskThread ID:14
  ThreadRootsTask time:590  GCTaskThread ID:14
  ThreadRootsTask time:437  GCTaskThread ID:14
  ThreadRootsTask time:1341  GCTaskThread ID:2
  ThreadRootsTask time:377  GCTaskThread ID:14
  ThreadRootsTask time:310  GCTaskThread ID:2
  ThreadRootsTask time:431  GCTaskThread ID:14
  ThreadRootsTask time:376  GCTaskThread ID:2
  ThreadRootsTask time:277  GCTaskThread ID:2
  ThreadRootsTask time:400  GCTaskThread ID:2
  ThreadRootsTask time:3354  GCTaskThread ID:14
  ThreadRootsTask time:951  GCTaskThread ID:2
  ThreadRootsTask time:584  GCTaskThread ID:14
  ThreadRootsTask time:1592  GCTaskThread ID:8
  OldToYoungRootsTask time:50418  GCTaskThread ID:10
  ThreadRootsTask time:1255  GCTaskThread ID:8
  ScavengeRootsTask time:1836  GCTaskThread ID:8
  ScavengeRootsTask time:1735  GCTaskThread ID:8
  ScavengeRootsTask time:1128  GCTaskThread ID:8
  ThreadRootsTask time:14132  GCTaskThread ID:2
  ThreadRootsTask time:16539  GCTaskThread ID:14
  ScavengeRootsTask time:510  GCTaskThread ID:14
  ScavengeRootsTask time:6407  GCTaskThread ID:8
  StealTask time:3160638  GCTaskThread ID:1
  StealTask time:3929547  GCTaskThread ID:13
  StealTask time:3359946  GCTaskThread ID:4
  StealTask time:3469719  GCTaskThread ID:2
  StealTask time:3216089  GCTaskThread ID:11
  StealTask time:3863544  GCTaskThread ID:14
  StealTask time:3465927  GCTaskThread ID:9
  StealTask time:3507085  GCTaskThread ID:3
  StealTask time:3624188  GCTaskThread ID:12
  StealTask time:3704221  GCTaskThread ID:7
  StealTask time:3565777  GCTaskThread ID:6
  StealTask time:3204696  GCTaskThread ID:0
  StealTask time:4006233  GCTaskThread ID:8
  StealTask time:3997742  GCTaskThread ID:10
  StealTask time:3530225  GCTaskThread ID:5


Also, I found the StealTask dominates the total time, contributes the
most part of the young GC. However, not all the StealTask really needs
so much time. If one of the StealTask takes much time, many other
StealTask will be prolonged. This is another issue.

>> Did anyone know the reason?
>> Thanks.
>
> Thanks,
>   Thomas



--



More information about the hotspot-gc-dev mailing list