RFR: 8087324: Use semaphores when starting and stopping GC task threads

Stefan Karlsson stefan.karlsson at oracle.com
Thu Jul 2 11:48:48 UTC 2015



On 2015-07-01 18:31, Jon Masamitsu wrote:
>
>
> On 6/12/2015 7:52 AM, Stefan Karlsson wrote:
>> Hi all,
>>
>> The current implementation to distribute tasks to GC worker threads 
>> often cause long latencies (multiple milliseconds) when the threads 
>> are started and stopped.
>>
>> The main reason is that the worker threads have to fight over the 
>> Monitor lock when they are woken up from the call to Monitor::wait. 
>> Another reason is that all worker threads call notify_all when they 
>> finish a task and there wakes all all sleeping worker threads, which 
>> will yet again force the worker threads to fight over the lock.
>>
>> I propose that we use semaphores instead, so that the worker threads 
>> don't have to fight over a lock when they are woken up.
>>
>>
>> The patches build upon the following patch which introduces a 
>> Semaphore utility class. This patch will sent out for review on the 
>> hotspot-dev, since it affects non-GC parts of the code:
>> http://cr.openjdk.java.net/~stefank/8087322/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8087322
>>
>>
>> The first patch that I would like to get reviewed is:
>> http://cr.openjdk.java.net/~stefank/8087323/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8087323 - Unify and split 
>> the work gang classes
>>
>> It prepares for JDK-8087324, by separating the generic WorkGang 
>> implementation from the more elaborate YieldingFlexibleWorkGang (CMS) 
>> implementation. By having this part as a separate patch, I hope it 
>> will be easier to review JDK-8087324. The patch changes the work gang 
>> inheritance from:
>>
>> AbstractWorkGang
>>  WorkGang
>>   FlexibleWorkGang
>>    YieldingFlexibleWorkGang
>>
>> to:
>>
>> AbstractWorkGang
>>  WorkGang
>>  YieldingFlexibleWorkGang
>>
>> Parts of the FlexibleWorkGang and WorkGang code that is going to be 
>> used by both concrete work gang classes, has been moved into 
>> AbstractWorkGang. I've duplicated some code in WorkGang and 
>> YieldingFlexibleWorkGang, but that code will be removed from WorkGang 
>> in the following patch.
>
> http://cr.openjdk.java.net/~stefank/8087323/webrev.00/src/share/vm/gc/cms/yieldingWorkgroup.hpp.frames.html
>
> There seems to be only one definition of 
> is_YieldingFlexibleGang_task() now.  Is that right?  Is that useful?
>
>   131   NOT_PRODUCT(virtual bool is_YieldingFlexibleGang_task() const {
>   132     return true;
>   133   })

I agree. I don't think we need it anymore.


>
> Not a change  in your patch but
>
>    86   AbstractWorkGang(const char* name, uint workers, bool are_GC_task_threads, bool are_ConcurrentGC_threads) :
>    87       _name(name),
>    88       _total_workers(workers),
>    89       _active_workers(UseDynamicNumberOfGCThreads ? 1U : workers),
>    90       _are_GC_task_threads(are_GC_task_threads),
>    91       _are_Concurren
>
> _active_workers is always calculated as >= 2 unless _total_workers is 
> only 1.
> So line 89 should be
>
> _active_workers(UseDynamicNumberOfGCThreads ? MIN2(2, workers) : workers)
>
> Should I file a CR for that?  Or do you want to include it.

I'm not sure that what is proposed above is correct. I see that 
AdaptiveSizePolicy::calc_active_workers returns 2 as a minimum, but both 
ConcurrentMark::calc_parallel_marking_threads and 
AdaptiveSizePolicy::calc_active_conc_workers can return 1.

I also don't think it should be AbstractWorkGang's responsibility to 
have the knowledge about the minimum number of worker threads that are 
used when UseDynamicNumberOfGCThreads are turned on. Maybe we should set 
it to 0, and let the calc_*_active_workers setup the default value.

I would prefer to handle any changes, to this part of the code, as 
separate RFEs.

> Have you considered (maybe for a later patch) changing 
> YieldingFlexibleWorkGang to
> simply YieldingWorkGang?  The "Flexible" attribute of 
> YieldingFlexibleWorkGang having
> been moved into AbstractWorkGang.

I thought about it, but didn't think it was important enough to warrant 
that change in this patch. I wouldn't mind if a RFE was created to 
change the name.

>
> http://cr.openjdk.java.net/~stefank/8087323/webrev.00/src/share/vm/gc/cms/yieldingWorkgroup.cpp.frames.html
>
> Is the cast at 53 necessary? I see it in the original code too.
>
>    50 AbstractGangWorker* YieldingFlexibleWorkGang::allocate_worker(uint which) {
>    51   YieldingFlexibleGangWorker* new_member =
>    52       new YieldingFlexibleGangWorker(this, which);
>    53   return (YieldingFlexibleGangWorker*) new_member;
>    54 }

Yes, this is unnecessary.

>
> The rest looks good.

Thanks.

>
> I'll do the second patch next.

Great.

StefanK

>
> Jon
>
>>
>>
>> The second patch I'd like to get reviewed is:
>> http://cr.openjdk.java.net/~stefank/8087324/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8087324 - Use semaphores 
>> when starting and stopping GC task threads
>>
>> It first simplifies the way we distribute the tasks to the GC worker 
>> threads. For example, the coordinator thread dispatches a task to a 
>> specific number of workers, and then waits for all work to be 
>> completed. There's no risk that multiple tasks will be scheduled 
>> simultaneously, so there's no need for the sequences number that is 
>> used in the current implementation.
>>
>> The patch contains two task dispatch / thread synchronization 
>> implementations:
>>
>> The first implementation uses Monitors, similar to what we did before 
>> the patch, but with a slightly lower overhead since the code calls 
>> notify_all less often. It still suffers from the "thundering heard" 
>> problem. When the coordinator thread signals that the worker threads 
>> should start, they all wake up from Monitor::wait and they all try to 
>> lock the Monitor.
>>
>> The second, and the more interesting, implementation uses semaphores. 
>> When the worker threads wake up from the semaphore wait, they don't 
>> have to serialize the execution by taking a lock. This greatly 
>> decreases the time it takes to start and stop the worker threads.
>>
>> The semaphore implementation is used on all platforms where the 
>> Semaphore class has been implemented in JDK-8087322. So, on some 
>> OS:es the code will revert to the Monitor-based solution until a 
>> Semaphore class has been implemented for that OS. So, porters might 
>> want to consider implementing the Sempahore class.
>>
>> There's also a diagnostic vm option 
>> (-XX:+/-UseSemaphoreGCThreadsSynchronization) to turn off the 
>> Semaphore-based implementation, which can be used to debug this new 
>> code. It's mainly targeted towards support and sustaining engineering.
>>
>>
>> The patches have been performance tested on Linux, Solaris, OSX, and 
>> Windows.
>>
>> The effects of the patch can be seen by running benchmarks with small 
>> young gen sizes, which triggers frequent and short GCs.
>>
>> For example, here are runs from the SPECjvm2008 xml.transform 
>> benchmark with:
>> -Xmx1g -Xms1g -Xmn64m -XX:+PrintGC -XX:+UseG1GC -jar SPECjvm2008.jar 
>> -ikv xml.transform -it 30 -wt 30
>>
>> I got the following GC times:
>>
>>             Average    Median    99.9 percentile   Max
>> Baseline: 8.76ms    8.44 ms   25.9 ms 34.7 ms
>> Monitor:   6.17 ms 5.88 ms   26.0 ms 49.1 ms
>> Semaphore: 3.43 ms 3.26 ms   13.4 ms           33.4 ms
>>
>> If I run an empty GC task 10 times per GC, by running the following code:
>> http://cr.openjdk.java.net/~stefank/8087324/timedTask/
>>
>> I get the following numbers to complete the empty GC tasks:
>>
>>             Average    Median    99.9 percentile   Max
>> Baseline: 1.43 ms    0.92 ms   3.43 ms           9.30ms
>> Monitor:    0.75ms 0.72 ms   1.74 ms           2.78ms
>> Semaphore: 0.07 ms 0.07 ms   0.17 ms           0.26 ms
>>
>>
>>
>> The code has been tested with JPRT and our nightly testing suites.
>>
>> I've created a unit test to run a small test with both the semaphore 
>> implementation and the monitor implementation:
>> http://cr.openjdk.java.net/~stefank/8087324/workgangTest/
>>
>> But since we currently don't have code to shutdown worker threads 
>> after they have been started, I don't want to push this test (or 
>> clean it up) until we have that in place. I created this bug for that:
>> https://bugs.openjdk.java.net/browse/JDK-8087340
>>
>> Thanks,
>> StefanK
>>
>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20150702/7d186501/attachment.htm>


More information about the hotspot-gc-dev mailing list