RFR: 8087324: Use semaphores when starting and stopping GC task threads

Stefan Karlsson stefan.karlsson at oracle.com
Mon Aug 17 07:12:25 UTC 2015


On 2015-08-14 14:38, Stefan Johansson wrote:
> Hi Stefan,
>
> Nice change, looks really good,

Thanks.

> but I have a few small comments.
>
> In workgroup.hpp:
>  177   GangTaskDispatcher* dispatcher()const {
> Add a space before const.

Fixed.

> ---
>
> In workgroup.cpp:
>  122       // Limit the semaphore value to the number of workers.
>  123       _start_semaphore(new Semaphore()),
>  124       _end_semaphore(new Semaphore())
> Seems like the limit has been lost. I would prefer to have it added, 
> but if that's somehow problematic just remove the comment.

Removed the comment.

>
>  140     // Wait for the last worker to signal the coordinator.
>  141     _end_semaphore->wait();
>  142
>  143     // No workers are allowed to read the state variables after 
> the coordinator has been signaled.
>  144     _task         = NULL;
>  145     _started      = 0;
>  146     _not_finished = 0;
> Do we need to set _not_finished to 0 or could we instead assert that 
> it already is 0 since that's when _end_semaphore should have been 
> signaled.

Changed into an assert.

> ---
>
> I don't need to see a new webrev for the above changes. Reviewed.

Thanks,
StefanK

>
> Thanks,
> Stefan
>
> On 2015-08-13 19:50, Stefan Karlsson wrote:
>> Hi all,
>>
>> Could I get a second review for the patch below (plus the suggested 
>> removal)?
>>
>> Thanks,
>> StefanK
>>
>> On 2015-07-02 19:58, Stefan Karlsson wrote:
>>> On 2015-07-02 19:43, Jon Masamitsu wrote:
>>>>
>>>>
>>>> On 6/29/2015 2:38 AM, Stefan Karlsson wrote:
>>>>> Hi all,
>>>>>
>>>>> "8087322: Implement a Semaphore utility class" has now been 
>>>>> pushed, so I've updated the patch to reflect the changes.
>>>>>
>>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01.delta
>>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01
>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01/src/share/vm/gc/shared/workgroup.hpp.frames.html
>>>>
>>>> Are these used?
>>>>
>>>>   194   void print_worker_started_task(AbstractGangTask* task, uint worker_id);
>>>>   195   void print_worker_finished_task(AbstractGangTask* task, uint worker_id);
>>>
>>> No. I'll remove them.
>>>
>>>>
>>>>
>>>> Rest looks good.  Just one question (just for my education).
>>>>   
>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01/src/share/vm/gc/shared/workgroup.cpp.frames.html
>>>>
>>>>   336 void GangWorker::loop() {
>>>>   337   while (true) {
>>>>   338     WorkData data = wait_for_task();
>>>>   339
>>>>   340     run_task(data);
>>>>   341
>>>>   342     signal_task_done();
>>>>   343   }
>>>>   344 }
>>>>
>>>> Does this allow the same thread to execute more than 1
>>>> task ("data" here)?  Meaning if 2 threads are  requested but
>>>> 1 thread is not scheduled to a cpu, will the other thread
>>>> do both chunks of work?
>>>
>>> Yes, that's correct.
>>>
>>> Thanks for reviewing!
>>> StefanK
>>>
>>>> Jon
>>>>
>>>>
>>>>>
>>>>> - The IMPLEMENTS_SEMAPHORE_CLASS define was removed, since all 
>>>>> platforms need to provide a Semaphore implementation.
>>>>>
>>>>> - Removed the need to pass down "max number of workers" to the 
>>>>> Semaphore constructor.
>>>>>
>>>>> - Updated semaphore.hpp include path
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>
>>>>>
>>>>> On 2015-06-12 16:52, Stefan Karlsson wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> The current implementation to distribute tasks to GC worker 
>>>>>> threads often cause long latencies (multiple milliseconds) when 
>>>>>> the threads are started and stopped.
>>>>>>
>>>>>> The main reason is that the worker threads have to fight over the 
>>>>>> Monitor lock when they are woken up from the call to 
>>>>>> Monitor::wait. Another reason is that all worker threads call 
>>>>>> notify_all when they finish a task and there wakes all all 
>>>>>> sleeping worker threads, which will yet again force the worker 
>>>>>> threads to fight over the lock.
>>>>>>
>>>>>> I propose that we use semaphores instead, so that the worker 
>>>>>> threads don't have to fight over a lock when they are woken up.
>>>>>>
>>>>>>
>>>>>> The patches build upon the following patch which introduces a 
>>>>>> Semaphore utility class. This patch will sent out for review on 
>>>>>> the hotspot-dev, since it affects non-GC parts of the code:
>>>>>> http://cr.openjdk.java.net/~stefank/8087322/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8087322
>>>>>>
>>>>>>
>>>>>> The first patch that I would like to get reviewed is:
>>>>>> http://cr.openjdk.java.net/~stefank/8087323/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8087323 - Unify and 
>>>>>> split the work gang classes
>>>>>>
>>>>>> It prepares for JDK-8087324, by separating the generic WorkGang 
>>>>>> implementation from the more elaborate YieldingFlexibleWorkGang 
>>>>>> (CMS) implementation. By having this part as a separate patch, I 
>>>>>> hope it will be easier to review JDK-8087324. The patch changes 
>>>>>> the work gang inheritance from:
>>>>>>
>>>>>> AbstractWorkGang
>>>>>>  WorkGang
>>>>>>   FlexibleWorkGang
>>>>>>    YieldingFlexibleWorkGang
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> AbstractWorkGang
>>>>>>  WorkGang
>>>>>>  YieldingFlexibleWorkGang
>>>>>>
>>>>>> Parts of the FlexibleWorkGang and WorkGang code that is going to 
>>>>>> be used by both concrete work gang classes, has been moved into 
>>>>>> AbstractWorkGang. I've duplicated some code in WorkGang and 
>>>>>> YieldingFlexibleWorkGang, but that code will be removed from 
>>>>>> WorkGang in the following patch.
>>>>>>
>>>>>>
>>>>>> The second patch I'd like to get reviewed is:
>>>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8087324 - Use semaphores 
>>>>>> when starting and stopping GC task threads
>>>>>>
>>>>>> It first simplifies the way we distribute the tasks to the GC 
>>>>>> worker threads. For example, the coordinator thread dispatches a 
>>>>>> task to a specific number of workers, and then waits for all work 
>>>>>> to be completed. There's no risk that multiple tasks will be 
>>>>>> scheduled simultaneously, so there's no need for the sequences 
>>>>>> number that is used in the current implementation.
>>>>>>
>>>>>> The patch contains two task dispatch / thread synchronization 
>>>>>> implementations:
>>>>>>
>>>>>> The first implementation uses Monitors, similar to what we did 
>>>>>> before the patch, but with a slightly lower overhead since the 
>>>>>> code calls notify_all less often. It still suffers from the 
>>>>>> "thundering heard" problem. When the coordinator thread signals 
>>>>>> that the worker threads should start, they all wake up from 
>>>>>> Monitor::wait and they all try to lock the Monitor.
>>>>>>
>>>>>> The second, and the more interesting, implementation uses 
>>>>>> semaphores. When the worker threads wake up from the semaphore 
>>>>>> wait, they don't have to serialize the execution by taking a 
>>>>>> lock. This greatly decreases the time it takes to start and stop 
>>>>>> the worker threads.
>>>>>>
>>>>>> The semaphore implementation is used on all platforms where the 
>>>>>> Semaphore class has been implemented in JDK-8087322. So, on some 
>>>>>> OS:es the code will revert to the Monitor-based solution until a 
>>>>>> Semaphore class has been implemented for that OS. So, porters 
>>>>>> might want to consider implementing the Sempahore class.
>>>>>>
>>>>>> There's also a diagnostic vm option 
>>>>>> (-XX:+/-UseSemaphoreGCThreadsSynchronization) to turn off the 
>>>>>> Semaphore-based implementation, which can be used to debug this 
>>>>>> new code. It's mainly targeted towards support and sustaining 
>>>>>> engineering.
>>>>>>
>>>>>>
>>>>>> The patches have been performance tested on Linux, Solaris, OSX, 
>>>>>> and Windows.
>>>>>>
>>>>>> The effects of the patch can be seen by running benchmarks with 
>>>>>> small young gen sizes, which triggers frequent and short GCs.
>>>>>>
>>>>>> For example, here are runs from the SPECjvm2008 xml.transform 
>>>>>> benchmark with:
>>>>>> -Xmx1g -Xms1g -Xmn64m -XX:+PrintGC -XX:+UseG1GC -jar 
>>>>>> SPECjvm2008.jar -ikv xml.transform -it 30 -wt 30
>>>>>>
>>>>>> I got the following GC times:
>>>>>>
>>>>>>             Average    Median    99.9 percentile Max
>>>>>> Baseline: 8.76ms    8.44 ms   25.9 ms 34.7 ms
>>>>>> Monitor:   6.17 ms 5.88 ms   26.0 ms 49.1 ms
>>>>>> Semaphore: 3.43 ms 3.26 ms   13.4 ms           33.4 ms
>>>>>>
>>>>>> If I run an empty GC task 10 times per GC, by running the 
>>>>>> following code:
>>>>>> http://cr.openjdk.java.net/~stefank/8087324/timedTask/
>>>>>>
>>>>>> I get the following numbers to complete the empty GC tasks:
>>>>>>
>>>>>>             Average    Median    99.9 percentile Max
>>>>>> Baseline: 1.43 ms    0.92 ms   3.43 ms           9.30ms
>>>>>> Monitor:    0.75ms 0.72 ms   1.74 ms           2.78ms
>>>>>> Semaphore: 0.07 ms 0.07 ms   0.17 ms 0.26 ms
>>>>>>
>>>>>>
>>>>>>
>>>>>> The code has been tested with JPRT and our nightly testing suites.
>>>>>>
>>>>>> I've created a unit test to run a small test with both the 
>>>>>> semaphore implementation and the monitor implementation:
>>>>>> http://cr.openjdk.java.net/~stefank/8087324/workgangTest/
>>>>>>
>>>>>> But since we currently don't have code to shutdown worker threads 
>>>>>> after they have been started, I don't want to push this test (or 
>>>>>> clean it up) until we have that in place. I created this bug for 
>>>>>> that:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8087340
>>>>>>
>>>>>> Thanks,
>>>>>> StefanK
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20150817/35dc02ca/attachment.htm>


More information about the hotspot-gc-dev mailing list