RFR: 8087324: Use semaphores when starting and stopping GC task threads
Stefan Johansson
stefan.johansson at oracle.com
Fri Aug 14 12:38:27 UTC 2015
Hi Stefan,
Nice change, looks really good, but I have a few small comments.
In workgroup.hpp:
177 GangTaskDispatcher* dispatcher()const {
Add a space before const.
---
In workgroup.cpp:
122 // Limit the semaphore value to the number of workers.
123 _start_semaphore(new Semaphore()),
124 _end_semaphore(new Semaphore())
Seems like the limit has been lost. I would prefer to have it added, but
if that's somehow problematic just remove the comment.
140 // Wait for the last worker to signal the coordinator.
141 _end_semaphore->wait();
142
143 // No workers are allowed to read the state variables after
the coordinator has been signaled.
144 _task = NULL;
145 _started = 0;
146 _not_finished = 0;
Do we need to set _not_finished to 0 or could we instead assert that it
already is 0 since that's when _end_semaphore should have been signaled.
---
I don't need to see a new webrev for the above changes. Reviewed.
Thanks,
Stefan
On 2015-08-13 19:50, Stefan Karlsson wrote:
> Hi all,
>
> Could I get a second review for the patch below (plus the suggested
> removal)?
>
> Thanks,
> StefanK
>
> On 2015-07-02 19:58, Stefan Karlsson wrote:
>> On 2015-07-02 19:43, Jon Masamitsu wrote:
>>>
>>>
>>> On 6/29/2015 2:38 AM, Stefan Karlsson wrote:
>>>> Hi all,
>>>>
>>>> "8087322: Implement a Semaphore utility class" has now been pushed,
>>>> so I've updated the patch to reflect the changes.
>>>>
>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01.delta
>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01
>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01/src/share/vm/gc/shared/workgroup.hpp.frames.html
>>>
>>> Are these used?
>>>
>>> 194 void print_worker_started_task(AbstractGangTask* task, uint worker_id);
>>> 195 void print_worker_finished_task(AbstractGangTask* task, uint worker_id);
>>
>> No. I'll remove them.
>>
>>>
>>>
>>> Rest looks good. Just one question (just for my education).
>>>
>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.01/src/share/vm/gc/shared/workgroup.cpp.frames.html
>>>
>>> 336 void GangWorker::loop() {
>>> 337 while (true) {
>>> 338 WorkData data = wait_for_task();
>>> 339
>>> 340 run_task(data);
>>> 341
>>> 342 signal_task_done();
>>> 343 }
>>> 344 }
>>>
>>> Does this allow the same thread to execute more than 1
>>> task ("data" here)? Meaning if 2 threads are requested but
>>> 1 thread is not scheduled to a cpu, will the other thread
>>> do both chunks of work?
>>
>> Yes, that's correct.
>>
>> Thanks for reviewing!
>> StefanK
>>
>>> Jon
>>>
>>>
>>>>
>>>> - The IMPLEMENTS_SEMAPHORE_CLASS define was removed, since all
>>>> platforms need to provide a Semaphore implementation.
>>>>
>>>> - Removed the need to pass down "max number of workers" to the
>>>> Semaphore constructor.
>>>>
>>>> - Updated semaphore.hpp include path
>>>>
>>>> Thanks,
>>>> StefanK
>>>>
>>>>
>>>> On 2015-06-12 16:52, Stefan Karlsson wrote:
>>>>> Hi all,
>>>>>
>>>>> The current implementation to distribute tasks to GC worker
>>>>> threads often cause long latencies (multiple milliseconds) when
>>>>> the threads are started and stopped.
>>>>>
>>>>> The main reason is that the worker threads have to fight over the
>>>>> Monitor lock when they are woken up from the call to
>>>>> Monitor::wait. Another reason is that all worker threads call
>>>>> notify_all when they finish a task and there wakes all all
>>>>> sleeping worker threads, which will yet again force the worker
>>>>> threads to fight over the lock.
>>>>>
>>>>> I propose that we use semaphores instead, so that the worker
>>>>> threads don't have to fight over a lock when they are woken up.
>>>>>
>>>>>
>>>>> The patches build upon the following patch which introduces a
>>>>> Semaphore utility class. This patch will sent out for review on
>>>>> the hotspot-dev, since it affects non-GC parts of the code:
>>>>> http://cr.openjdk.java.net/~stefank/8087322/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8087322
>>>>>
>>>>>
>>>>> The first patch that I would like to get reviewed is:
>>>>> http://cr.openjdk.java.net/~stefank/8087323/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8087323 - Unify and split
>>>>> the work gang classes
>>>>>
>>>>> It prepares for JDK-8087324, by separating the generic WorkGang
>>>>> implementation from the more elaborate YieldingFlexibleWorkGang
>>>>> (CMS) implementation. By having this part as a separate patch, I
>>>>> hope it will be easier to review JDK-8087324. The patch changes
>>>>> the work gang inheritance from:
>>>>>
>>>>> AbstractWorkGang
>>>>> WorkGang
>>>>> FlexibleWorkGang
>>>>> YieldingFlexibleWorkGang
>>>>>
>>>>> to:
>>>>>
>>>>> AbstractWorkGang
>>>>> WorkGang
>>>>> YieldingFlexibleWorkGang
>>>>>
>>>>> Parts of the FlexibleWorkGang and WorkGang code that is going to
>>>>> be used by both concrete work gang classes, has been moved into
>>>>> AbstractWorkGang. I've duplicated some code in WorkGang and
>>>>> YieldingFlexibleWorkGang, but that code will be removed from
>>>>> WorkGang in the following patch.
>>>>>
>>>>>
>>>>> The second patch I'd like to get reviewed is:
>>>>> http://cr.openjdk.java.net/~stefank/8087324/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8087324 - Use semaphores
>>>>> when starting and stopping GC task threads
>>>>>
>>>>> It first simplifies the way we distribute the tasks to the GC
>>>>> worker threads. For example, the coordinator thread dispatches a
>>>>> task to a specific number of workers, and then waits for all work
>>>>> to be completed. There's no risk that multiple tasks will be
>>>>> scheduled simultaneously, so there's no need for the sequences
>>>>> number that is used in the current implementation.
>>>>>
>>>>> The patch contains two task dispatch / thread synchronization
>>>>> implementations:
>>>>>
>>>>> The first implementation uses Monitors, similar to what we did
>>>>> before the patch, but with a slightly lower overhead since the
>>>>> code calls notify_all less often. It still suffers from the
>>>>> "thundering heard" problem. When the coordinator thread signals
>>>>> that the worker threads should start, they all wake up from
>>>>> Monitor::wait and they all try to lock the Monitor.
>>>>>
>>>>> The second, and the more interesting, implementation uses
>>>>> semaphores. When the worker threads wake up from the semaphore
>>>>> wait, they don't have to serialize the execution by taking a lock.
>>>>> This greatly decreases the time it takes to start and stop the
>>>>> worker threads.
>>>>>
>>>>> The semaphore implementation is used on all platforms where the
>>>>> Semaphore class has been implemented in JDK-8087322. So, on some
>>>>> OS:es the code will revert to the Monitor-based solution until a
>>>>> Semaphore class has been implemented for that OS. So, porters
>>>>> might want to consider implementing the Sempahore class.
>>>>>
>>>>> There's also a diagnostic vm option
>>>>> (-XX:+/-UseSemaphoreGCThreadsSynchronization) to turn off the
>>>>> Semaphore-based implementation, which can be used to debug this
>>>>> new code. It's mainly targeted towards support and sustaining
>>>>> engineering.
>>>>>
>>>>>
>>>>> The patches have been performance tested on Linux, Solaris, OSX,
>>>>> and Windows.
>>>>>
>>>>> The effects of the patch can be seen by running benchmarks with
>>>>> small young gen sizes, which triggers frequent and short GCs.
>>>>>
>>>>> For example, here are runs from the SPECjvm2008 xml.transform
>>>>> benchmark with:
>>>>> -Xmx1g -Xms1g -Xmn64m -XX:+PrintGC -XX:+UseG1GC -jar
>>>>> SPECjvm2008.jar -ikv xml.transform -it 30 -wt 30
>>>>>
>>>>> I got the following GC times:
>>>>>
>>>>> Average Median 99.9 percentile Max
>>>>> Baseline: 8.76ms 8.44 ms 25.9 ms 34.7 ms
>>>>> Monitor: 6.17 ms 5.88 ms 26.0 ms 49.1 ms
>>>>> Semaphore: 3.43 ms 3.26 ms 13.4 ms 33.4 ms
>>>>>
>>>>> If I run an empty GC task 10 times per GC, by running the
>>>>> following code:
>>>>> http://cr.openjdk.java.net/~stefank/8087324/timedTask/
>>>>>
>>>>> I get the following numbers to complete the empty GC tasks:
>>>>>
>>>>> Average Median 99.9 percentile Max
>>>>> Baseline: 1.43 ms 0.92 ms 3.43 ms 9.30ms
>>>>> Monitor: 0.75ms 0.72 ms 1.74 ms 2.78ms
>>>>> Semaphore: 0.07 ms 0.07 ms 0.17 ms 0.26 ms
>>>>>
>>>>>
>>>>>
>>>>> The code has been tested with JPRT and our nightly testing suites.
>>>>>
>>>>> I've created a unit test to run a small test with both the
>>>>> semaphore implementation and the monitor implementation:
>>>>> http://cr.openjdk.java.net/~stefank/8087324/workgangTest/
>>>>>
>>>>> But since we currently don't have code to shutdown worker threads
>>>>> after they have been started, I don't want to push this test (or
>>>>> clean it up) until we have that in place. I created this bug for that:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8087340
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20150814/8cdb4a9f/attachment.htm>
More information about the hotspot-gc-dev
mailing list