Better default for ParallelGCThreads and ConcGCThreads by using number of physical cores and CPU mask.
Jungwoo Ha
jwha at google.com
Wed Jan 29 23:39:50 UTC 2014
I renamed the flag and the updated webrev is posted.
http://cr.openjdk.java.net/~jmasa/8028554/webrev.02/
>
> In CMSCollector there is still this code to change the value for
> ConcGCThreads based on AdjustGCThreadsToCores.
>
>
> 639 if (AdjustGCThreadsToCores) {
> 640 FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads / 2);
> 641 } else {
> 642 FLAG_SET_DEFAULT(ConcGCThreads, (3 + ParallelGCThreads) / 4);
> 643 }
>
> Do you think that is needed or can we use the same logic in both cases
> given that ParallelGCThreads has a different value if
> AdjustGCThreadsToCores is enabled.
>
For this part, I still kept the flag. Otherwise, ConcGCThreads number will
be too big when the flag is off & hyperthreading is on.
Jungwoo
On Thu, Jan 16, 2014 at 5:35 AM, Bengt Rutisson
<bengt.rutisson at oracle.com>wrote:
>
> On 2014-01-15 22:29, Jungwoo Ha wrote:
>
> Sounds good to me.
>
>
> Yes, I like ScaleGCThreadsByCores better too.
>
> Bengt
>
>
>
>
> On Wed, Jan 15, 2014 at 10:52 AM, Jon Masamitsu <jon.masamitsu at oracle.com>wrote:
>
>>
>> On 1/15/2014 4:51 AM, Bengt Rutisson wrote:
>>
>>
>> On 2014-01-13 22:39, Jungwoo Ha wrote:
>>
>>
>>
>>>
>>> In CMSCollector there is still this code to change the value for
>>> ConcGCThreads based on AdjustGCThreadsToCores.
>>>
>>>
>>> 639 if (AdjustGCThreadsToCores) {
>>> 640 FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads / 2);
>>> 641 } else {
>>> 642 FLAG_SET_DEFAULT(ConcGCThreads, (3 + ParallelGCThreads) /
>>> 4);
>>> 643 }
>>>
>>> Do you think that is needed or can we use the same logic in both cases
>>> given that ParallelGCThreads has a different value if
>>> AdjustGCThreadsToCores is enabled.
>>>
>>
>> I am happy to just use FLAG_SET_DEFAULT(ConcGCThreads,
>> ParallelGCThreads / 2);
>> The original hotspot code used FLAG_SET_DEFAULT(ConcGCThreads, (3 +
>> ParallelGCThreads) / 4); which I think is somewhat arbitrary.
>> Now that ParallelGCThreads will reduce on some configuration, dividing it
>> into 4 seems to make the ConcGCThreads too small.
>>
>>
>> Hm. Changing to FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads / 2)
>> might be the way to go, but I think that should probably done as a separate
>> change. That way we can performance test it more thoroughly.
>>
>>
>>
>>>
>>> Also, I don't fully understand the name AdjustGCThreadsToCores. In
>>> VM_Version::calc_parallel_worker_threads() for x86 we simply
>>> active_core_count with 2 if this flag is enabled. So, the flag does not
>>> really adjust to the cores. It seems like it is reduces the number of GC
>>> threads. How about calling the flag ReduceGCThreads or something like that?
>>>
>>
>> The flag can be named better. However, ReduceGCThreads doesn't seem to
>> reflect what this flag does.
>> I am pretty bad at naming, so let me summarize what this flag is actually
>> doing.
>>
>> The flag adjusts the GC threads to the number of "available" physical
>> cores reported by /proc filesystem and the CPU mask set by
>> sched_setaffinity.
>> For example, ParallelGCThreads will remain the same regardless of whether
>> hyperthreading is turned on/off.
>> Current hotspot code will have twice more GC threads if hyperthreading is
>> on.
>> Usually, GC causes huge number of cache misses, thus having two GC
>> threads competing for the same physical core hurts the GC throughput.
>> Current hotspot code doesn't consider CPU mask at all.
>> For example, even though the machine has 64 cores, if CPU mask is set for
>> 2 cores, current hotspot calculates the number of GC threads based on 64.
>> Thus, this flag is actually evaluating the number of GC threads to the
>> number of physical cores available for the JVM process.
>>
>>
>> Right. In VM_Version::calc_parallel_worker_threads() we take the value of
>> os::active_core_count() and divide it by 2. I guess this is to reduce the
>> cache issues. But if the flag is called AdjustGCThreadsToCores I would have
>> expected that we set the number of GC threads to be equal to the core
>> count. That's why I suggested "Reduce" in the name.
>>
>>
>> Naming is hard and I am not particularly fond of the name ReduceGCThreads
>> either. But maybe we can try to come up with something else?
>>
>>
>> How about ScaleGCThreadsByCores?
>>
>> Jon
>>
>>
>>
>>
>> I think I pointed this out earlier, but I don't feel comfortable
>>> reviewing the changes in os_linux_x86.cpp. I hope someone from the Runtime
>>> team can review that.
>>>
>>
>> Can you clarify what you meant? /proc & cpu mask is dependent on Linux
>> & x86, and I only tested on that platform.
>> The assumptions I used here is based on the x86 cache architecture.
>>
>>
>> What I was trying to say was that I don't know enough about Linux to be
>> confident that your implementation of os::active_core_count() is the
>> simplest and most stable way to retrieve that information. I'm sure it is
>> good, I am just not the right person to review this piece of the code.
>> That's why I think it would be good if someone from the Runtime team looked
>> at this.
>>
>> Thanks,
>> Bengt
>>
>>
>>
>> Jungwoo
>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20140129/86d2ee63/attachment.htm>
More information about the hotspot-gc-dev
mailing list