Better default for ParallelGCThreads and ConcGCThreads by using number of physical cores and CPU mask.
Bengt Rutisson
bengt.rutisson at oracle.com
Thu Jan 16 13:35:52 UTC 2014
On 2014-01-15 22:29, Jungwoo Ha wrote:
> Sounds good to me.
Yes, I like ScaleGCThreadsByCores better too.
Bengt
>
>
> On Wed, Jan 15, 2014 at 10:52 AM, Jon Masamitsu
> <jon.masamitsu at oracle.com <mailto:jon.masamitsu at oracle.com>> wrote:
>
>
> On 1/15/2014 4:51 AM, Bengt Rutisson wrote:
>>
>> On 2014-01-13 22:39, Jungwoo Ha wrote:
>>>
>>>
>>> In CMSCollector there is still this code to change the value
>>> for ConcGCThreads based on AdjustGCThreadsToCores.
>>>
>>>
>>> if (AdjustGCThreadsToCores) {
>>>   FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads / 2);
>>> } else {
>>>   FLAG_SET_DEFAULT(ConcGCThreads, (3 + ParallelGCThreads) / 4);
>>> }
>>>
>>> Do you think that is needed, or can we use the same logic in
>>> both cases given that ParallelGCThreads already has a different
>>> value when AdjustGCThreadsToCores is enabled?
>>>
>>>
>>> I am happy to just use FLAG_SET_DEFAULT(ConcGCThreads,
>>> ParallelGCThreads / 2);
>>> The original hotspot code used FLAG_SET_DEFAULT(ConcGCThreads,
>>> (3 + ParallelGCThreads) / 4), which I think is somewhat arbitrary.
>>> Now that ParallelGCThreads will be reduced on some configurations,
>>> dividing it by 4 seems to make ConcGCThreads too small.
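>>>
>>> As a rough illustration (a hypothetical box with 8 physical cores and
>>> 16 hardware threads, and assuming the current ~5/8-above-8-CPUs
>>> heuristic): today ParallelGCThreads = 8 + (16 - 8) * 5 / 8 = 13, giving
>>> ConcGCThreads = (3 + 13) / 4 = 4; with the core-based default,
>>> ParallelGCThreads = 8 / 2 = 4, so (3 + 4) / 4 = 1, while 4 / 2 = 2.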
>>
>> Hm. Changing to FLAG_SET_DEFAULT(ConcGCThreads, ParallelGCThreads
>> / 2) might be the way to go, but I think that should probably be
>> done as a separate change. That way we can performance test it
>> more thoroughly.
>>
>>>
>>>
>>> Also, I don't fully understand the name
>>> AdjustGCThreadsToCores. In
>>> VM_Version::calc_parallel_worker_threads() for x86 we simply
>>> divide active_core_count by 2 if this flag is enabled. So, the
>>> flag does not really adjust to the cores. It seems like it
>>> reduces the number of GC threads. How about calling the
>>> flag ReduceGCThreads or something like that?
>>>
>>>
>>> The flag can be named better. However, ReduceGCThreads doesn't
>>> seem to reflect what this flag does.
>>> I am pretty bad at naming, so let me summarize what this flag is
>>> actually doing.
>>>
>>> The flag adjusts the number of GC threads to the number of "available"
>>> physical cores, as reported by the /proc filesystem and the CPU mask
>>> set by sched_setaffinity.
>>> For example, ParallelGCThreads will remain the same regardless
>>> of whether hyperthreading is turned on or off.
>>> The current hotspot code will have twice as many GC threads if
>>> hyperthreading is on.
>>> GC usually causes a huge number of cache misses, so having two
>>> GC threads competing for the same physical core hurts GC
>>> throughput.
>>> The current hotspot code doesn't consider the CPU mask at all.
>>> For example, even if the machine has 64 cores but the CPU mask
>>> is set to 2 cores, current hotspot calculates the number of GC
>>> threads based on 64.
>>> So this flag really sets the number of GC threads based on the
>>> number of physical cores available to the JVM process.
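>>>
>>> A minimal sketch of the idea (a hypothetical helper, not the actual
>>> webrev; the patch parses /proc, while this sketch reads the equivalent
>>> sysfs topology files to keep it short):
>>>
>>> #include <sched.h>   // sched_getaffinity; g++ defines _GNU_SOURCE
>>> #include <stdio.h>
>>> #include <set>
>>> #include <utility>
>>>
>>> static int active_core_count_sketch() {
>>>   cpu_set_t mask;
>>>   if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
>>>     return -1;  // real code would fall back to active_processor_count()
>>>   }
>>>   // Hyperthread siblings share a (package id, core id) pair, so they
>>>   // collapse into a single entry in this set.
>>>   std::set<std::pair<int, int> > cores;
>>>   for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
>>>     if (!CPU_ISSET(cpu, &mask)) continue;
>>>     char path[128];
>>>     int pkg = -1, core = -1;
>>>     snprintf(path, sizeof(path),
>>>              "/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu);
>>>     FILE* f = fopen(path, "r");
>>>     if (f != NULL) { fscanf(f, "%d", &pkg); fclose(f); }
>>>     snprintf(path, sizeof(path),
>>>              "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
>>>     f = fopen(path, "r");
>>>     if (f != NULL) { fscanf(f, "%d", &core); fclose(f); }
>>>     cores.insert(std::make_pair(pkg, core));
>>>   }
>>>   return (int) cores.size();
>>> }
>>>
>>> With a 64-core machine and a 2-core affinity mask this counts 2, and it
>>> counts the same number of physical cores with hyperthreading on or off.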
>>
>> Right. In VM_Version::calc_parallel_worker_threads() we take the
>> value of os::active_core_count() and divide it by 2. I guess this
>> is to reduce the cache issues. But if the flag is called
>> AdjustGCThreadsToCores I would have expected that we set the
>> number of GC threads to be equal to the core count. That's why I
>> suggested "Reduce" in the name.
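>>
>> For reference, this is roughly how I read that logic (the flag check,
>> the MAX2 clamp and the fallback line are my reading of the webrev, not
>> necessarily its exact code):
>>
>> unsigned int VM_Version::calc_parallel_worker_threads() {
>>   if (AdjustGCThreadsToCores) {
>>     // One GC thread per two available physical cores, so that two
>>     // cache-hungry GC threads never share one core.
>>     return MAX2(os::active_core_count() / 2, 1);
>>   }
>>   // Otherwise keep the existing heuristic based on hardware threads.
>>   return nof_parallel_worker_threads(5, 8, 8);
>> }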
>>
>> Naming is hard and I am not particularly fond of the name
>> ReduceGCThreads either. But maybe we can try to come up with
>> something else?
>
> How about ScaleGCThreadsByCores?
>
> Jon
>
>
>>
>>>
>>> I think I pointed this out earlier, but I don't feel
>>> comfortable reviewing the changes in os_linux_x86.cpp. I
>>> hope someone from the Runtime team can review that.
>>>
>>>
>>> Can you clarify what you meant? The /proc parsing and the CPU mask
>>> handling are specific to Linux & x86, and I only tested on that platform.
>>> The assumptions I used here are based on the x86 cache architecture.
>>
>> What I was trying to say was that I don't know enough about Linux
>> to be confident that your implementation of
>> os::active_core_count() is the simplest and most stable way to
>> retrieve that information. I'm sure it is good, I am just not the
>> right person to review this piece of the code. That's why I think
>> it would be good if someone from the Runtime team looked at this.
>>
>> Thanks,
>> Bengt
>>
>>
>>>
>>> Jungwoo
>>>
>>
>
>