RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

Yumin Qi yumin.qi at oracle.com
Wed Mar 20 21:37:05 PDT 2013


Hi, David

On 3/20/2013 6:02 PM, David Holmes wrote:
> On 21/03/2013 7:27 AM, Yumin Qi wrote:
>> Hi, can I have your code review of a small change?
>
> Not really small conceptually. :)
>
> I don't think this form of the fix addresses the underlying issue as 
> discussed in the bug report. If the variable was renamed 
> MinimumNumberOfProcessors, or MinimumAssumedProcessors, then simply 
> using it to turn on is_MP would be okay. Such a flag would suit the 
> initial problem perfectly. Of course any value >2 would be 
> semantically indistinct, so this really acts as a boolean flag - 
> AssumeMP.
>
> The more general NumberOfProcessors approach, which I'm still unsure 
> of, should to me control what is reported for available-processors. 
> That way it would affect everything in the VM, libraries and 
> application code that configures itself based on the number of 
> available processors. The main usecase for that, in my opinion, would 
> be for apps running on large systems but you want to constrain it to 
> using a subset of the physical CPUs (without having to configure 
> processor sets). That is a different kind of problem and a different 
> kind of flag. Using NumberOfProcessors but not having it control 
> anything except is_MP just seems wrong - and using it to replace 
> available_processors is not a complex change.
>
> The VM is not designed for dynamic adaptation of threads/pools so if 
> the number of processors does change dynamically neither of the above 
> options are going to provide solutions to the potential performance 
> problems that will be encountered (too many or too few threads). Any 
> apps that starts on single core (as reported by the OS) is going to be 
> under-provisioned.
>
> David
> -----
>
I think this is only a workaround and not a solution for the specific 
use case, or  there is no perfect solution for it. If customers decided 
to use this flag,  they should be aware that they are ready not to 
consider performance at first.  For using number of available 
processors, we need to add code to get that number, not the one in 
hotspot os::active_processor_count() which will return the number of 
live processors.  So do you think I could just use a flag -XX:+AssumeMP 
work around  this problem? Since GC threads will not be based on this 
assumption, I agree (Harold pointed this out either) that one flag is 
simpler.  In fact, the code for is_MP() is obsolete for today's 
computers, all with multi-cores (even for cell chips) so I think better 
solution is remove call to is_MP() in all places in hotspot. I will 
prepare another webrev with -XX:+AssumeMP for next codereview.

For number of GC Threads:
unsigned int Abstract_VM_Version::nof_parallel_worker_threads(
                                                       unsigned int num,
                                                       unsigned int den,
                                                       unsigned int 
switch_pt) {
   if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
     assert(ParallelGCThreads == 0, "Default ParallelGCThreads is not 0");
     // For very large machines, there are diminishing returns
     // for large numbers of worker threads.  Instead of
     // hogging the whole system, use a fraction of the workers for every
     // processor after the first 8.  For example, on a 72 cpu machine
     // and a chosen fraction of 5/8
     // use 8 + (72 - 8) * (5/8) == 48 worker threads.
     unsigned int ncpus = (unsigned int) os::active_processor_count();
     return (ncpus <= switch_pt) ?
            ncpus :
           (switch_pt + ((ncpus - switch_pt) * num) / den);
   } else {
     return ParallelGCThreads;
   }
}

the call to this function is
unsigned int Abstract_VM_Version::calc_parallel_worker_threads() {
   return nof_parallel_worker_threads(5, 8, 8);
}

We can see that, if active_processor_count is 1, it will return 1, and 
the VM will run with single GC thread. So the better choices maybe:

1) get processor count not active processor count for ParallelGCThreads, 
that is up to decision from GC team.
2) Recommend usage is

   -XX:+AssumeMP -XX:ParallelGCThreads=<number>

Thanks
Yumin

>> 2178143:  VM crashes if the number of bound CPUs changed during runtime.
>>
>> Situation: Customer first configure only one CPU online and turn others
>> offline to run java application, after java program started, bring more
>> CPUs back online. Since VM started on a single CPU, os::is_MP() will
>> return false, but after more CPUs available, OS will schedule the app
>> run on multiple CPUs, this caused SEGV in various places where data
>> consistency was broken. The solution is supply a flag to assume it is
>> running on MP, so lock is forced to be called.
>>
>> http://cr.openjdk.java.net/~minqi/2178143/
>>
>> Thanks
>> Yumin



More information about the hotspot-runtime-dev mailing list