RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime

Yumin Qi yumin.qi at oracle.com
Wed Mar 27 16:41:11 UTC 2013


so the warning should be

   if (AssumeMP && !UseSerialGC) {
     if (FLAG_IS_DEFAULT(ParallelGCThreads) && ParallelGCThreads < 2) {
       warning("If the number of processors is expected to increase from 
one, then"
               " you should configure the number of parallel GC threads 
appropriately"
               " using -XX:ParallelGCThreads=N");
     }
   }

So the warning will only be issued when AssumeMP turned on and the ncpus 
=1. If user add cmdline -XX:ParallelGCThreads=1 which is not the default 
value 0, so the warning will not be triggered. I will do a test.

Is this better?

Thanks
Yumin

On 3/26/2013 9:54 PM, David Holmes wrote:
> On 27/03/2013 2:06 PM, Yumin Qi wrote:
>> David,
>>
>> On 3/26/2013 7:49 PM, David Holmes wrote:
>>> Well now you have confused me. I thought checking
>>> FLAG_IS_DEFAULT(ParallelGCThreads) was the right thing to do (the user
>>> has not attempted manage the number of GC threads even though they
>>> recognize the number of processors may start at 1 and increase later).
>>> I do not understand the new <2 test - maybe I really only want 1, if I
>>> asked for that why should I get a warning ???
>>>
>> The previous code is issuing the warning before gc set flags (includes
>> ParallelGCThreads), we don't want with AssumeMP turned on, running on a
>> MP issues the warning since ParallelGCThreads will be not 1 but the
>> number of calculation based on ncpus.  So I moved the check code after
>> GC set flags.  After GC set flags, ParallelGCThreads is not the default
>> value (0) so I check if it is < 2.
>
> I was overlooking the case where you use +AssumeMP but have multiple 
> processors anyway. But if I explicitly set ParallelGCThreads=1 then I 
> don't expect to get a warning. Not sure we can distinguish these cases 
> though.
>
> Again I leave this to you and Jon to finalize.
>
> Thanks,
> David
>
>> If users really want to run with single CPU with -XX:+AssumeMP, giving a
>> warning is reasonable (this is our intention here) --- there is
>> possibility they will increase number of cpus even they don't, that 
>> is OK.
>>
>> Thanks
>> Yumin
>>> But, this is a GC warning (and I think it is misplaced anyway) so if
>>> Jon is happy with this behaviour so be it.
>>>
>>> Cheers,
>>> David
>>>
>>> On 27/03/2013 5:44 AM, Yumin Qi wrote:
>>>> Sorry, copied wrong link. It is at same link.
>>>>
>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/26/2013 12:42 PM, Daniel D. Daugherty wrote:
>>>>>
>>>>> On 3/26/13 1:17 PM, Yumin Qi wrote:
>>>>>> Hi, Jon
>>>>>>
>>>>>>   I made a change to the warning condition and move it after GC 
>>>>>> flags
>>>>>> are set. I think at this time, ParallelGCThreads has a value != 0.
>>>>>> Please take a look if it is right.  The This is the new webrev:
>>>>>>
>>>>>>   http://javaweb.us.oracle.com/~yqi/webrev/2178143/
>>>>>
>>>>> This webrev isn't accessible outside Oracle...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>>   Tested with platform of ncpus > 1 and no ParallelGCThreads set 
>>>>>> with
>>>>>> turning on AssumeMP.  The warning is gone.
>>>>>>    Also tested with ncpus = 1 and the warning is on too.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 3/26/2013 8:03 AM, Jon Masamitsu wrote:
>>>>>>> If I set AssumeMP on the command line and nothing else
>>>>>>> and am running on a platform where I get ParallelGCThreads > 1,
>>>>>>> am I going to see the warning?    If yes, that seems too intrusive.
>>>>>>> I can see users hearing about the flag and deciding to include
>>>>>>> it always just to be safe.
>>>>>>>
>>>>>>> Jon
>>>>>>>
>>>>>>> On 3/25/13 11:21 PM, Yumin Qi wrote:
>>>>>>>> You are right, thanks!  I misread the INCLUDE.
>>>>>>>> new web in the same link.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>> On 3/25/2013 11:00 PM, David Holmes wrote:
>>>>>>>>> On 26/03/2013 3:56 PM, Yumin Qi wrote:
>>>>>>>>>> David,
>>>>>>>>>>
>>>>>>>>>> On 3/25/2013 10:41 PM, David Holmes wrote:
>>>>>>>>>>> On 26/03/2013 3:30 PM, Yumin Qi wrote:
>>>>>>>>>>>> David and All,
>>>>>>>>>>>>
>>>>>>>>>>>> Changed as suggested by David.
>>>>>>>>>>>>
>>>>>>>>>>>> David, I did not move the warning into set_parallel_gc_flags()
>>>>>>>>>>>> since
>>>>>>>>>>>> except for serial gc, all other GCs will check this flag, so I
>>>>>>>>>>>> moved it
>>>>>>>>>>>> into
>>>>>>>>>>>> #if INCLUDE_ALL_GCS
>>>>>>>>>>>>
>>>>>>>>>>>> before call set GC flags.
>>>>>>>>>>>
>>>>>>>>>>> I didn't realize all the other GCs would utilize
>>>>>>>>>>> ParallelGCThreads.
>>>>>>>>>>> Can we at least exclude it for serial GC case?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Serial GC is set here:
>>>>>>>>>
>>>>>>>>> INCLUDE_ALL_GCS allows all GCs including Serial GC. Otherwise we
>>>>>>>>> only have SerialGC. So you have excluded it for the case where
>>>>>>>>> SerialGC is the only GC built into the VM, but not for the case
>>>>>>>>> where all GCs are available and the user requests UseSerialGC.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 3253 #if !INCLUDE_ALL_GCS
>>>>>>>>>> 3254   force_serial_gc();
>>>>>>>>>> 3255 #endif // INCLUDE_ALL_GCS
>>>>>>>>>>
>>>>>>>>>> So after it moved into
>>>>>>>>>> 3283 #if INCLUDE_ALL_GCS
>>>>>>>>>> 3284   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>>> 3285     warning("If the number of processors is expected to
>>>>>>>>>> increase
>>>>>>>>>> from one, then"
>>>>>>>>>> 3286             " you should configure the number of 
>>>>>>>>>> parallel GC
>>>>>>>>>> threads appropriately"
>>>>>>>>>> 3287             " using -XX:ParallelGCThreads=N");
>>>>>>>>>> 3288   }
>>>>>>>>>> 3289   // Set per-collector flags
>>>>>>>>>> 3290   if (UseParallelGC || UseParallelOldGC) {
>>>>>>>>>> 3291     set_parallel_gc_flags();
>>>>>>>>>> 3292   } else if (UseConcMarkSweepGC) { // should be done before
>>>>>>>>>> ParNew
>>>>>>>>>> check below
>>>>>>>>>> 3293     set_cms_and_parnew_gc_flags();
>>>>>>>>>> 3294   } else if (UseParNewGC) {  // skipped if CMS is set above
>>>>>>>>>> 3295     set_parnew_gc_flags();
>>>>>>>>>> 3296   } else if (UseG1GC) {
>>>>>>>>>> 3297     set_g1_gc_flags();
>>>>>>>>>> 3298   }
>>>>>>>>>> 3299   check_deprecated_gcs();
>>>>>>>>>> 3300 #else // INCLUDE_ALL_GCS
>>>>>>>>>> 3301   assert(verify_serial_gc_flags(), "SerialGC unset");
>>>>>>>>>> 3302 #endif // INCLUDE_ALL_GCS
>>>>>>>>>>
>>>>>>>>>> It is excluded for serial GC.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>>
>>>>>>>>>>> Otherwise looks good.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> new webrev: http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Yumin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/25/2013 5:44 PM, David Holmes wrote:
>>>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a few suggested changes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> First in os.hpp I suggest
>>>>>>>>>>>>>
>>>>>>>>>>>>>  return _processor_count > 1 || AssumeMP ;
>>>>>>>>>>>>>
>>>>>>>>>>>>> That way we don't bypass the assertion and don't encounter
>>>>>>>>>>>>> unnecessary
>>>>>>>>>>>>> overhead on the common path.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In arguments.cpp:
>>>>>>>>>>>>>
>>>>>>>>>>>>> +   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>>>>>> +     warning("With AssumeMP, recommend run with
>>>>>>>>>>>>> -XX:ParallelGCThreads=<num_of_gc_threads>, where"
>>>>>>>>>>>>> +             "  num_of_gc_threads can be set to number of
>>>>>>>>>>>>> cores");
>>>>>>>>>>>>> +   }
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not convinced issuing a warning is necessary or desirable
>>>>>>>>>>>>> here.
>>>>>>>>>>>>> With a uniprocessor startup will we even default to using
>>>>>>>>>>>>> parallel GC?
>>>>>>>>>>>>> The above should only be issued if using parallel GC - so
>>>>>>>>>>>>> better to
>>>>>>>>>>>>> move it into set_parallel_gc_flags().
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case it isn't clear what the user should supply here.
>>>>>>>>>>>>> If they
>>>>>>>>>>>>> use the maximum expected number of cores initially then while
>>>>>>>>>>>>> they
>>>>>>>>>>>>> only have 1 core their VM may not even function adequately 
>>>>>>>>>>>>> due
>>>>>>>>>>>>> to GC
>>>>>>>>>>>>> overhead. I think all you can say here is something like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> warning("If the number of processors is expected to increase
>>>>>>>>>>>>> from one,
>>>>>>>>>>>>> then you should configure the number of parallel GC threads
>>>>>>>>>>>>> appropriately using -XX:ParallelGCThreads=N");
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In globals.hpp:
>>>>>>>>>>>>>
>>>>>>>>>>>>> +   product(bool, AssumeMP, false,       \
>>>>>>>>>>>>> +           "Assume run on multiple processors always") \
>>>>>>>>>>>>>
>>>>>>>>>>>>> Was there any discussion on making this default to true
>>>>>>>>>>>>> instead?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested re-wording:
>>>>>>>>>>>>>
>>>>>>>>>>>>>  "Instruct the VM to assume multiple processors are 
>>>>>>>>>>>>> available"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 26/03/2013 9:15 AM, Yumin Qi wrote:
>>>>>>>>>>>>>> It should be "AssumeMP".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /Yumin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/25/2013 3:32 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   New webrev to use "-XX:+AssumMP" flag. Also add 
>>>>>>>>>>>>>>> warning to
>>>>>>>>>>>>>>> recommend
>>>>>>>>>>>>>>> use this flag with "-XX:ParallelGCThreads" to remind 
>>>>>>>>>>>>>>> user to
>>>>>>>>>>>>>>> avoid
>>>>>>>>>>>>>>> running with only one GC thread. This is verified by
>>>>>>>>>>>>>>> Oleksandr with
>>>>>>>>>>>>>>> the test case running on Linux which is not Zone 
>>>>>>>>>>>>>>> configured.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   Same link.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/20/2013 2:27 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2178143:  VM crashes if the number of bound CPUs changed
>>>>>>>>>>>>>>>> during
>>>>>>>>>>>>>>>> runtime.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Situation: Customer first configure only one CPU online 
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> turn
>>>>>>>>>>>>>>>> others offline to run java application, after java program
>>>>>>>>>>>>>>>> started,
>>>>>>>>>>>>>>>> bring more CPUs back online. Since VM started on a single
>>>>>>>>>>>>>>>> CPU,
>>>>>>>>>>>>>>>> os::is_MP() will return false, but after more CPUs
>>>>>>>>>>>>>>>> available, OS
>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>> schedule the app run on multiple CPUs, this caused SEGV in
>>>>>>>>>>>>>>>> various
>>>>>>>>>>>>>>>> places where data consistency was broken. The solution is
>>>>>>>>>>>>>>>> supply a
>>>>>>>>>>>>>>>> flag to assume it is running on MP, so lock is forced 
>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>> called.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>




More information about the hotspot-gc-dev mailing list