RFR (S): 8161993: 8161993: G1 crashes if active_processor_count changes during startup

Thomas Schatzl thomas.schatzl at oracle.com
Thu Jul 21 13:05:05 UTC 2016


Hi David,

  thanks for your review.

On Thu, 2016-07-21 at 22:31 +1000, David Holmes wrote:
> Hi Thomas,
> 
> On 21/07/2016 9:28 PM, Thomas Schatzl wrote:
> > 
> > Hi all,
> > 
> >   can I have reviews for this small change that fixes crashes with
> > G1 when changing the number of available processors while running?
> > 
> > The problem is that some data structures are allocated using the
> > number
> > of available processors at startup, and if you later change that
> > number
> > (i.e. increase it), g1 uses indices to access these data structures
> > beyond their capacity.
> > 
> > The problem is in DirtyCardQueueSet::num_par_ids() that re-queries
> > the number of active processors, and then is used e.g. in
> > G1FromCardCache to index an array.
> To be clear num_par_ids() queries the number of configured processors
> which is constant across the life of the VM. But the number of
> parallel GC threads is determined by the number of active processors
> available to the VM - that value may be less than the number of
> configured processors. (The fact it can change over time doesn't seem
> to be a factor here - only that it may be smaller.)
> 
> > 
> > 
> > The proposed solution is to introduce an
> > initial_active_processor_count() in the runtime that stays fixed
> > after initialization. This corresponds to the suggestion in JDK-
> > 8147910, but only fixing the immediate crash here.
> I'm not really seeing what else JDK-8147910 would want to add other
> than reporting the initial count in the error reporting, so I'd
> rather see this split into two parts - and have both issues closed
> out. All that would involve is committing the os.hpp and os.cpp and
> any change to error reporting under 8147910, and the rest under
> 8161993. A single review thread and combined push is still fine.

Okay, can do the split of this CR into two.

I briefly looked through the hs_err log file, but could not find any
particular section where this information would fit well. Any
suggestions?

> That said I would not hide this inside os::set_processor_count the
> way you have. I can see you have done it to avoid changing all the 
> os_xxx.cpp files, but the name no longer really fits as it is not a 
> simple setter operation any more, and it is odd to pass in one value
> but then internally pull the other one. Albeit more work I would have
> added set_initial_active_processor_count(int initial) and 
> initial_active_processor_count(), then have the OS initialization 
> methods call both set_processor_count and 
> set_initial_active_processor_count. Or it may be there is somewhere
> in the shared os initialization code we could add that, once, before
> it needs to be used for the worker thread calculation. I'd have to
> look at that in more detail tomorrow.

The call could be put in the os::init_before_ergo() (or
VMVersion::init_before_ergo()) call.

I would prefer to avoid code duplication.

> Also if we log the initial_active_count we should log the 
> _processor_count too.

I can log both in log messages and the error log - although processor
count information is already in the log.

Thanks,
  Thomas




More information about the hotspot-gc-dev mailing list