RFR (S): 8161993: 8161993: G1 crashes if active_processor_count changes during startup

David Holmes david.holmes at oracle.com
Thu Jul 21 12:31:29 UTC 2016


Hi Thomas,

On 21/07/2016 9:28 PM, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this small change that fixes crashes with G1
> when changing the number of available processors while running?
>
> The problem is that some data structures are allocated using the number
> of available processors at startup, and if you later change that number
> (i.e. increase it), g1 uses indices to access these data structures
> beyond their capacity.
>
> The problem is in DirtyCardQueueSet::num_par_ids() that re-queries the
> number of active processors, and then is used e.g. in G1FromCardCache
> to index an array.

To be clear num_par_ids() queries the number of configured processors 
which is constant across the life of the VM. But the number of parallel 
GC threads is determined by the number of active processors available to 
the VM - that value may be less than the number of configured 
processors. (The fact it can change over time doesn't seem to be a 
factor here - only that it may be smaller.)

>
> The proposed solution is to introduce an
> initial_active_processor_count() in the runtime that stays fixed after
> initialization. This corresponds to the suggestion in JDK-8147910, but
> only fixing the immediate crash here.

I'm not really seeing what else JDK-8147910 would want to add other than 
reporting the initial count in the error reporting, so I'd rather see 
this split into two parts - and have both issues closed out. All that 
would involve is committing the os.hpp and os.cpp and any change to 
error reporting under 8147910, and the rest under 8161993. A single 
review thread and combined push is still fine.

That said I would not hide this inside os::set_processor_count the way 
you have. I can see you have done it to avoid changing all the 
os_xxx.cpp files, but the name no longer really fits as it is not a 
simple setter operation any more, and it is odd to pass in one value but 
then internally pull the other one. Albeit more work I would have added 
set_initial_active_processor_count(int initial) and 
initial_active_processor_count(), then have the OS initialization 
methods call both set_processor_count and 
set_initial_active_processor_count. Or it may be there is somewhere in 
the shared os initialization code we could add that, once, before it 
needs to be used for the worker thread calculation. I'd have to look at 
that in more detail tomorrow.

Also if we log the initial_active_count we should log the 
_processor_count too.

Thanks,
David

> I would like to have reviewers from both the gc and runtime team for
> this change; I cc'ed both mailing lists.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8161993
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8161993/webrev/
> Testing:
> manual testing, jprt, running vm.gc right now
>
> Thanks,
>   Thomas
>


More information about the hotspot-runtime-dev mailing list