RFR: 8058935: CPU detection gives 0 cores per cpu, 2 threads per core in Amazon EC2 environment

Vladimir Kempik vladimir.kempik at oracle.com
Fri Nov 21 15:31:20 UTC 2014


Hello

Thanks for looking into this.

It isn't possible to collect the needed data at the moment because the 
bug is no longer reproducible. The cpuid dump I collected from an EC2 
virtual machine indicates that supports_processor_topology() should now 
return false:

static bool supports_processor_topology() {
   return (_cpuid_info.std_max_function >= 0xB) &&
   // eax[4:0] | ebx[0:15] == 0 indicates invalid topology level.
   // Some cpus have max cpuid >= 0xB but do not support processor topology.
   (((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);
}


which comes from this being false:

(((_cpuid_info.tpl_cpuidB0_eax & 0x1f) | _cpuid_info.tpl_cpuidB0_ebx.bits.logical_cpus) != 0);

The check I've added is a sanity check to prevent the same crashes in 
the future.
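
As a rough illustration of that kind of sanity check, here is a minimal 
standalone sketch. It is not the webrev patch: the CpuidInfo struct, its 
field names and the fallback_cores value are hypothetical stand-ins for 
HotSpot's _cpuid_info, and dividing the level-1 count by the level-0 
count is an assumption about how the topology path computes its result.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the cpuid data kept in _cpuid_info.
struct CpuidInfo {
  uint32_t std_max_function;  // highest supported standard cpuid leaf
  uint32_t b0_eax;            // leaf 0xB, level 0, eax
  uint32_t b0_logical_cpus;   // leaf 0xB, level 0, ebx[15:0]
  uint32_t b1_logical_cpus;   // leaf 0xB, level 1, ebx[15:0]
  uint32_t fallback_cores;    // non-topology core count (assumed >= 1)
};

// Same predicate as quoted above, expressed on the simplified struct.
bool supports_processor_topology(const CpuidInfo& c) {
  return c.std_max_function >= 0xB &&
         ((c.b0_eax & 0x1f) | c.b0_logical_cpus) != 0;
}

uint32_t cores_per_cpu(const CpuidInfo& c) {
  uint32_t result = 0;
  // Assumed topology path; the divisor is guarded to avoid dividing by zero.
  if (supports_processor_topology(c) && c.b0_logical_cpus != 0) {
    result = c.b1_logical_cpus / c.b0_logical_cpus;
  }
  // Sanity check: a zero result means the topology data is unusable,
  // so fall back to the non-topology value.
  if (result == 0) {
    result = c.fallback_cores;
  }
  return result;
}
```

With this shape, a machine that reports leaf 0xB but returns zero for 
the level-1 logical cpu count gets the fallback value instead of 0 cores 
per cpu.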

Thanks. Vladimir


On 17.11.2014 22:47, Vladimir Kozlov wrote:
> According to the following document, the cpu has 10 cores (and 2 
> threads per core):
>
> http://ark.intel.com/products/75275/Intel-Xeon-Processor-E5-2670-v2-25M-Cache-2_50-GHz 
>
>
> The hs_err in the bug report reports only 2 processors, and the 
> following lines are missing:
>
> physical id    : 0
> siblings    : 4
> core id        : 0
> cpu cores    : 4
> apicid        : 0
> initial apicid    : 0
>
> I assume it is some kind of virtual environment in which cpuid 
> topology does not work (at least our code does not work with it).
> We may be missing some checks which indicate that topology is not 
> supported.
> It would be nice if you could put all topology and related cpuid bits 
> from Amazon EC2 in the bug report.
> Checking for 0 could be fine, but a non-zero value could still be 
> wrong if topology info is not supported.
>
> Thanks,
> Vladimir
>
> On 11/17/14 8:20 AM, Vladimir Kempik wrote:
>> Hi,
>>
>> Please review this patch, which adds a sanity check to cores_per_cpu():
>>
>> http://cr.openjdk.java.net/~vkempik/8058935/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>>
>> A few months ago we got reports of java crashing in the Amazon EC2
>> environment (they use Xen).
>> https://bugs.openjdk.java.net/browse/JDK-8058935
>> https://bugs.openjdk.java.net/browse/JDK-8058937
>>
>> JVM args used to trigger the crash: -XX:+UnlockCommercialFeatures
>> -XX:+FlightRecorder
>>
>> After investigation I think the crash could only have happened if
>> supports_processor_topology() returned true and
>> _cpuid_info.tpl_cpuidB1_ebx.bits.logical_cpus was zero.
>>
>> I have not been able to reproduce the bug on Amazon EC2 recently.
>>
>> The patch adds a sanity check: if cpu topology was used and resulted
>> in 0 cores per cpu, then fall back to the non-topology variant, which
>> can't result in 0 cores per cpu.
>>
>> Testing: JPRT.
>>
>> Thanks,
>> Vladimir.



More information about the hotspot-runtime-dev mailing list