CPU-feature detection is broken for new Fujitsu Sparc64-X CPUs
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Nov 27 11:22:09 PST 2013
On 11/27/13 10:02 AM, Volker Simonis wrote:
> Hi Vladimir,
>
> thanks a lot for commenting on this issue in the bug report.
>
>
> - Regarding 'UseMemSetInBOT' I agree that we should know how memset() is
> implemented on Solaris/Fujitsu Sparc-X and/or how BIS is working on Fujitsu
> Sparc-X. Could you please find this out. I don't see how I could get this
> information:(
I will try :)
>
> But in order to be on the safe side, I think we should reset UseMemSetInBOT
> until then:
>
> if (is_niagara() || (is_sun4v() && is_sparc64())) {
> // When using CMS or G1, we cannot use memset() in BOT updates
> // because the sun4v/CMT version in libc_psr uses BIS which
> // exposes "phantom zeros" to concurrent readers. See 6948537.
> if (FLAG_IS_DEFAULT(UseMemSetInBOT) && (UseConcMarkSweepGC || UseG1GC))
> {
> FLAG_SET_DEFAULT(UseMemSetInBOT, false);
> }
>
> Currently, this is not done, that's why the second assertion fires. But you
> don't see it in a product build and this really bothers me most because I
> don't want to run into these problems on new Fujitsu machines. What do you
> think?
Totally agree. Why you need 'is_sun4v() &&' ? Can we have sparc64 which
are not sun4v?
> - Regarding 'is_niagara(int features)', would it be fine if I'd change:
>
> static bool is_niagara(int features) { return (features & sun4v_m) != 0; }
>
> to:
>
> static bool is_niagara(int features) { return (features & sun4v_m) != 0 &&
> (features & sparc64_family_m) == 0; }
I prefer this change. Add comment that sun4v_m is true on both.
>
> With this change, I could leave the assertion:
>
> assert((is_T_family(features) == is_niagara(features)), "Niagara should
> be T series");
>
> untouched and it would work on both, Oracle and Fujitsu CPUs.
>
> On the other hand, maybe it would be clearer to remove is_niagara(features)
> altogether and replace create two new function:
>
> static bool is_sparc64(int features) { return (features &
> sparc64_family_m) != 0; }
> static bool is_sun4v(int features) { return (features &
> sun4v_m) != 0; }
>
> Then we could change the assertion to:
>
> assert((is_T_family(features) == is_sun4v(features) ||
> is_sparc64(features)), "Niagara should be T series");
>
> What do you think?
>
>
>
> - Regarding UseBlockCopy and UseBlockZeroing: what do you mean by "'block
> init instructions' could be implemented differently on Fujitsu Sparc-X"?
> Currently we only check 'has_block_zeroing()' for both options, but
> 'has_block_zeroing()' only returns true on T4. On the other hand, Fujitsu
> Sparc64-X supports the 'blk_init' extension - it's just not clear if it
> works like for T1/T2 or like for T4. So depending on this answer we should
> change 'has_block_zeroing()' to also work correctly on Sparc64-X, right?
BIS was since T1. But T4 implementation of BIS instructions GUARANTEE
zeroing of cacheline before storing value (STXA) without fetching data
from memory. It is used only for big arrays (BlockZeroingLowLimit)
because it still need StoreLoad membar (which is expensive). See
MacroAssembler::bis_zeroing(). It used for initializing new arrays
before they are visible to other threads since the state of memory
before membar is undefined.
And I don't know how particular STXA ASI_ST_BLKINIT_PRIMARY works on
Sparc64-X.
Regards,
Vladimir
>
>
>
> On Tue, Nov 26, 2013 at 6:35 PM, Volker Simonis <volker.simonis at gmail.com>wrote:
>
>> Hi,
>>
>> I've just opened "JDK-8029190: VM_Version::determine_features()
>> asserts on Fujitsu Sparc64 CPUs"
>> (https://bugs.openjdk.java.net/browse/JDK-8029190) and would like to
>> get the input of some real SPARC experts on how this could be best
>> fixed:
>>
>> The whole CPU detection for SPARC CPUs seems to be targeted solely to
>> Sun/Oracle SPARC CPUs.
>>
>> If running on a recent Fujitsu Sparc64-X CPU, debug builds immediately
>> crash with:
>>
>> assert(is_T_family(features) == is_niagara(features)) failed: Niagara
>> should be T series
>>
>> This test can not be successful on a Fujitsu Sparc64 CPU because
>> is_niagara(features) tests for the sysinfo string 'SI_MACHINE' which
>> is 'sun4v' on both, Oracle and Fujitsu CPUs while
>> is_T_family(features) checks that the kernel statistics module
>> 'cpu_info.implementation' contains the name "SPARC-T" or "SPARC-M".
>> However, on Fujitsu SPARC64-X CPUs, 'cpu_info.implementation' contains
>> the value "SPARC64-X".
>>
>> This problem can easily be worked around by changing the above assertion
>> to:
>>
>> assert((is_T_family(features) == is_niagara(features)) || (features &
>> sparc64_family_m), "Niagara should be T series");
>>
>> But if we are using a CMS or G1, the VM will immediately crash with
>> the next assertion in arguments.cpp:
>>
>> if (VM_Version::is_sun4v() && UseMemSetInBOT) {
>> assert(!FLAG_IS_DEFAULT(UseMemSetInBOT), "Error");
>>
>> This is again because 'is_sun4v()' is used with the semantics of
>> 'is_niagara()' and in VM_Version::initialize() in vm_version_sparc.cpp
>> we reset 'UseMemSetInBOT' to false if 'is_niagara()' is true:
>>
>> if (is_niagara()) {
>> // When using CMS or G1, we cannot use memset() in BOT updates
>> // because the sun4v/CMT version in libc_psr uses BIS which
>> // exposes "phantom zeros" to concurrent readers. See 6948537.
>> if (FLAG_IS_DEFAULT(UseMemSetInBOT) && (UseConcMarkSweepGC ||
>> UseG1GC)) {
>> FLAG_SET_DEFAULT(UseMemSetInBOT, false);
>> }
>>
>> We could actually fix this in the same way like before, by replacing
>> 'is_sun4v()' with 'is_niagara()' in the assertion, but this would
>> require to make at least the following functions public in
>> vm_version_sparc.hpp:
>>
>> static bool is_T_family(int features) { return (features & T_family_m)
>> != 0; }
>> static bool is_niagara() { return is_T_family(_features); }
>>
>> However the bigger problem is that I have no idea of what the
>> semantics of these attributes (i.e. is_niagara, is_T_family,
>> is_M_family) is for Fujitsu Sparc-X CPUs? E.g. do we need to switch of
>> 'UseMemSetInBOT' as well?
>>
>> There's are also other CPU-features like 'UseBlockCopy' and
>> 'UseBlockZeroing' which are actually detected by looking at the
>> instruction set extensions AND the CPU family. E.g. the above options
>> are only enabled if the CPU supports 'block init instructions' (which
>> is true on both Fujitsu Sparc64-X and new Oracle Sparc-T4 CPUs) AND
>> the CPU family is Sparc-T4 (which is of course only true for Oracle
>> SParc CPUs).
>>
>> Altogether, I think we need a real SPARC expert which knows both, the
>> Oracle and the Fujitsu Sparc CPUs and who can tell us which of the
>> Oracle Sparc features currently used in the HotSpot are available in
>> Fujitsu CPUs as well. Moreover we should unify the various 'has_xxx()'
>> and 'is_xxx()' functions from vm_version_sparc.hpp to work equally
>> well for both CPU types.
>>
>> Thank you and best regards,
>> Volker
>>
>> PS:
>>
>> - the Fujitsu Sparc64-X I have access to returns the following
>> instruction set extensions vector and CPU features:
>>
>> getisax(2) returned: 0x400081f0
>> cpu_info.implementation: SPARC64-X (chipid 0, clock 2800 MHz)
>> Allocation prefetching: PREFETCH at distance 512, 3 lines of 64 bytes
>> PrefetchCopyIntervalInBytes 512
>> PrefetchScanIntervalInBytes 512
>> ContendedPaddingWidth 128
>> CPU:total 16 v9, popc, vis1, vis2, blk_init, sun4v, sparc64
>>
>> - for the Sparc-T4 I've acess to this information looks as follows:
>>
>> getisax(2) returned: 0x3ffe8df0
>> cpu_info.implementation: SPARC-T4 (chipid 0, clock 2848 MHz)
>> Allocation prefetching: BIS at distance 64, 6 lines of 32 bytes
>> PrefetchCopyIntervalInBytes 512
>> PrefetchScanIntervalInBytes 512
>> ContendedPaddingWidth 128
>> CPU:total 64 v9, popc, vis1, vis2, vis3, blk_init, cbcond, sun4v,
>> niagara_plus
>>
>> As you probably know, the full meaning of the instruction set
>> extensions vector can be derived by inspecting
>> /usr/includ/sys/auxv_SPARC.h
>>
More information about the hotspot-dev
mailing list