CPU-feature detection is broken for new Fujitsu Sparc64-X CPUs

Volker Simonis volker.simonis at gmail.com
Wed Nov 27 10:02:53 PST 2013


Hi Vladimir,

thanks a lot for commenting on this issue in the bug report.


- Regarding 'UseMemSetInBOT' I agree that we should know how memset() is
implemented on Solaris/Fujitsu Sparc-X and/or how BIS is working on Fujitsu
Sparc-X. Could you please find this out. I don't see how I could get this
information:(

But in order to be on the safe side, I think we should reset UseMemSetInBOT
until then:

  if (is_niagara() || (is_sun4v() && is_sparc64())) {
    // When using CMS or G1, we cannot use memset() in BOT updates
    // because the sun4v/CMT version in libc_psr uses BIS which
    // exposes "phantom zeros" to concurrent readers. See 6948537.
    if (FLAG_IS_DEFAULT(UseMemSetInBOT) && (UseConcMarkSweepGC || UseG1GC))
{
      FLAG_SET_DEFAULT(UseMemSetInBOT, false);
    }

Currently, this is not done, that's why the second assertion fires. But you
don't see it in a product build and this really bothers me most because I
don't want to run into these problems on new Fujitsu machines. What do you
think?


- Regarding 'is_niagara(int features)', would it be fine if I'd change:

static bool is_niagara(int features)  { return (features & sun4v_m) != 0; }

to:

static bool is_niagara(int features)  { return (features & sun4v_m) != 0 &&
(features & sparc64_family_m) == 0; }

With this change, I could leave the assertion:

  assert((is_T_family(features) == is_niagara(features)), "Niagara should
be T series");

untouched and it would work on both, Oracle and Fujitsu CPUs.

On the other hand, maybe it would be clearer to remove is_niagara(features)
altogether and replace create two new function:

  static bool is_sparc64(int features)              { return (features &
sparc64_family_m) != 0; }
  static bool is_sun4v(int features)                { return (features &
sun4v_m) != 0; }

Then we could change the assertion to:

  assert((is_T_family(features) == is_sun4v(features) ||
is_sparc64(features)), "Niagara should be T series");

What do you think?



- Regarding UseBlockCopy and UseBlockZeroing: what do you mean by "'block
init instructions' could be implemented differently on Fujitsu Sparc-X"?
Currently we only check 'has_block_zeroing()' for both options, but
'has_block_zeroing()' only returns true on T4. On the other hand, Fujitsu
Sparc64-X supports the 'blk_init' extension - it's just not clear if it
works like for T1/T2 or like for T4. So depending on this answer we should
change 'has_block_zeroing()' to also work correctly on Sparc64-X, right?



On Tue, Nov 26, 2013 at 6:35 PM, Volker Simonis <volker.simonis at gmail.com>wrote:

> Hi,
>
> I've just opened "JDK-8029190: VM_Version::determine_features()
> asserts on Fujitsu Sparc64 CPUs"
> (https://bugs.openjdk.java.net/browse/JDK-8029190) and would like to
> get the input of some real SPARC experts on how this could be best
> fixed:
>
> The whole CPU detection for SPARC CPUs seems to be targeted solely to
> Sun/Oracle SPARC CPUs.
>
> If running on a recent Fujitsu Sparc64-X CPU, debug builds immediately
> crash with:
>
>  assert(is_T_family(features) == is_niagara(features)) failed: Niagara
> should be T series
>
> This test can not be successful on a Fujitsu Sparc64 CPU because
> is_niagara(features) tests for the sysinfo string 'SI_MACHINE' which
> is 'sun4v' on both, Oracle and Fujitsu CPUs while
> is_T_family(features) checks that the kernel statistics module
> 'cpu_info.implementation' contains the name "SPARC-T" or "SPARC-M".
> However, on Fujitsu SPARC64-X CPUs, 'cpu_info.implementation' contains
> the value "SPARC64-X".
>
> This problem can easily be worked around by changing the above assertion
> to:
>
> assert((is_T_family(features) == is_niagara(features)) || (features &
> sparc64_family_m), "Niagara should be T series");
>
> But if we are using a CMS or G1, the VM will immediately crash with
> the next assertion in arguments.cpp:
>
>     if (VM_Version::is_sun4v() && UseMemSetInBOT) {
>       assert(!FLAG_IS_DEFAULT(UseMemSetInBOT), "Error");
>
> This is again because 'is_sun4v()' is used with the semantics of
> 'is_niagara()' and in VM_Version::initialize() in vm_version_sparc.cpp
> we reset 'UseMemSetInBOT' to false if 'is_niagara()' is true:
>
>   if (is_niagara()) {
>     // When using CMS or G1, we cannot use memset() in BOT updates
>     // because the sun4v/CMT version in libc_psr uses BIS which
>     // exposes "phantom zeros" to concurrent readers. See 6948537.
>     if (FLAG_IS_DEFAULT(UseMemSetInBOT) && (UseConcMarkSweepGC ||
> UseG1GC)) {
>       FLAG_SET_DEFAULT(UseMemSetInBOT, false);
>     }
>
> We could actually fix this in the same way like before, by replacing
> 'is_sun4v()' with 'is_niagara()' in the assertion, but this would
> require to make at least the following functions public in
> vm_version_sparc.hpp:
>
>   static bool is_T_family(int features) { return (features & T_family_m)
> != 0; }
>   static bool is_niagara() { return is_T_family(_features); }
>
> However the bigger problem is that I have no idea of what the
> semantics of these attributes (i.e. is_niagara, is_T_family,
> is_M_family) is for Fujitsu Sparc-X CPUs? E.g. do we need to switch of
> 'UseMemSetInBOT' as well?
>
> There's are also other CPU-features like 'UseBlockCopy' and
> 'UseBlockZeroing' which are actually detected by looking at the
> instruction set extensions AND the CPU family. E.g. the above options
> are only enabled if the CPU supports 'block init instructions' (which
> is true on both Fujitsu Sparc64-X and new Oracle Sparc-T4 CPUs) AND
> the CPU family is Sparc-T4 (which is of course only true for Oracle
> SParc CPUs).
>
> Altogether, I think we need a real SPARC expert which knows both, the
> Oracle and the Fujitsu Sparc CPUs and who can tell us which of the
> Oracle Sparc features currently used in the HotSpot are available in
> Fujitsu CPUs as well. Moreover we should unify the various 'has_xxx()'
> and 'is_xxx()' functions from vm_version_sparc.hpp to work equally
> well for both CPU types.
>
> Thank you and best regards,
> Volker
>
> PS:
>
> - the Fujitsu Sparc64-X I have access to returns the following
> instruction set extensions vector and CPU features:
>
> getisax(2) returned: 0x400081f0
> cpu_info.implementation: SPARC64-X (chipid 0, clock 2800 MHz)
> Allocation prefetching: PREFETCH at distance 512, 3 lines of 64 bytes
> PrefetchCopyIntervalInBytes 512
> PrefetchScanIntervalInBytes 512
> ContendedPaddingWidth 128
> CPU:total 16 v9, popc, vis1, vis2, blk_init, sun4v, sparc64
>
> - for the Sparc-T4 I've acess to this information looks as follows:
>
> getisax(2) returned: 0x3ffe8df0
> cpu_info.implementation: SPARC-T4 (chipid 0, clock 2848 MHz)
> Allocation prefetching: BIS at distance 64, 6 lines of 32 bytes
> PrefetchCopyIntervalInBytes 512
> PrefetchScanIntervalInBytes 512
> ContendedPaddingWidth 128
> CPU:total 64 v9, popc, vis1, vis2, vis3, blk_init, cbcond, sun4v,
> niagara_plus
>
> As you probably know, the full meaning of the instruction set
> extensions vector can be derived by inspecting
> /usr/includ/sys/auxv_SPARC.h
>


More information about the hotspot-dev mailing list