RFR (S/M) expose L1_data_cache_line_size for diagnostic/sanity checks (8049717)

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Jul 14 17:09:51 UTC 2014


On 7/14/14 9:30 AM, Daniel D. Daugherty wrote:
> On 7/14/14 12:18 AM, David Holmes wrote:
>
>> - I noticed this comment in relation to the GlobalVars is out of date:
>>
>> // Performance concern:
>> // OrderAccess::storestore() calls release() which STs 0 into the 
>> global volatile
>> // OrderAccess::Dummy variable.
>>
>> We got rid of the global dummy quite some time ago now.
>
> Will delete the stale comment.

I think we need to do more than delete those three lines. The remaining
comment doesn't read well after just that change

Here's the current comment block:

src/share/vm/runtime/synchronizer.cpp:

394 // Performance concern:
395 // OrderAccess::storestore() calls release() which STs 0 into the 
global volatile
396 // OrderAccess::Dummy variable.  This store is unnecessary for 
correctness.
397 // Many threads STing into a common location causes considerable 
cache migration
398 // or "sloshing" on large SMP system.  As such, I avoid using 
OrderAccess::storestore()
399 // until it's repaired.  In some cases OrderAccess::fence() -- which 
incurs local
400 // latency on the executing processor -- is a better choice as it 
scales on SMP
401 // systems.  See 
http://blogs.sun.com/dave/entry/biased_locking_in_hotspot for a
402 // discussion of coherency costs.  Note that all our current 
reference platforms
403 // provide strong ST-ST order, so the issue is moot on IA32, x64, 
and SPARC.

I think it should be rewritten as:

// Performance concern:
// Many threads STing into a common location can cause considerable cache
// migration or "sloshing" on large SMP system. If a common location cannot
// be avoided, OrderAccess::fence() should be considered -- it incurs local
// latency on the executing processor and is a better choice as it scales
// on SMP systems.
//
// See http://blogs.oracle.com/dave/entry/biased_locking_in_hotspot for
// a discussion of coherency costs. Note that all our current reference
// platforms provide strong ST-ST order, so the issue is moot on IA32,
// x64, and SPARC.


What do you think?

Dan


>
> History:
>
> The OrderAccess::dummy field was created in the original version
> of the src/share/vm/runtime/orderAccess.cpp file:
>
> src/share/vm/runtime/SCCS/s.orderAccess.cpp:
>
> D 1.1 03/07/23 19:01:39 phh 1 0 00012/00000/00000
> MRs:
> COMMENTS:
> date and time created 03/07/23 19:01:39 by phh
>
>
> The field was removed using this changeset:
>
> changeset:   1631:a6bff45449bc
> user:        ysr
> date:        Tue Aug 10 14:53:35 2010 -0700
> summary:     6973570: OrderAccess::storestore() scales poorly on 
> multi-socket x6
> 4 and sparc: cache-line ping-ponging
>
>
> $ hg diff -r 1630 -r 1631 src/share/vm/runtime/orderAccess.hpp
> diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.hpp
> --- a/src/share/vm/runtime/orderAccess.hpp      Mon Aug 09 18:03:50 
> 2010 -0700
> +++ b/src/share/vm/runtime/orderAccess.hpp      Tue Aug 10 14:53:35 
> 2010 -0700
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights 
> reserved.
> + * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights 
> reserved.
>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>   *
>   * This code is free software; you can redistribute it and/or modify it
> @@ -166,6 +166,12 @@
>  // and release must include a sequence point, usually via a volatile 
> memory
>  // access.  Other ways to guarantee a sequence point are, e.g., use of
>  // indirect calls and linux's __asm__ volatile.
> +// Note: as of 6973570, we have replaced the originally static 
> "dummy" field
> +// (see above) by a volatile store to the stack. All of the versions 
> of the
> +// compilers that we currently use (SunStudio, gcc and VC++) respect the
> +// semantics of volatile here. If you build HotSpot using other
> +// compilers, you may need to verify that no compiler reordering occurs
> +// across the sequence point respresented by the volatile access.
>  //
>  //
>  //                os::is_MP Considered Redundant
> @@ -297,10 +303,6 @@ class OrderAccess : AllStatic {
>    static void     release_store_ptr_fence(volatile intptr_t* p, 
> intptr_t v);
>    static void     release_store_ptr_fence(volatile void*     p, 
> void*    v);
>
> -  // In order to force a memory access, implementations may
> -  // need a volatile externally visible dummy variable.
> -  static volatile intptr_t dummy;
> -
>   private:
>    // This is a helper that invokes the StubRoutines::fence_entry()
>    // routine if it exists, It should only be used by platforms that
>
> 680 2014.07.14 09:25:57 $ hg diff -r 1630 -r 1631 
> src/share/vm/runtime/orderAccess.cpp
> diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.cpp
> --- a/src/share/vm/runtime/orderAccess.cpp      Mon Aug 09 18:03:50 
> 2010 -0700
> +++ b/src/share/vm/runtime/orderAccess.cpp      Tue Aug 10 14:53:35 
> 2010 -0700
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights 
> reserved.
> + * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights 
> reserved.
>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>   *
>   * This code is free software; you can redistribute it and/or modify it
> @@ -25,8 +25,6 @@
>  # include "incls/_precompiled.incl"
>  # include "incls/_orderAccess.cpp.incl"
>
> -volatile intptr_t OrderAccess::dummy = 0;
> -
>  void OrderAccess::StubRoutines_fence() {
>    // Use a stub if it exists.  It may not exist during bootstrap so do
>    // nothing in that case but assert if no fence code exists after 
> threads have been created



More information about the hotspot-runtime-dev mailing list