RFR (S/M) expose L1_data_cache_line_size for diagnostic/sanity checks (8049717)

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Jul 14 15:30:35 UTC 2014


David,

Thanks for the review! Replies embedded below...


On 7/14/14 12:18 AM, David Holmes wrote:
> Hi Dan,
>
> Looks okay generally, a couple of comments:
>
> - when you print the uint values use %u instead of %d.

Will fix.


> - I noticed this comment in relation to the GlobalVars is out of date:
>
> // Performance concern:
> // OrderAccess::storestore() calls release() which STs 0 into the 
> global volatile
> // OrderAccess::Dummy variable.
>
> We got rid of the global dummy quite some time ago now.

Will delete the stale comment.

History:

The OrderAccess::dummy field was created in the original version
of the src/share/vm/runtime/orderAccess.cpp file:

src/share/vm/runtime/SCCS/s.orderAccess.cpp:

D 1.1 03/07/23 19:01:39 phh 1 0 00012/00000/00000
MRs:
COMMENTS:
date and time created 03/07/23 19:01:39 by phh


The field was removed using this changeset:

changeset:   1631:a6bff45449bc
user:        ysr
date:        Tue Aug 10 14:53:35 2010 -0700
summary:     6973570: OrderAccess::storestore() scales poorly on 
multi-socket x6
4 and sparc: cache-line ping-ponging


$ hg diff -r 1630 -r 1631 src/share/vm/runtime/orderAccess.hpp
diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.hpp
--- a/src/share/vm/runtime/orderAccess.hpp      Mon Aug 09 18:03:50 2010 
-0700
+++ b/src/share/vm/runtime/orderAccess.hpp      Tue Aug 10 14:53:35 2010 
-0700
@@ -1,5 +1,5 @@
  /*
- * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights 
reserved.
+ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights 
reserved.
   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
   *
   * This code is free software; you can redistribute it and/or modify it
@@ -166,6 +166,12 @@
  // and release must include a sequence point, usually via a volatile 
memory
  // access.  Other ways to guarantee a sequence point are, e.g., use of
  // indirect calls and linux's __asm__ volatile.
+// Note: as of 6973570, we have replaced the originally static "dummy" 
field
+// (see above) by a volatile store to the stack. All of the versions of the
+// compilers that we currently use (SunStudio, gcc and VC++) respect the
+// semantics of volatile here. If you build HotSpot using other
+// compilers, you may need to verify that no compiler reordering occurs
+// across the sequence point respresented by the volatile access.
  //
  //
  //                os::is_MP Considered Redundant
@@ -297,10 +303,6 @@ class OrderAccess : AllStatic {
    static void     release_store_ptr_fence(volatile intptr_t* p, 
intptr_t v);
    static void     release_store_ptr_fence(volatile void*     p, 
void*    v);

-  // In order to force a memory access, implementations may
-  // need a volatile externally visible dummy variable.
-  static volatile intptr_t dummy;
-
   private:
    // This is a helper that invokes the StubRoutines::fence_entry()
    // routine if it exists, It should only be used by platforms that

680 2014.07.14 09:25:57 $ hg diff -r 1630 -r 1631 
src/share/vm/runtime/orderAccess.cpp
diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.cpp
--- a/src/share/vm/runtime/orderAccess.cpp      Mon Aug 09 18:03:50 2010 
-0700
+++ b/src/share/vm/runtime/orderAccess.cpp      Tue Aug 10 14:53:35 2010 
-0700
@@ -1,5 +1,5 @@
  /*
- * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights 
reserved.
+ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights 
reserved.
   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
   *
   * This code is free software; you can redistribute it and/or modify it
@@ -25,8 +25,6 @@
  # include "incls/_precompiled.incl"
  # include "incls/_orderAccess.cpp.incl"

-volatile intptr_t OrderAccess::dummy = 0;
-
  void OrderAccess::StubRoutines_fence() {
    // Use a stub if it exists.  It may not exist during bootstrap so do
    // nothing in that case but assert if no fence code exists after 
threads have been created



>
> - In objectMonitor.hpp we have:
>
> volatile markOop   _header;     // displaced object header word - mark
> void*     volatile _object;     // backward object pointer - strong root
> double SharingPad[1];           // temp to reduce false sharing
> void *  volatile _owner;        // pointer to owning thread OR BasicLock
>
> and we check:
>
> +     if ((offset_owner - offset_header) < cache_line_size) {
>
> but on non-486 x86 the cache line size is a minimum of 32 bytes, and 
> AFAICS _owner and _header are only spaced 16 bytes apart ??? Unless 
> the array has some additional alignment constraints that insert 
> additional padding?

The correct padding is coming in the first optimization bucket
for the Contended Locking project. That optimization bucket
contains the field reorder and cache line changes.

Dan


>
> Thanks,
> David
>
> On 12/07/2014 6:14 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've tweaked the fix based on Vladimir's feedback.
>>
>> Here is the URL for the second round webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8049717-webrev/1-jdk9-hs-rt/
>>
>> The following files changed between round 0 and round 1:
>>
>> src/cpu/sparc/vm/vm_version_sparc.cpp
>> src/cpu/x86/vm/vm_version_x86.cpp
>> src/cpu/x86/vm/vm_version_x86.hpp
>> src/share/vm/runtime/objectMonitor.cpp
>> src/share/vm/runtime/synchronizer.cpp
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 7/9/14 10:42 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have the fix for the following bug ready for JDK9 RT_Baseline:
>>>
>>>     JDK-8049717 expose L1_data_cache_line_size for diagnostic/sanity
>>>                 checks
>>>     https://bugs.openjdk.java.net/browse/JDK-8049717
>>>
>>> Here is the URL for the webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8049717-webrev/0-jdk9-hs-rt/
>>>
>>> This fix is a standalone piece from my Contended Locking reorder
>>> and cache-line bucket. I've split it off as an independent bug fix
>>> in order to make the reorder and cache-line bucket more clear.
>>>
>>> Testing:
>>>
>>> - JPRT test jobs
>>> - manual testing of the new output via existing options:
>>>   -XX:+UnlockExperimentalVMOptions -XX:SyncKnobs=Verbose=1
>>>   -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests
>>> - Aurora Adhoc nsk.sajdi and vm.parallel_class_loading as part of
>>>   testing for my Contended Locking reorder and cache-line bucket
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>>
>>>
>>



More information about the hotspot-runtime-dev mailing list