RFR (S/M) expose L1_data_cache_line_size for diagnostic/sanity checks (8049717)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Jul 14 15:30:35 UTC 2014
David,
Thanks for the review! Replies embedded below...
On 7/14/14 12:18 AM, David Holmes wrote:
> Hi Dan,
>
> Looks okay generally, a couple of comments:
>
> - when you print the uint values use %u instead of %d.
Will fix.
> - I noticed this comment in relation to the GlobalVars is out of date:
>
> // Performance concern:
> // OrderAccess::storestore() calls release() which STs 0 into the
> global volatile
> // OrderAccess::Dummy variable.
>
> We got rid of the global dummy quite some time ago now.
Will delete the stale comment.
History:
The OrderAccess::dummy field was created in the original version
of the src/share/vm/runtime/orderAccess.cpp file:
src/share/vm/runtime/SCCS/s.orderAccess.cpp:
D 1.1 03/07/23 19:01:39 phh 1 0 00012/00000/00000
MRs:
COMMENTS:
date and time created 03/07/23 19:01:39 by phh
The field was removed using this changeset:
changeset: 1631:a6bff45449bc
user: ysr
date: Tue Aug 10 14:53:35 2010 -0700
summary: 6973570: OrderAccess::storestore() scales poorly on
multi-socket x6
4 and sparc: cache-line ping-ponging
$ hg diff -r 1630 -r 1631 src/share/vm/runtime/orderAccess.hpp
diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.hpp
--- a/src/share/vm/runtime/orderAccess.hpp Mon Aug 09 18:03:50 2010
-0700
+++ b/src/share/vm/runtime/orderAccess.hpp Tue Aug 10 14:53:35 2010
-0700
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights
reserved.
+ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights
reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -166,6 +166,12 @@
// and release must include a sequence point, usually via a volatile
memory
// access. Other ways to guarantee a sequence point are, e.g., use of
// indirect calls and linux's __asm__ volatile.
+// Note: as of 6973570, we have replaced the originally static "dummy"
field
+// (see above) by a volatile store to the stack. All of the versions of the
+// compilers that we currently use (SunStudio, gcc and VC++) respect the
+// semantics of volatile here. If you build HotSpot using other
+// compilers, you may need to verify that no compiler reordering occurs
+// across the sequence point respresented by the volatile access.
//
//
// os::is_MP Considered Redundant
@@ -297,10 +303,6 @@ class OrderAccess : AllStatic {
static void release_store_ptr_fence(volatile intptr_t* p,
intptr_t v);
static void release_store_ptr_fence(volatile void* p,
void* v);
- // In order to force a memory access, implementations may
- // need a volatile externally visible dummy variable.
- static volatile intptr_t dummy;
-
private:
// This is a helper that invokes the StubRoutines::fence_entry()
// routine if it exists, It should only be used by platforms that
680 2014.07.14 09:25:57 $ hg diff -r 1630 -r 1631
src/share/vm/runtime/orderAccess.cpp
diff -r 94251661de76 -r a6bff45449bc src/share/vm/runtime/orderAccess.cpp
--- a/src/share/vm/runtime/orderAccess.cpp Mon Aug 09 18:03:50 2010
-0700
+++ b/src/share/vm/runtime/orderAccess.cpp Tue Aug 10 14:53:35 2010
-0700
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2003, 2009, Oracle and/or its affiliates. All rights
reserved.
+ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights
reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -25,8 +25,6 @@
# include "incls/_precompiled.incl"
# include "incls/_orderAccess.cpp.incl"
-volatile intptr_t OrderAccess::dummy = 0;
-
void OrderAccess::StubRoutines_fence() {
// Use a stub if it exists. It may not exist during bootstrap so do
// nothing in that case but assert if no fence code exists after
threads have been created
>
> - In objectMonitor.hpp we have:
>
> volatile markOop _header; // displaced object header word - mark
> void* volatile _object; // backward object pointer - strong root
> double SharingPad[1]; // temp to reduce false sharing
> void * volatile _owner; // pointer to owning thread OR BasicLock
>
> and we check:
>
> + if ((offset_owner - offset_header) < cache_line_size) {
>
> but on non-486 x86 the cache line size is a minimum of 32 bytes, and
> AFAICS _owner and _header are only spaced 16 bytes apart ??? Unless
> the array has some additional alignment constraints that insert
> additional padding?
The correct padding is coming in the first optimization bucket
for the Contended Locking project. That optimization bucket
contains the field reorder and cache line changes.
Dan
>
> Thanks,
> David
>
> On 12/07/2014 6:14 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've tweaked the fix based on Vladimir's feedback.
>>
>> Here is the URL for the second round webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8049717-webrev/1-jdk9-hs-rt/
>>
>> The following files changed between round 0 and round 1:
>>
>> src/cpu/sparc/vm/vm_version_sparc.cpp
>> src/cpu/x86/vm/vm_version_x86.cpp
>> src/cpu/x86/vm/vm_version_x86.hpp
>> src/share/vm/runtime/objectMonitor.cpp
>> src/share/vm/runtime/synchronizer.cpp
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 7/9/14 10:42 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have the fix for the following bug ready for JDK9 RT_Baseline:
>>>
>>> JDK-8049717 expose L1_data_cache_line_size for diagnostic/sanity
>>> checks
>>> https://bugs.openjdk.java.net/browse/JDK-8049717
>>>
>>> Here is the URL for the webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8049717-webrev/0-jdk9-hs-rt/
>>>
>>> This fix is a standalone piece from my Contended Locking reorder
>>> and cache-line bucket. I've split it off as an independent bug fix
>>> in order to make the reorder and cache-line bucket more clear.
>>>
>>> Testing:
>>>
>>> - JPRT test jobs
>>> - manual testing of the new output via existing options:
>>> -XX:+UnlockExperimentalVMOptions -XX:SyncKnobs=Verbose=1
>>> -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests
>>> - Aurora Adhoc nsk.sajdi and vm.parallel_class_loading as part of
>>> testing for my Contended Locking reorder and cache-line bucket
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>>
>>>
>>
More information about the hotspot-runtime-dev
mailing list