RFR(M): 8033552: Fix missing missing volatile specifiers in CAS operations in GC code
Kim Barrett
kim.barrett at oracle.com
Thu Sep 8 23:22:35 UTC 2016
> On Apr 6, 2016, at 2:54 PM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>
> Hi,
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8033552
> CR: http://cr.openjdk.java.net/~eosterlund/8033552/webrev.00/
>
> Basically, there have been issues with Atomic::* operations being used on fields declared non-volatile. Especially in loop load phi cas phi-style code where the reload of a non-volatile field is not done because the compiler is allowed to assume it has not changed and keep the value in register.
>
> My proposed patch fixes this by declaring all fields used in GC code volatile. As David Holmes said - all such code should use volatile unless there are peculiar reasons for specific code not to for performance reasons.
>
> I also changed the cas phi load phi loops to not reload the field, for the simple reason that it is unnecessary - the CAS already performs the required load, so there is no need to do it again. So whether safe or not do do this with volatiles, a pointless load is a pointless load, so I removed them. (cf. g1ConcurrentMark.cpp and workgroup.cpp)
>
> Testing:
> * JPRT
> * Benchmarked with SPECjbb2005 and SPECjvm2008 to make sure there was no regression
>
> I will need a sponsor to push this.
Finally made my way through (most of) this. Sorry it's taken so long,
but it's a pretty challenging changeset to review.
------------------------------------------------------------------------------
Many of the changes here are to make the contents of card tables
volatile. I think those changes should be separated. I'm not certain
they should even be done, but not in their currently offered form, and
not mixed with a bunch of unrelated changes.
One issue with the changes for card tables is that they introduce a
lot of casts to strip off the volatile qualifier. Some of those could
be eliminated by fixing / extending the presently volatile-free APIs
being called that require those casts. (For example, something like
what was recently done with pointer_delta, adding volatile qualifiers
to its parameters.)
On the assumption these changes will be backed out, I've not reviewed
the following files:
src/cpu/aarch64/vm/aarch64.ad
src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
src/cpu/ppc/vm/c1_Runtime1_ppc.cpp
src/cpu/ppc/vm/macroAssembler_ppc.cpp
src/cpu/ppc/vm/macroAssembler_ppc.hpp
src/cpu/sparc/vm/c1_Runtime1_sparc.cpp
src/cpu/sparc/vm/macroAssembler_sparc.cpp
src/cpu/sparc/vm/macroAssembler_sparc.hpp
src/cpu/sparc/vm/stubGenerator_sparc.cpp
src/share/vm/c1/c1_LIRGenerator.cpp
src/share/vm/gc/cms/parCardTableModRefBS.cpp
src/share/vm/gc/g1/dirtyCardQueue.cpp
src/share/vm/gc/g1/dirtyCardQueue.hpp
src/share/vm/gc/g1/g1CardCounts.cpp
src/share/vm/gc/g1/g1CardCounts.hpp
src/share/vm/gc/g1/g1CollectedHeap.cpp
src/share/vm/gc/g1/g1EvacFailure.cpp
src/share/vm/gc/g1/g1HotCardCache.cpp
src/share/vm/gc/g1/g1HotCardCache.hpp
src/share/vm/gc/g1/g1ParScanThreadState.hpp
src/share/vm/gc/g1/g1RemSet.cpp
src/share/vm/gc/g1/g1RemSet.hpp
src/share/vm/gc/g1/g1SATBCardTableModRefBS.cpp
src/share/vm/gc/g1/heapRegion.cpp
src/share/vm/gc/g1/heapRegion.hpp
src/share/vm/gc/parallel/cardTableExtension.cpp
src/share/vm/gc/parallel/cardTableExtension.hpp
src/share/vm/gc/shared/cardTableModRefBS.cpp
src/share/vm/gc/shared/cardTableModRefBS.hpp
src/share/vm/gc/shared/cardTableModRefBS.inline.hpp
src/share/vm/gc/shared/cardTableModRefBSForCTRS.hpp
src/share/vm/gc/shared/cardTableRS.cpp
-- card table changes only
And I've only reviewed the non-card-table changes in these files:
src/share/vm/jvmci/jvmciCompilerToVM.cpp
src/share/vm/jvmci/jvmciCompilerToVM.hpp
src/share/vm/runtime/vmStructs.cpp
-- mix of card table and other
-- no comments on the non-card-table changes
------------------------------------------------------------------------------
src/share/vm/gc/shared/cardTableRS.hpp
92 volatile jbyte* _last_cur_val_in_gen;
I don't think the volatile qualifier needs to be added here.
The values are only assigned in single-threaded contexts
(initialization, and in younger_refs_iterate from gen_process_roots).
The values are only read by worker threads started after the
assignment (or read by the assigning thread if parallel workers aren't
involved).
Other changes to this file are specific to card tables, which I've not
reviewed.
------------------------------------------------------------------------------
src/share/vm/gc/g1/g1ConcurrentMark.cpp
1924 assert(_finger > finger, "the finger should have moved forward");
1925 // read it again
1926 finger = res;
[Change was to replace "finger = _finger;" with "finger = res;"]
This change doesn't seem necessary. If the change is made, the
preceeding assert and comment also need updating.
------------------------------------------------------------------------------
src/share/vm/gc/cms/cmsOopClosures.hpp
src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp
-- no comments
------------------------------------------------------------------------------
src/share/vm/gc/cms/parNewGeneration.hpp
326 volatile oop _overflow_list;
Adding volatile to _overflow_list induces the changes to
src/share/vm/oops/oopsHierarchy.hpp
adding many volatile operator overloads for the oop class.
I'm not keen on the oop changes, and that seems like a lot of fannout
for the _overflow_list change. Also, _overflow_list doesn't really
involve oops, more like the shattered remains of oops.
I think a better change would be to not add the overloads to the oop
class, and instead change the type of _overflow_list, perhaps to
something like "HeapWord* volatile", and carry through the adjustments
needed for that.
I think such a change ought to be done under a different CR, with the
relevant changes from this changeset backed out. I *think* the
existing manipulation of _overflow_list has sufficient barriers to
work even in the absence of the volatile qualifier, and that's what
we've been using all along.
------------------------------------------------------------------------------
src/share/vm/gc/cms/concurrentMarkSweepGeneration.hpp
I think CMSCollector::_overflow_list may also need a volatile qualifier.
But see discussion above for parNewGeneration.hpp and
oopsHierarchy.hpp about ParNewGeneration::_overflow_list.
------------------------------------------------------------------------------
src/share/vm/gc/g1/heapRegionManager.cpp
485 _claims = NEW_C_HEAP_ARRAY(uint, _n_regions, mtGC);
486 memset((uint*)_claims, Unclaimed, sizeof(*_claims) * _n_regions);
I would prefer a const_cast to strip away volatile. Could avoid cast by
uint* new_claims = NEW_C_HEAP_ARRAY(...);
memset(new_claims, ...);
_claims = new_claims;
------------------------------------------------------------------------------
src/share/vm/gc/g1/heapRegionManager.hpp
src/share/vm/gc/g1/heapRegionRemSet.cpp
-- no comments
------------------------------------------------------------------------------
src/share/vm/gc/g1/sparsePRT.hpp
-- no direct comments, however:
src/share/vm/gc/g1/sparsePRT.cpp
Can add_to_expanded_list be called by multiple threads with the same
argument at the same time? If so, it doesn't look correct. It has a
comment saying multiple expansions are possible, but should only be
put on list once. But it's not clear whether that can happen from
multiple threads at the same time. The check expanded / set expanded
dance is not thread safe.
------------------------------------------------------------------------------
src/share/vm/gc/parallel/gcTaskThread.hpp
-- no direct comments, however:
src/share/vm/gc/parallel/gcTaskThread.hpp
time_stamp_at() is using a variant of DCLP, but is missing the acquire
on the initial read of _time_stamps.
------------------------------------------------------------------------------
src/share/vm/gc/parallel/mutableSpace.hpp
src/share/vm/gc/parallel/parallelScavengeHeap.hpp
-- no comments
------------------------------------------------------------------------------
src/share/vm/gc/parallel/psCompactionManager.cpp
src/share/vm/gc/parallel/psCompactionManager.hpp
_recycled_bottom and friends were eliminated by
8150994: UseParallelGC fails with UseDynamicNumberOfGCThreads with specjbb2005
------------------------------------------------------------------------------
src/share/vm/gc/parallel/psYoungGen.hpp
src/share/vm/gc/serial/defNewGeneration.cpp
src/share/vm/gc/serial/defNewGeneration.hpp
src/share/vm/gc/shared/collectedHeap.hpp
src/share/vm/gc/shared/genCollectedHeap.cpp
src/share/vm/gc/shared/genCollectedHeap.hpp
src/share/vm/gc/shared/generation.hpp
src/share/vm/gc/shared/taskqueue.hpp
-- no comments
------------------------------------------------------------------------------
src/share/vm/gc/shared/workgroup.hpp
src/share/vm/gc/shared/workgroup.cpp
287 class SubTasksDone: public CHeapObj<mtInternal> {
288 volatile uint* _tasks;
...
290 volatile uint _threads_completed;
Added volatile to member declarations, but corresponding changes in
the cpp file seem to be here:
452 bool SequentialSubTasksDone::is_task_claimed(uint& t) {
464 bool SequentialSubTasksDone::all_tasks_completed() {
So volatile was added to similarly (but not identially) named members
of a different class from where the change is needed. But perhaps
parallel changes are needed for both classes?
------------------------------------------------------------------------------
src/share/vm/gc/parallel/vm_Structs_parallelgc.hpp
-- no comments
------------------------------------------------------------------------------
More information about the hotspot-gc-dev
mailing list