RFR(M): 8033552: Fix missing missing volatile specifiers in CAS operations in GC code
Erik Österlund
erik.osterlund at oracle.com
Tue Sep 13 15:28:52 UTC 2016
Hi Kim,
New webrev with your proposed changes:
http://cr.openjdk.java.net/~eosterlund/8033552/webrev.05/
On 2016-09-13 00:20, Kim Barrett wrote:
>> On Sep 12, 2016, at 11:14 AM, Erik Österlund<erik.osterlund at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> Thank you for your long and thorough review of my slightly too long patch. I have rebased the patch on the latest hotspot sources and split the patch into smaller chunks and CRs.
>> I will start sending them out soon so we can get this over and done with.
>>
>> In general, the card table changes have been backed out so we can defer that to another day. The changes where you had no comments will remain in the CR. The rest have been split out to new CRs.
>>
>> Here is a list of the new related CRs:
>>
>> 1) g1ConcurrentMark is missing volatile specifiers for _finger.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8165856
>> 2) CMS _overflow_list is missing volatile specifiers.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8165857
>> 3) heapRegionManager is missing volatile specifier for _claims.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8165858
>> 4) gcTaskThread is missing volatile specifier and barriers for _time_stamps.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8165859
>> 5) workgroup classes are missing volatile specifiers for lock-free code.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8165860
> Thanks for doing this.
>
> However, a proceedural issue: These new CRs have been marked as
> subtasks of the main CR. I'm not sure what happens when the main CR
> is resolved (by pushing something like the updated change set) but
> subtasks remain open?
Hmm, good point. I made them bugs originally but they were changed to
subtasks.
Would it be best to changed them back to bugs maybe?
>> Here is a new webrev for the rebased patch with the above changes and card table changes moved out:
>> http://cr.openjdk.java.net/~eosterlund/8033552/webrev.01/
> ------------------------------------------------------------------------------
> src/share/vm/gc/parallel/mutableSpace.hpp
> 72 HeapWord* volatile* top_addr() { return &_top; }
>
> Body out of alignment with nearby aligned function bodies.
Fixed.
> ------------------------------------------------------------------------------
> src/share/vm/gc/shared/taskqueue.hpp
> 129 void increment() volatile {
> 130 _fields._top = increment_index(_fields._top);
> 131 if (_fields._top == 0) ++_fields._tag;
> 132 }
>
> Perhaps this should be
>
> uint new_top = increment_index(_fields._top);
> _fields._top = new_top;
> if (new_top == 0) ++_fields._tag;
>
> Although if it matters then we probably have other
> problems... e.g. adding a volatile qualifier doesn't make this
> thread-safe. And if it doesn't need to be thread-safe, then why add
> the volatile qualifier?
I do not have strong opinions on this one. It should not matter as you
say, but it seemed more consistent when the other member functions above
were implemented with volatile.
I decided to withdraw that volatile specifier on this one as it seems
you would prefer that.
> ------------------------------------------------------------------------------
> [This question from my initial review seems not covered anywhere. I
> don't know if there is actually a bug here. If there is, it should be
> a new CR.]
>
> src/share/vm/gc/g1/sparsePRT.hpp
> -- no direct comments, however:
>
> src/share/vm/gc/g1/sparsePRT.cpp
>
> Can add_to_expanded_list be called by multiple threads with the same
> argument at the same time? If so, it doesn't look correct. It has a
> comment saying multiple expansions are possible, but should only be
> put on list once. But it's not clear whether that can happen from
> multiple threads at the same time. The check expanded / set expanded
> dance is not thread safe.
I did answer it, but sorry if I was not explicit enough that my comment
was linked to this. Yes it can be called by multiple threads. However, I
can not see what is wrong about that.
If you are thinking about the classic ABA problem, I can not see how
that would be triggered, because the list is operated on in split phases
protected by a safepoint where it's either add-only (outside of
safepoint, while adding sparsePRT data), or remove-only (inside of
safepoint, before card scanning).
Having multiple competing threads only adding or only removing from the
list without explicit ABA-protection mechanisms should be safe.
The only remaining problem is the variable _expanded. And it is changed
only on the current sparsePRT, while under the per-region lock in
OtherRegionsTable::add_reference() calling SparsePRT::add_card, calling
SparsePRT::add_to_expanded_list, ultimately calling add_to_expanded_list
that we are currently looking at.
So the data for the specific SparsePRT is changed under a per-region
lock, and it's added to a list (which can happen concurrent to other
SparsePRTs being expanded), but that is fine as the ABA-problem can't
occur as the entries do not get deleted until after a safepoint
synchronization, when the list is consumed during cleanup before card
scanning.
Thanks for the review,
/Erik
> ------------------------------------------------------------------------------
>
More information about the hotspot-gc-dev
mailing list