RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier [v2]
Martin Doerr
mdoerr at openjdk.java.net
Wed Jun 9 12:20:15 UTC 2021
On Wed, 9 Jun 2021 11:15:35 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:
>> Please review this change to PSPromotionManager::copy_to_survivor_space
>> (ParallelGC) to remove some redundant work, and to add some missing memory
>> barriers.
>>
>> There are two callers of copy_to_survivor_space, both of which wrap that
>> call with the idiom
>>
>> if obj->is_forwarded() then
>> new_obj = obj->forwardee()
>> else
>> new_obj = copy_to_survivor_space(obj)
>> endif
>>
>> There are problems with this.
>>
>> (1) The first thing copy_to_survivor_space does is check whether the object
>> is already forwarded, and if so then return obj->forwardee_acquire(). The
>> idiom used by the callers is a redundant check, and the redundancy can't be
>> optimized away. It is also missing the acquire barrier that was added by
>> JDK-8154736 after long discussion.
>>
>> (2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient
>> after all. The "if is_forwarded() then use forwardee()" idiom is hiding
>> under the abstractions that we're doing two relaxed atomic loads of the mark
>> word, and there is nothing here to prevent the second from reading a value
>> older than that read by the first, with bad consequences. This possibility
>> came up in the discussion of JDK-8154736, but seems to have been either lost
>> or discounted. If you think loads from the same location can't do that, see
>> JDK-8229169 for a counter example.
>>
>> Part of this change involves removing the conditionalization of the calls to
>> copy_to_survivor_space; just call it directly. However, it turns out that
>> some compilers don't inline copy_to_survivor_space because of its size. So
>> we refactored it into two functions, one doing the already marked check and
>> then calling the other to do most of the work. This is enough for the check
>> to be inlined into callers, so we've effectively removed the redundant inner
>> check. Note: This part of the change introduces a large block of whitespace
>> differences due to removal of an if-else and outdenting the body; I recommend
>> using a view that suppresses those when reviewing.
>>
>> The second part of the change involves adding or moving some acquire barriers.
>>
>> (a) For the initial check whether the object is already marked, if it is
>> then add an acquire fence before returning the forwardee. We could instead
>> use a load-acquire to obtain the mark word, but that would be an unneeded
>> acquire barrier on the much more common unmarked case. Also removed
>> forwardee_acquire(), which is no longer used.
>>
>> (b) If the cmpxchg race is lost, add an acquire fence before fetching and
>> returning the forwardee. The failed release-cmpxchg effectively behaves
>> like a relaxed-load, which must preceed the forwardee access and any reads
>> from it.
>>
>> I've also changed to only log copying when actually copied, not when already
>> copied and forwarded. Also changed a guarantee to an assert.
>>
>> I looked at all uses of forwardee() in light of problem (2), and did not
>> find any additional problems. (That doesn't mean there aren't any, just
>> that I didn't spot any. This is low-level atomics, after all.)
>>
>> Testing:
>> mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done).
>> Performance testing showed no significant change.
>
> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision:
>
> avoid reloading forwardee
Thanks a lot for checking! That makes sense.
-------------
PR: https://git.openjdk.java.net/jdk/pull/4371
More information about the hotspot-dev
mailing list