RFR: 8263107: PSPromotionManager::copy_and_push_safe_barrier needs acquire memory barrier
Kim Barrett
kbarrett at openjdk.java.net
Sat Jun 5 03:38:11 UTC 2021
Please review this change to PSPromotionManager::copy_to_survivor_space
(ParallelGC) to remove some redundant work, and to add some missing memory
barriers.
There are two callers of copy_to_survivor_space, both of which wrap that
call with the idiom
if obj->is_forwarded() then
new_obj = obj->forwardee()
else
new_obj = copy_to_survivor_space(obj)
endif
There are problems with this.
(1) The first thing copy_to_survivor_space does is check whether the object
is already forwarded, and if so then return obj->forwardee_acquire(). The
idiom used by the callers is a redundant check, and the redundancy can't be
optimized away. It is also missing the acquire barrier that was added by
JDK-8154736 after long discussion.
(2) It turns out the forwardee_acquire() from JDK-8154736 isn't sufficient
after all. The "if is_forwarded() then use forwardee()" idiom is hiding
under the abstractions that we're doing two relaxed atomic loads of the mark
word, and there is nothing here to prevent the second from reading a value
older than that read by the first, with bad consequences. This possibility
came up in the discussion of JDK-8154736, but seems to have been either lost
or discounted. If you think loads from the same location can't do that, see
JDK-8229169 for a counter example.
Part of this change involves removing the conditionalization of the calls to
copy_to_survivor_space; just call it directly. However, it turns out that
some compilers don't inline copy_to_survivor_space because of its size. So
we refactored it into two functions, one doing the already marked check and
then calling the other to do most of the work. This is enough for the check
to be inlined into callers, so we've effectively removed the redundant inner
check. Note: This part of the change introduces a large block of whitespace
differences due to removal of an if-else and outdenting the body; I recommend
using a view that suppresses those when reviewing.
The second part of the change involves adding or moving some acquire barriers.
(a) For the initial check whether the object is already marked, if it is
then add an acquire fence before returning the forwardee. We could instead
use a load-acquire to obtain the mark word, but that would be an unneeded
acquire barrier on the much more common unmarked case. Also removed
forwardee_acquire(), which is no longer used.
(b) If the cmpxchg race is lost, add an acquire fence before fetching and
returning the forwardee. The failed release-cmpxchg effectively behaves
like a relaxed-load, which must preceed the forwardee access and any reads
from it.
I've also changed to only log copying when actually copied, not when already
copied and forwarded. Also changed a guarantee to an assert.
I looked at all uses of forwardee() in light of problem (2), and did not
find any additional problems. (That doesn't mean there aren't any, just
that I didn't spot any. This is low-level atomics, after all.)
Testing:
mach5 tier1-3,5,7 (tier3,5,7 are where a lot of ParallelGC testing is done).
Performance testing showed no significant change.
-------------
Commit messages:
- more barriers, remove redundent work
Changes: https://git.openjdk.java.net/jdk/pull/4371/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4371&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8263107
Stats: 224 lines in 6 files changed: 65 ins; 73 del; 86 mod
Patch: https://git.openjdk.java.net/jdk/pull/4371.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4371/head:pull/4371
PR: https://git.openjdk.java.net/jdk/pull/4371
More information about the hotspot-dev
mailing list