RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
David Holmes
david.holmes at oracle.com
Fri Oct 14 06:15:43 UTC 2016
On 11/10/2016 12:30 AM, Hiroshi H Horii wrote:
> Hi Thomas, David, and all,
>
>> I think you intended to modify cmpxchg_pre_membar not
>> cmpxchg_post_membar!
>
> The previous patch will change only behavior of cmpxchg_pre_membar. But
No it changed the post-membar:
http://cr.openjdk.java.net/~horii/8154736/webrev.04/src/os_cpu/linux_ppc/vm/atomic_linux_ppc.hpp.cdiff.html
inline void cmpxchg_post_membar(cmpxchg_memory_order order) {
! if (order == memory_order_conservative) {
__asm__ __volatile__ (
/* fence */
strasm_sync
);
}
but I did get confused in what I wrote previously. Given the release
must come before the store the pre barrier must be the one that does
that - as per latest code.
> the patch is not good to be reviewed (it was not obvious) and Martin
> suggested me to use lwsync rather than sync.
> I created a new webrev. This webrev includes all points that David and
> Thomas pointed also.
>
> http://cr.openjdk.java.net/~horii/8154736/webrev.05/
>
> With this change, callers of copy_to_survivor_space can safely touch
> fields of returned obj because OrderAccess::acquire() is called in
> copy_to_survivor_space when CAS fails.
So the intent is that the acquire() pairs with the release() semantics
of the cmpxchg store that succeeded in the other thread. That makes
sense, though I really have to question the trade-off in code complexity
and understandability against any performance gain due to a slight
reduction in the barrier strengths. Do you have any metrics on this
latest version?
>> Changes in shared code must be algorithmically correct on all platforms.
>> Not just "it will work fine today".
>>
>> Given all then work being done to add missing barriers, removing them
>> must come with a detailed analysis establishing the safety of doing so.
>> And I am not seeing that here.
>
> The latest codes in the repository are missing some calls of
> OrderAccess::acquire() before touching fileds of new_obj or
> o->forwardee() in PSPromotionManager::copy_and_push_safe_barrier and
> copy_to_survivor_space respectivey. I believe, this webrev correct them,
> also.
>
> Some methods call forwardee(). However, they don't toruch fields of
> forwardee while copying survived objects to a survivor space.
> PSMarkSweepDecorator::compact()
> PSPromotionManager::process_array_chunk()
> PSPromotionManager::claim_or_forward_internal_depth()
Focusing on the code for now ...
src/os_cpu/linux_ppc/vm/atomic_linux_ppc.hpp
src/os_cpu/aix_ppc/vm/atomic_aix_ppc.hpp
In cmpxchg_post_membar I think it is preferable to maintain the existing
default of a full fence if not specifically a "release" or "relaxed"
operation ie:
inline void cmpxchg_post_membar(cmpxchg_memory_order order) {
if (order == memory_order_release) {
// no post membar
} else if (order != memory_order_relaxed) {
__asm__ __volatile__ (
/* fence */
strasm_sync
);
}
as is done for the pre-membar. If nothing else pre-membar and
post-membar should be consistent in their approach.
---
src/share/vm/gc/parallel/psPromotionManager.cpp
507 // call acquire for reading fields of obj in callers
May I suggest:
// acquire() by cas loser is needed to pair with 'release' of cas winner
// so we can safely access data (eg. fields of obj)
---
src/share/vm/gc/parallel/psPromotionManager.inline.hpp
258 // call acquire for reading fields of new_obj in callers
264 // call acquire for reading fields of new_obj in callers
Same as above.
264 // call acquire for reading fields of new_obj in callers
265 OrderAccess::acquire();
If I'm reading this right this is the else part of:
119 // The same test as "o->is_forwarded()"
120 if (!test_mark->is_marked()) {
and it less clear what the acquire() is pairing with, but presumably it
is still the release of a successful cas_forward_to. But given the
isolation of this code from the modified cas operation I have to wonder
about performance again - how often will we take this path with the new
barrier, compared to the paths with the now modified cas operations?
---
Overall the way this proposal has been presented does not instill me
with great confidence about its correctness, or performance benefit:
1. Replace strong cas with a relaxed cas
Issue: accessing obj fields in logging statement may not be safe.
2. Remove access to obj fields in logging statements
Issue: callee access to obj fields is also unsafe
3. Return NULL on failed cas paths
Issue: changes overall semantics and correctness of returning NULL is
very unclear.
4. Go back to previous code.
Issue: callee access to obj fields is also unsafe
Suggestion from Kim: fully relaxed seems unsafe but using "release"
semantics might be okay [this hypothesis warranted detailed discussion
but there was none]
5. replace relaxed cas with release cas
6. Add in acquire() on cas-losing paths, and general access to forwardee
Aside: we can presumably add back the logging statement in the losing
path now we have the acquire() in place.
So far the only justification for making these changes to the GC code
come from the April discussion [1] where it was stated simply that:
"We've looked at the proposed changes and we are pretty sure that the
cmpxchg done during copy_to_survivor_space in the parallel GC doesn't
require the full fence/acquire semantics." [Martin & Volker]
Reading back through all the emails, including the ones in April, I
_think_ part of the reasoning here is that we're not doing a CAS that
publishes a new object that was just created, but that we have
previously created that object using a full CAS and are now only
updating the markword of another object with a forwarding pointer. The
second cas would not need full fence semantics as the other object is
already visible. However I am not a GC expert and other comments by GC
folk suggest that is not in fact the case, or at least is not
necessarily always the case. So I can not establish that what is being
proposed is correct.
I think the GC experts need to have a discussion to resolve things to
their mutual satisfaction.
Thanks,
David
[1]
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019079.html
-------
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
More information about the hotspot-compiler-dev
mailing list