RFR: 8233073: Make BitMap accessors more memory ordering friendly

Mon Oct 28 15:29:35 UTC 2019

Hi,

I have need for accessors on BitMap being more explicitly memory 
ordering aware, to fix a bug in ZGC for AArch64 
(https://bugs.openjdk.java.net/browse/JDK-8233061).
In particular, I need failed bit sets to still have acquire semantics, 
and I need a getter with acquire semantics.

My intention is to solve the problem by making the relevant BitMap 
accessors accept explicitly passed in memory ordering parameters, and 
utilize them. I draw the line of conservativeness at supporting 
IRIW-consistent loads. Having spent a great deal of time finding a 
single algorithm that breaks due to IRIW-consistency violations, and 
knowing the complexity of algorithms that actually can break due to 
that, I would be *very* surprised if we got anywhere close to that. 
Therefore, acquiring loads are the most conservative loads I support. 
This is explicitly stated in the API, so that anyone that actually 
relies on IRIW consistency in the future can reconsider that, and add a 
mode that fences before loads on nMCA machines.

The main points of controversy with this patch, where I expect people to 
have wildly different opinions and hopefully get at least a little bit 
upset are the following:

1) For the same reason that our implementation of Atomic::cmpxchg does 
not supply both one ordering for success and one for failed CAS, unlike 
the C++11 atomic counter part, I do not do so either in the par_set_bit 
API. In the Atomic API, this was very much intentional, because it is 
tricky to reason about the subtle effects of having relaxed failed CAS 
and conservative success. In fact, it's a bug of precisely that nature I 
am chasing. Therefore, I wish to transfer that same reasoning to the 
par_set_bit API, and not allow passing in a weaker failing memory 
ordering. A consequence of this is that I have made the uses of this API 
more conservative for failed bit flips than it was in the past. However, 
this new API allows relaxing the real pain point of the API: the success 
case (with it's bi-directional full fencing semantics). So I expect it 
can be applied to make RMO architectures happier where it really matters 
in the end. However, I will not attempt to prove that relaxing these 
calls is okay in various places with this patch: that is outside of my 
scope, I'm merely adding API hooks for allowing that.

2) The default strength on the getter is memory_order_relaxed and not 
memory_order_conservative. After looking at uses, I realize it's used 
mostly in single threaded contexts by compiler code, and there is 
seemingly only a single use in the VM that cares about having acquire 
(ZGC, and it's broken today). While letting the frequency of uses decide 
what is the default rather than what is safest is not something I would 
normally do, it does feel like since the norm is so vastly in favour of 
the relaxed variant, I don't want to let the one ZGC use case clutter 
half the VM with explicitly relaxing the load. I am okay with reverting 
that decision if people want me to.

CR:
http://cr.openjdk.java.net/~eosterlund/8233073/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8233073

Thanks,
/Erik