[jmm-dev] bitwise RMW operators, specifically testAndSetBit/BTS

Fri Jul 15 02:09:05 UTC 2016

I think we are missing an important opportunity by not supporting single-bit RMW operations in VarHandles.

In particular, the x86 "bts" (bit-test-and-test) "btr" (bit-test-and-reset) and "btc" (bit-test-and-clear) are sometimes the right way to modify some data structure, when the alternative is a load and "cmpxchg" in a loop.  The overall costs are probably the same in the best case, but the loop-based idiom has some danger (relative to the single-instruction idiom) of costs stemming from larger code size.

At the JIT level, one can hope that the CAS-based idiom (coded in the current VH API) will be recognized and optimized to a single instruction on x86, but there is a strong risk that this will fail.  It's safer to specify the operation explicitly using a separate VH method.

The particular use case I have in mind is SeqLocks, specifically the writer-enter operation, which needs to change the lock state to "odd", unless it is already "odd", and let the processor know what happened.  An "xadd" cannot do this, but a "cmpxchg" or "bts" can, and the "bts" is preferable.

(In a rather deep sense, getAndAdd is less powerful than testAndSetBit or getAndBitwiseOr,
because op+ is bijective in each argument, while op| is idempotent.  This means that
you can operate bitwise on a structure in such a way that your operation disappears
when the structure is already in some state you are pushing it towards.  Of course,
you also need a way to "exchange" in the previous value, atomically.)

For a parallel discussion among the gcc folk, where they are working on pattern matching
of CAS to BTS, see:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49244

So, here is a specific proposal:
  https://bugs.openjdk.java.net/browse/JDK-8161444
  VarHandles should provide access bitwise atomics

In a nutshell, testAndSetBitAcquire behaves as if it were built on top of
compareAndExchangeAcquire (which it may on some platforms).
The outgoing value parameter is not a value but a bit position within
the memory value (zero = LSB, range-checked).  On x86 it compiles
to "lock;bts" with appropriate fencing.  It is a great candidate for
building a mutex-enter operation.

For symmetry, I'm also proposing testAndClearBitRelease (or should it
also be Acquire?) and flipAndGetBit (with Volatile ordering since it is likely
to be stand-alone).  But it's just symmetry; testAndSetBitAcquire is the
important part. 

What do folks think?  Is "lock;bts" useless for some reason?  (Note that
the lock prefix is interpreted by modern x86's as a cache transaction
request, just like xadd, with no external signal.)  Or are there no significant
single-bit concurrent structures out there?  I know of two:  SeqLocks and
AtomicMarkableReference (if/when the JVM embraces it).

More background: The SeqLocks are likely to be important for value types
(when they are too large for native hardware atomics, and must be accessed
atomically).  Note that many uncontended value types will still need to be used
with SeqLocks, when structure-tearing must be prevented for one reason
or another.

(Yet more background:  Non-tearability can be demanded by a value type's definition.
If this were not possible, values could not embody invariants that affect security.)

Thanks,
— John

P.S.  For the record here are the important spec. details:

/** 
* Atomically loads the bit at the specified {@code index} in a variable with 
* the memory semantics of {@link #getAcquire}; if the bit is clear, 
* sets it with the memory semantics of {@link #set}; and finally returns 
* the original bit value as a boolean. 
* 
* <p>The variable may be of any primitive type. 
* Bits are numbered from zero, which refers to the arithmetically 
* least-significant bit, to {@code N-1} inclusive, where {@code N} is 
* the number of bits in the variable. Booleans have exactly one bit, 
* while other variables have an appropriate multiple of eight bits. 
* 
* <p>The method signature is of the form {@code (CT, int index)boolean}. 
* 
* <p>The symbolic type descriptor at the call site of {@code testAndSetBitAcquire} 
* must match the access mode type that is the result of calling 
* {@code accessModeType(VarHandle.AccessMode.TEST_AND_SET_BIT_ACQUIRE)} on this 
* VarHandle. 
* 
* @implNote The effects of this method are similar to a call to 
* {@code get} and {@code compareAndExchangeAcquire}, where the new 
* value is obtained from the old value by setting the specified bit. 
* The full effect of {@code testAndSetBitAcquire} would be obtained 
* by retrying the sequence as needed until the bit is either observed 
* to be set, or updated to be set. More efficient implementations may 
* be available on some platforms. 
* 
* @param args the signature-polymorphic parameter list of the form 
* {@code (CT, int index)} 
* , statically represented using varargs. 
* @return a boolean, the original value of the bit (before any update) 
* , statically represented using {@code Object}. 
* @throws UnsupportedOperationException if the access mode is unsupported 
* for this VarHandle. 
* @throws WrongMethodTypeException if the access mode type is not 
* compatible with the caller's symbolic type descriptor. 
* @throws ClassCastException if the access mode type is compatible with the 
* caller's symbolic type descriptor, but a reference cast fails. 
* @throws ClassCastException if the access mode type is compatible with the 
* caller's symbolic type descriptor, but a reference cast fails. 
* @throws IllegalArgumentException if the supplied index is not in the range 
* of zero (inclusive) to the number of bits in the variable (exclusive). 
* @see #getAcquire(Object...) 
* @see #set(Object...) 
* @see #compareAndExchangeAcquire(Object...) 
*/ 
public final native 
@MethodHandle.PolymorphicSignature 
@HotSpotIntrinsicCandidate 
Object testAndSetBitAcquire(Object... args);