Array of booleans
John Rose
john.r.rose at oracle.com
Thu Mar 18 18:44:46 UTC 2021
On Mar 18, 2021, at 11:12 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
If the JVMS were to be tightened (as opposed to just describe what Hotspot does), I would have no issues adding boolean support in the memory access API (and, as a consequence, in the linker API).
Here’s the current status of the JVMS as of Java 9:
https://docs.oracle.com/javase/specs/jvms/se9/html/jvms-6.html#jvms-6.5.bastore
If the arrayref refers to an array whose components are of type byte, then the int value is truncated to a byte and stored as the component of the array indexed by index.
This has always been present in the spec.
Note that byte array elements can be any size of 8 bits or larger.
A JVM *could* (but never does) implement any primitive array
of type boolean, byte, short, or char using 32-bit elements (or
even larger!), as long as the values visible in such elements
are constrained to the range associated with each type.
(On a CPU which cannot implement the JMM for storage units
smaller than 64 bits, a JVM implementation *might* choose to
make all heap variables be 64 bits!)
If the arrayref refers to an array whose components are of type boolean, then the int value is narrowed by taking the bitwise AND of value and 1; the result is stored as the component of the array indexed by index.
This change ensures that boolean array elements
(on the Java heap), whatever their format, always
contain either zero or one.
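
As a rough illustration of the two narrowing rules quoted above, here is a minimal Java sketch. It models the rules as plain functions; it does not execute bastore itself (Java source cannot store an int into a boolean[] directly), and the class and method names are hypothetical, chosen only for this example.

```java
public final class BastoreModel {
    private BastoreModel() {}

    // byte[] case: the int operand is truncated to its low 8 bits.
    public static byte narrowForByteArray(int value) {
        return (byte) value;
    }

    // boolean[] case (JVMS as of Java 9): the int operand is ANDed with 1,
    // so only zero or one can ever reach the array element.
    public static byte narrowForBooleanArray(int value) {
        return (byte) (value & 1);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 3, 0x100, 0x1FF, -1};
        for (int v : samples) {
            System.out.printf("value=%d: byte[] gets %d, boolean[] gets %d%n",
                    v, narrowForByteArray(v), narrowForBooleanArray(v));
        }
    }
}
```

Note that an operand of 2 would be stored as 2 in a byte[] but as 0 in a boolean[], since only the low bit survives.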
And we also have this old observation in the spec:
The bastore instruction is used to store values into both byte and boolean arrays. In Oracle's Java Virtual Machine implementation, boolean arrays - that is, arrays of type T_BOOLEAN (§2.2, §newarray) - are implemented as arrays of 8-bit values. Other implementations may implement packed boolean arrays; in such implementations the bastore instruction must be able to store boolean values into packed boolean arrays as well as byte values into byte arrays.
A JVM which implements packed boolean arrays
with 1-bit elements will not be distinguishable from
one which uses some other format, such as 2-bit
or 8-bit or 32-bit elements.
In fact you can’t know whether a JVM is secretly
storing all array elements in 64-bit chunks, for
some odd reason. What you do know is the
nominal size of array elements, and (separately)
the value space of such array elements for any
array type.
Since bastore is used for boolean[] and byte[], I
claim that the nominal size of both array elements
is the expected size of a byte, or 8 bits.
None of the above reasoning depends on HotSpot
implementation.
Moving to the C language, we note that booleans are
typically stored in one byte. Also, native memory hardware
today bottoms out at one byte (or 32 or 64 bytes),
with bitfields supported but only with a separate
set of primitives.
All this suggests to me that T_BOOLEAN has only one
interpretation on native platforms: A byte whose value
is normalized (when observed by Java code or produced
by Java code) to 1 or 0, using x&1. Unconditioned bytes
(as may be produced via JNI) are conditioned using x!=0.
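
To make the difference between the two rules concrete, here is a small Java sketch, assuming a boolean is carried in a single byte. The class and method names are hypothetical and are not part of any Panama or JNI API; the sketch only shows that x&1 and x!=0 agree on 0 and 1 but disagree on other byte values.

```java
public final class BooleanConditioning {
    private BooleanConditioning() {}

    // Java-side normalization: keep only the low bit (x & 1).
    public static boolean normalizeLowBit(byte x) {
        return (x & 1) != 0;
    }

    // Conditioning of an arbitrary native byte: any nonzero value is true.
    public static boolean conditionNonZero(byte x) {
        return x != 0;
    }

    // A Java boolean written out as a byte is always exactly 0 or 1.
    public static byte toByte(boolean b) {
        return (byte) (b ? 1 : 0);
    }

    public static void main(String[] args) {
        byte[] samples = {0, 1, 2, 3, (byte) 0x80};
        for (byte x : samples) {
            System.out.printf("x=%d: x&1 -> %b, x!=0 -> %b%n",
                    x, normalizeLowBit(x), conditionNonZero(x));
        }
    }
}
```

For a byte such as 2, which C code would treat as true, the low-bit rule yields false while the nonzero rule yields true, which is why unconditioned bytes need the second rule.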
It’s messy but it’s all there in the specs. Use or toss, as
appropriate!
— John