Primitive boolean array packing
Andrew Haley
aph at redhat.com
Mon Oct 8 15:19:26 UTC 2018
On 10/08/2018 12:16 PM, B. Blaser wrote:
> The current prototype was intended to scrupulously implement what the
> JVMS explicitly allows for baload/bastore instructions.
> But helper intrinsics might be good alternatives too, we could try both.
So I tried it with a rough C++/assembly language test, and it doesn't
make much sense to use this for default boolean arrays. The update
sequence on AArch64 is
0: ldxrb w2, [x0];
and w2, w2, w1
stxrb w3, w2, [x0]
cbnz w3, 0b
(combined with a bunch of uninteresting shifts and logical
operations.) We have to load a byte in exclusive state, AND or OR it
as appropriate, then store it, still in exclusive state. That gets us
the atomicity we need.
10**9 random set operations on an 8 Mbit array take 0.550s; with a
boolean array the time is 0.150s. This is the local overhead of the
load/store exclusive.
> Note that the existing BitSet needs external sync (most probably on
> the whole set) when used in multi-threaded environments.
> With packed baload/bastore instructions or dedicated intrinsics, the
> sync would be on smaller 8-bit blocks or no sync at all on some arch
> as Roman mentioned (AArch64).
It might help. The question is how much we need large bit
arrays. They're certainly useful for things like Bloom filters.
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
More information about the jdk-dev
mailing list