Bounds Check Elimination with Fast-Range
August Nagro
augustnagro at gmail.com
Thu Dec 19 14:38:49 UTC 2019
One thing I’ve come to realize is that the bit-mixing step in fast-range does not require a high quality hash function. It just needs to distribute the bits so that the probability of falling into the buckets [0, 2^32), [2^32, 2 * 2^32), [2*2^32, 3*2^32), …, is even. This is why I’m so enthusiastic about fibonacci hashing for this operation, since it costs only a single multiply.
To prove it works, take a look at this sample. The integer gFactor is equal to 2^32 / phi, where phi is the Golden Ratio. Note that hash * gFactor is 32-bit multiplication, and we AND with 0xffffffffL for the unsigned multiply with int size. Set::add returns false if the element is present.
import java.util.HashSet;
import java.util.Set;
class Scratch {
public static void main(String[] args) {
int gFactor = -1640531527;
int size = 1_000;
int range = 4_000;
int collisions = 0;
Set<Integer> set = new HashSet<>();
for (int hash = 0; hash < range; ++hash) {
int reduced = (int) (((hash * gFactor) & 0xffffffffL) * size >>> 32);
if (!set.add(reduced)) collisions++;
}
System.out.println("Optimal collisions: " + (range - size));
System.out.println("Actual collisions: " + collisions);
}
}
> On Dec 19, 2019, at 4:31 AM, Andrew Haley <aph at redhat.com> wrote:
>
> On 12/3/19 6:06 AM, John Rose wrote:
>> But what I keep wishing for a good one- or two-cycle instruction
>> that will mix all input bits into all the output bits, so that any
>> change in one input bit is likely to cause cascading changes in
>> many output bits, perhaps even 50% on average. A pair of AES
>> steps is a good example of this.
>
> I've searched for something like this too. However, I
> experimented with two rounds of AES and I didn't get very good
> results. From what I remember, it took at least three or four
> rounds to get decent mixing, let alone full avalanche.
>
> Also, the latency of AES instructions tends to be a few cycles
> and the latency of moving data from integer to vector registers
> is as many as five cycles. So I gave up.
>
> This was a while ago, so probably badly remembered...
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
More information about the hotspot-compiler-dev
mailing list