Unsafe: efficiently comparing two byte arrays

Thu Mar 13 16:26:59 UTC 2014

On 03/13/2014 02:19 PM, Paul Sandoz wrote:
> 
> On Mar 13, 2014, at 1:57 PM, Andrew Haley <aph at redhat.com> wrote:
> 
>> On 03/13/2014 12:49 PM, Paul Sandoz wrote:
>>> On Mar 13, 2014, at 1:19 PM, Andrew Haley <aph at redhat.com> wrote:
>>>> On 03/13/2014 11:57 AM, Paul Sandoz wrote:
>>>>> Now i am in two minds to whether to add ByteBuffer.compareUnsigned or add Arrays.compareUnsigned.
>>>>>
>>>>> An explosion of methods on Arrays for all types (plus to/from versions) would be annoying.
>>>>
>>>> Surely it's a no brainer?  All we have to do is add the intrinsics for
>>>> ByteBuffers, then we get fast ByteBuffers and fast array comparisons too.
>>>> I don't see the problem.
>>>
>>> For byte[] comparison, I don't think optimal long access is
>>> sufficient on it's own due to alignment issues. It would be nice if
>>> we can avoid burdening users with alignment issues to ensure the
>>> main comparison loop is efficient.
>>
>> But alignment issues are a feature of the hardware, not of software.
>> The problem right now is that ByteBuffers use a byte-by-byte
>> comparison of longs, even when it isn't needed, because there aren't
>> the intrinsics.
> 
> I was considering the following case:
> 
>         byte[] b1 = ...
>         byte[] b2 = ...
>         // compare from offset
>         ByteBuffer left = ByteBuffer.wrap(b1, 1, b1.length - 1);
>         ByteBuffer right = ByteBuffer.wrap(b2, 1, b2.length - 1);
>         int r = compare(left, right);
> 
>     static int compare(ByteBuffer left, ByteBuffer right) {
>         int minLength = Math.min(left.remaining(), right.remaining());
>         int minWords = minLength / 8;
> 
>         for (int i = 0; i < minWords * 8; i += 8) {
>             long lw = left.getLong(); // Inefficient when unaligned access is not supported by hardware?

Possibly, but HotSpot can do an awful lot with specialization when
offsets are known.

>             long rw = right.getLong(); // Are bounds checks hoisted out? i presume so

Yes.

>             if (lw != rw) {
>                 return Long.compareUnsigned(lw, rw);
>             }
>         }
> 
>         for (int i = minWords * 8; i < minLength; i++) {
>             int result = Byte.toUnsignedInt(left.get()) - Byte.toUnsignedInt(right.get());
>             if (result != 0) {
>                 return result;
>             }
>         }
>         
>         return left.remaining() - right.remaining();
>     }
> 
> Certain Apache code (e.g. see Cassandra) supports comparing byte[]
> arrays with offsets and i am not sure that code is portable [1].
> 
> I was wondering in the case where the hardware does not support
> unaligned access it might be possible to sill optimize by reading
> longs at aligned positions and doing some some extra bit twiddling.

Probably, but given that the offsets of two strings don't have to be
the same, it can get very messy.

>>> A quick solution is to leverage Unsafe within a
>>> ByteBuffer.compareUnsigned method. In fact i believe Unsafe could be
>>> used (as Aleksey says in [1]) for put/get as well, which i presume
>>> is likely to be much easier/quicker to implement. Might be a good
>>> first step until a more superior intrinsics solution is implemented?
>>
>> I still don't get it, sorry.  What can Unsafe do that ByteBuffer
>> intrinsics can't do?
> 
> We can arguably implement it faster [2]

I doubt that very much.  In fact, I would say that it is almost
certainly not true.  HotSpot usually knows, for example, when both
offsets are zero and can generate better code for machines with strict
alignment.

And also, Unsafe has the same speed problems with unaligned data
that an intrinsic would have.

> and i presume it will be simpler too.

And that.

But the key point is this: we want intrinsics for ByteBuffers anyway.

Andrew.