RFR JDK-6321472: Add CRC-32C API

Thu Oct 23 08:05:00 UTC 2014

On 10/23/2014 03:52 AM, Staffan Friberg wrote:
> Webrev with these last updates. Added more tests to make sure CRC32C, 
> CRC32 and Checksum default methods all are covered.
>
>     http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.07

Hi Staffan,

Regarding default case:

  168         } else {
  169             byte[] b = new byte[Math.min(buffer.remaining(), 4096)];
  170             while (buffer.hasRemaining()) {
  171                 int length = Math.min(buffer.remaining(), b.length);
  172                 buffer.get(b, 0, length);
  173                 update(b, 0, length);
  174             }
  175         }

Have you tried using get()/getInt() directly on the (ro) ByteBuffer 
instead of copying to byte[] chunks? Intuitively one would expect it 
perform faster if a redundant copy is avoided. Ah, you already told us 
that you plan to use intrinsic for CRC32C in the future, so you want to 
have "addresses" at hand.

A hackish way to avoid copying in this case is to access the byte[] and 
offset using reflection. But this would have to be wrapped with 
doPrivileged() which would worsen performance for small buffers. A way 
to avoid repeated access checks is to do them at class initialization 
time, using MethodHandle(s). For example, something like:

     private static final MethodHandle bbArrayGetter;
     private static final MethodHandle bbArrayOffsetGetter;

     static {
         MethodHandle hbGetter;
         MethodHandle offsetGetter;
         try {
             Field hbField = ByteBuffer.class.getDeclaredField("hb");
             Field offsetField = 
ByteBuffer.class.getDeclaredField("offset");
             AccessController.doPrivileged(new PrivilegedAction<Void>() {
                 @Override
                 public Void run() {
                     hbField.setAccessible(true);
                     offsetField.setAccessible(true);
                     return null;
                 }
             });
             hbGetter = MethodHandles.lookup().unreflectGetter(hbField);
             offsetGetter = 
MethodHandles.lookup().unreflectGetter(offsetField);
         } catch (NoSuchFieldException | IllegalAccessException e) {
             hbGetter = null;
             offsetGetter = null;
         }
         bbArrayGetter = hbGetter;
         bbArrayOffsetGetter = offsetGetter;
     }

     private static byte[] getArrayOrNull(ByteBuffer bb) {
         if (bb.hasArray()) return bb.array();
         if (bbArrayGetter != null) {
             try {
                 return (byte[]) bbArrayGetter.invokeExact(bb);
             } catch (Throwable e) {
                 throw new InternalError(e);
             }
         }
         return null;
     }

     private static int getArrayOffset(ByteBuffer bb) {
         if (bb.hasArray()) return bb.arrayOffset();
         if (bbArrayOffsetGetter != null) {
             try {
                 return (int) bbArrayOffsetGetter.invokeExact(bb);
             } catch (Throwable e) {
                 throw new InternalError(e);
             }
         }
         throw new UnsupportedOperationException();
     }

Regards, Peter

>
> //Staffan
>
> On 10/22/2014 05:37 PM, Stanimir Simeonoff wrote:
>> Hi Staffan,
>>
>> The readonly buffer (ByteBuffer.asReadOnlyBuffer()) don't have 
>> array() "working".
>> You can use "int length = Math.min(buffer.remaining, b.length)" 
>> instead, same with new byte[Math.min(4096, buffer.remaining)]. Using 
>> smaller chunks will be more performance friendly than 
>> allocating/eating up a huge byte[].
>> If you feel like, add a test with a heap bytebuffer.asReadOnlyBuffer().
>>
>> Stanimir
>>
>>
>> On Thu, Oct 23, 2014 at 3:06 AM, Staffan Friberg 
>> <staffan.friberg at oracle.com <mailto:staffan.friberg at oracle.com>> wrote:
>>
>>     Hi,
>>
>>     I was thinking about this earlier when I started writing the patch
>>     and then I forgot about it again. I haven't been able to figure
>>     out when the code will be executed. ByteBuffer is implemented in
>>     such a way  that only the JDK can extend it and as far as I can
>>     tell you can only create 3 types of ByteBuffers (Direct, Mapped
>>     and Heap), all of which will be handled by the more efficient
>>     calls above.
>>
>>     That said just to make the code a bit safer from OOM it is
>>     probably best to update the default method and all current
>>     implementations which all use the same pattern.
>>
>>     A reasonable solution should be the following code
>>
>>                 byte[] b = new byte[(buffer.remaining() < 4096)
>>                         ? buffer.remaining() : 4096];
>>                 while (buffer.hasRemaining()) {
>>                     int length = (buffer.remaining() < b.length)
>>                             ? buffer.remaining() : b.length;
>>                     buffer.get(b, 0, length);
>>                     update(b, 0, length);
>>                 }
>>
>>     Xueming, do you have any further comment?
>>
>>     Regards,
>>     Staffan
>>
>>     On 10/22/2014 03:04 PM, Stanimir Simeonoff wrote:
>>>
>>>
>>>     On Thu, Oct 23, 2014 at 12:10 AM, Bernd Eckenfels
>>>     <ecki at zusammenkunft.net <mailto:ecki at zusammenkunft.net>> wrote:
>>>
>>>         Hello,
>>>
>>>         just a question in the default impl:
>>>
>>>         +        } else {
>>>         +            byte[] b = new byte[rem];
>>>         +            buffer.get(b);
>>>         +            update(b, 0, b.length);
>>>         +        }
>>>
>>>         would it be a good idea to actually put a ceiling on the size
>>>         of the
>>>         array which is processed at once?
>>>     This is an excellent catch.
>>>     Should not be too large, probably 4k or so.
>>>
>>>     Stanimir
>>>
>>>
>>>          Am Tue, 21 Oct 2014 10:28:50 -0700
>>>         schrieb Staffan Friberg <staffan.friberg at oracle.com
>>>         <mailto:staffan.friberg at oracle.com>>:
>>>
>>>         > Hi Peter,
>>>         >
>>>         > Thanks for the comments..
>>>         > >
>>>         > >   217                 if (Unsafe.ADDRESS_SIZE == 4) {
>>>         > >   218                     // On 32 bit platforms read two
>>>         ints
>>>         > > instead of a single 64bit long
>>>         > >
>>>         > > When you're reading from byte[] using Unsafe
>>>         (updateBytes), you
>>>         > > have the option of reading 64bit values on 64bit
>>>         platforms. When
>>>         > > you're reading from DirectByteBuffer memory
>>>         > > (updateDirectByteBuffer), you're only using 32bit reads.
>>>         > I will add a comment in the code for this decision. The
>>>         reason is
>>>         > that read a long results in slightly worse performance in
>>>         this case,
>>>         > in updateBytes it is faster. I was able to get it to run
>>>         slightly
>>>         > faster by working directly with the address instead of
>>>         always adding
>>>         > address + off, but this makes things worse in the 32bit
>>>         case since
>>>         > all calculation will now be using long variables. So using
>>>         the getInt
>>>         > as in the current code feels like the best solution as it
>>>         strikes the
>>>         > best balance between 32 and 64bit. Below is how
>>>         updateByteBuffer
>>>         > looked with the rewrite I mentioned.
>>>         >
>>>         >
>>>         >   ong address = ((DirectBuffer) buffer).address();
>>>         >   crc = updateDirectByteBuffer(crc, address + pos, address
>>>         + limit);
>>>         >
>>>         >
>>>         >       private static int updateDirectByteBuffer(int crc,
>>>         long adr,
>>>         > long end) {
>>>         >
>>>         >          // Do only byte reads for arrays so short they
>>>         can't be
>>>         > aligned if (end - adr >= 8) {
>>>         >
>>>         >              // align on 8 bytes
>>>         >              int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
>>>         >              for (long alignEnd = adr + alignLength; adr <
>>>         alignEnd;
>>>         > adr++) { crc = (crc >>> 8)
>>>         >                          ^ byteTable[(crc ^
>>>         UNSAFE.getByte(adr)) &
>>>         > 0xFF]; }
>>>         >
>>>         >              if (ByteOrder.nativeOrder() ==
>>>         ByteOrder.BIG_ENDIAN) {
>>>         >                  crc = Integer.reverseBytes(crc);
>>>         >              }
>>>         >
>>>         >              // slicing-by-8
>>>         >              for (; adr < (end - Long.BYTES); adr +=
>>>         Long.BYTES) {
>>>         >                  int firstHalf;
>>>         >                  int secondHalf;
>>>         >                  if (Unsafe.ADDRESS_SIZE == 4) {
>>>         >                      // On 32 bit platforms read two ints
>>>         instead of
>>>         > a single 64bit long firstHalf = UNSAFE.getInt(adr);
>>>         >                      secondHalf = UNSAFE.getInt(adr +
>>>         Integer.BYTES);
>>>         >                  } else {
>>>         >                      long value = UNSAFE.getLong(adr);
>>>         >                      if (ByteOrder.nativeOrder() ==
>>>         > ByteOrder.LITTLE_ENDIAN) { firstHalf = (int) value;
>>>         >                          secondHalf = (int) (value >>> 32);
>>>         >                      } else { // ByteOrder.BIG_ENDIAN
>>>         >                          firstHalf = (int) (value >>> 32);
>>>         >                          secondHalf = (int) value;
>>>         >                      }
>>>         >                  }
>>>         >                  crc ^= firstHalf;
>>>         >                  if (ByteOrder.nativeOrder() ==
>>>         > ByteOrder.LITTLE_ENDIAN) { crc = byteTable7[crc & 0xFF]
>>>         >                              ^ byteTable6[(crc >>> 8) & 0xFF]
>>>         >                              ^ byteTable5[(crc >>> 16) & 
>>> 0xFF]
>>>         >                              ^ byteTable4[crc >>> 24]
>>>         >                              ^ byteTable3[secondHalf & 0xFF]
>>>         >                              ^ byteTable2[(secondHalf >>>
>>>         8) & 0xFF]
>>>         >                              ^ byteTable1[(secondHalf >>>
>>>         16) & 0xFF]
>>>         >                              ^ byteTable0[secondHalf >>> 24];
>>>         >                  } else { // ByteOrder.BIG_ENDIAN
>>>         >                      crc = byteTable0[secondHalf & 0xFF]
>>>         >                              ^ byteTable1[(secondHalf >>>
>>>         8) & 0xFF]
>>>         >                              ^ byteTable2[(secondHalf >>>
>>>         16) & 0xFF]
>>>         >                              ^ byteTable3[secondHalf >>> 24]
>>>         >                              ^ byteTable4[crc & 0xFF]
>>>         >                              ^ byteTable5[(crc >>> 8) & 0xFF]
>>>         >                              ^ byteTable6[(crc >>> 16) & 
>>> 0xFF]
>>>         >                              ^ byteTable7[crc >>> 24];
>>>         >                  }
>>>         >              }
>>>         >
>>>         >              if (ByteOrder.nativeOrder() ==
>>>         ByteOrder.BIG_ENDIAN) {
>>>         >                  crc = Integer.reverseBytes(crc);
>>>         >              }
>>>         >          }
>>>         >
>>>         >          // Tail
>>>         >          for (; adr < end; adr++) {
>>>         >              crc = (crc >>> 8)
>>>         >                      ^ byteTable[(crc ^
>>>         UNSAFE.getByte(adr)) & 0xFF];
>>>         >          }
>>>         >
>>>         >          return crc;
>>>         >      }
>>>         >
>>>         >
>>>         > >
>>>         > > Also, in updateBytes, the usage of
>>>         > > Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to
>>>         index a byte
>>>         > > array sounds a little scary. To be ultra portable you
>>>         could check
>>>         > > that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use
>>>         Unsafe for
>>>         > > byte arrays if it is not 1. Then use
>>>         Integer.BYTES/Long.BYTES to
>>>         > > manipulate 'offsets' instead. In updateDirectByteBuffer
>>>         it would be
>>>         > > more appropriate to use Integer.BYTES/Long.BYTES too.
>>>         > Good idea. Added a check in the initial if statement and it
>>>         will get
>>>         > automatically optimized away.
>>>         >
>>>         > >   225  firstHalf = (int) (value &
>>>         > > 0xFFFFFFFF); 226            secondHalf = (int) (value
>>>         > > >>> 32); 227            } else { // ByteOrder.BIG_ENDIAN
>>>         > >   228  firstHalf = (int) (value >>> 32);
>>>         > >   229  secondHalf = (int) (value &
>>>         > > 0xFFFFFFFF);
>>>         > >
>>>         > > firstHalf = (int) value; // this is equivalent for line 225
>>>         > > secondHalf = (int) value; // this is equivalent for line 
>>> 229
>>>         > Done.
>>>         >
>>>         > Here is the latest webrev,
>>>         > http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
>>> <http://cr.openjdk.java.net/%7Esfriberg/JDK-6321472/webrev.03>
>>>         >
>>>         > Cheers,
>>>         > Staffan
>>>
>>>
>>
>>
>