RFR JDK-6321472: Add CRC-32C API

Tue Oct 21 21:09:37 UTC 2014

Staffan,

Thanks for the package.html update.

Just wonder if it would be better to use

     buffer.remaining(), instead of the buffer.limit() - buffer.position()

in Checksum.udpate(ByteBuffer)'s #implSpec

The rest looks fine for me.

-Sherman

On 10/21/2014 01:11 PM, Staffan Friberg wrote:
> Converted.
>
> http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.05
>
> //Staffan
>
> On 10/21/2014 12:34 PM, Joe Darcy wrote:
>> Hi Staffan,
>>
>> If you are updating package.html, please also hg mv the file to be a package-info.java file with the equivalent javadoc.
>>
>> Thanks,
>>
>> -Joe
>>
>> On 10/21/2014 11:49 AM, Staffan Friberg wrote:
>>> Hi,
>>>
>>> Got an offline comment that the package.html should be update as well to cover CRC-32C.
>>>
>>> Otherwise there are no code changes in this new webrev.
>>>
>>> http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.04
>>>
>>> //Staffan
>>>
>>> On 10/21/2014 10:28 AM, Staffan Friberg wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for the comments..
>>>>>
>>>>>   217                 if (Unsafe.ADDRESS_SIZE == 4) {
>>>>>   218                     // On 32 bit platforms read two ints instead of a single 64bit long
>>>>>
>>>>> When you're reading from byte[] using Unsafe (updateBytes), you have the option of reading 64bit values on 64bit platforms. When you're reading from DirectByteBuffer memory (updateDirectByteBuffer), you're only using 32bit reads.
>>>> I will add a comment in the code for this decision. The reason is that read a long results in slightly worse performance in this case, in updateBytes it is faster. I was able to get it to run slightly faster by working directly with the address instead of always adding address + off, but this makes things worse in the 32bit case since all calculation will now be using long variables. So using the getInt as in the current code feels like the best solution as it strikes the best balance between 32 and 64bit. Below is how updateByteBuffer looked with the rewrite I mentioned.
>>>>
>>>>
>>>>  ong address = ((DirectBuffer) buffer).address();
>>>>  crc = updateDirectByteBuffer(crc, address + pos, address + limit);
>>>>
>>>>
>>>>      private static int updateDirectByteBuffer(int crc, long adr, long end) {
>>>>
>>>>         // Do only byte reads for arrays so short they can't be aligned
>>>>         if (end - adr >= 8) {
>>>>
>>>>             // align on 8 bytes
>>>>             int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
>>>>             for (long alignEnd = adr + alignLength; adr < alignEnd; adr++) {
>>>>                 crc = (crc >>> 8)
>>>>                         ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>>>>             }
>>>>
>>>>             if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>>>>                 crc = Integer.reverseBytes(crc);
>>>>             }
>>>>
>>>>             // slicing-by-8
>>>>             for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
>>>>                 int firstHalf;
>>>>                 int secondHalf;
>>>>                 if (Unsafe.ADDRESS_SIZE == 4) {
>>>>                     // On 32 bit platforms read two ints instead of a single 64bit long
>>>>                     firstHalf = UNSAFE.getInt(adr);
>>>>                     secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
>>>>                 } else {
>>>>                     long value = UNSAFE.getLong(adr);
>>>>                     if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
>>>>                         firstHalf = (int) value;
>>>>                         secondHalf = (int) (value >>> 32);
>>>>                     } else { // ByteOrder.BIG_ENDIAN
>>>>                         firstHalf = (int) (value >>> 32);
>>>>                         secondHalf = (int) value;
>>>>                     }
>>>>                 }
>>>>                 crc ^= firstHalf;
>>>>                 if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
>>>>                     crc = byteTable7[crc & 0xFF]
>>>>                             ^ byteTable6[(crc >>> 8) & 0xFF]
>>>>                             ^ byteTable5[(crc >>> 16) & 0xFF]
>>>>                             ^ byteTable4[crc >>> 24]
>>>>                             ^ byteTable3[secondHalf & 0xFF]
>>>>                             ^ byteTable2[(secondHalf >>> 8) & 0xFF]
>>>>                             ^ byteTable1[(secondHalf >>> 16) & 0xFF]
>>>>                             ^ byteTable0[secondHalf >>> 24];
>>>>                 } else { // ByteOrder.BIG_ENDIAN
>>>>                     crc = byteTable0[secondHalf & 0xFF]
>>>>                             ^ byteTable1[(secondHalf >>> 8) & 0xFF]
>>>>                             ^ byteTable2[(secondHalf >>> 16) & 0xFF]
>>>>                             ^ byteTable3[secondHalf >>> 24]
>>>>                             ^ byteTable4[crc & 0xFF]
>>>>                             ^ byteTable5[(crc >>> 8) & 0xFF]
>>>>                             ^ byteTable6[(crc >>> 16) & 0xFF]
>>>>                             ^ byteTable7[crc >>> 24];
>>>>                 }
>>>>             }
>>>>
>>>>             if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>>>>                 crc = Integer.reverseBytes(crc);
>>>>             }
>>>>         }
>>>>
>>>>         // Tail
>>>>         for (; adr < end; adr++) {
>>>>             crc = (crc >>> 8)
>>>>                     ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>>>>         }
>>>>
>>>>         return crc;
>>>>     }
>>>>
>>>>
>>>>>
>>>>> Also, in updateBytes, the usage of Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a byte array sounds a little scary. To be ultra portable you could check that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use Unsafe for byte arrays if it is not 1. Then use Integer.BYTES/Long.BYTES to manipulate 'offsets' instead. In updateDirectByteBuffer it would be more appropriate to use Integer.BYTES/Long.BYTES too.
>>>> Good idea. Added a check in the initial if statement and it will get automatically optimized away.
>>>>
>>>>>   225 firstHalf = (int) (value & 0xFFFFFFFF);
>>>>>   226                         secondHalf = (int) (value >>> 32);
>>>>>   227                     } else { // ByteOrder.BIG_ENDIAN
>>>>>   228                         firstHalf = (int) (value >>> 32);
>>>>>   229                         secondHalf = (int) (value & 0xFFFFFFFF);
>>>>>
>>>>> firstHalf = (int) value; // this is equivalent for line 225
>>>>> secondHalf = (int) value; // this is equivalent for line 229
>>>> Done.
>>>>
>>>> Here is the latest webrev, http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
>>>>
>>>> Cheers,
>>>> Staffan
>>>
>>
>