RFR JDK-6321472: Add CRC-32C API

Tue Oct 21 21:34:55 UTC 2014

I believe it must be <, as it is in the tail loop as well, because end 
is (off+len or limit) so end is exclusive, similar to subString(begin,end).

Makes sense?

//Staffan

On 10/21/2014 01:46 PM, Peter Levart wrote:
> Sorry Staffan, another nit...
>
>   212             for (; off < (end - Long.BYTES); off += Long.BYTES) {
> and
>
>   286             for (; off < (end - Long.BYTES); off += Long.BYTES) {
>
>
> I think you could change "off < (end - Long.BYTES)" to "off <= (end - 
> Long.BYTES)". Am I right?
>
> Regards, Peter
>
>
> On 10/21/2014 10:30 PM, Peter Levart wrote:
>>
>> On 10/21/2014 08:49 PM, Staffan Friberg wrote:
>>> Hi,
>>>
>>> Got an offline comment that the package.html should be update as 
>>> well to cover CRC-32C.
>>>
>>> Otherwise there are no code changes in this new webrev.
>>>
>>> http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.04
>>
>>
>> Hi Staffan,
>>
>>   198         if (end - off >= 8 && Unsafe.ARRAY_BOOLEAN_INDEX_SCALE == 1) {
>>
>> ARRAY_BOOLEAN_INDEX_SCALE -> ARRAY_BYTE_INDEX_SCALE
>>
>>
>> Otherwise looks good now.
>>
>> Regards, Peter
>>
>> P.S.
>>
>> I think (by looking at DirectByteBuffer.asIntBuffer() implementation) 
>> that when doing 32 bit (4 byte) reads using Unsafe, the address only 
>> has to be aligned to 4 bytes (8 is necessary alignment for 64 bit 
>> reads). So updateDirectByteBuffer could make this alignment on 4 
>> bytes as it's only using 32 bit reads (with additional check on 
>> ADDRESS_SIZE, you could do that for updateBytes too).
>>
>> You don't get much out of it, so you decide if it's worth complication.
>>
>>
>>>
>>> //Staffan
>>>
>>> On 10/21/2014 10:28 AM, Staffan Friberg wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for the comments..
>>>>>
>>>>>   217                 if (Unsafe.ADDRESS_SIZE == 4) {
>>>>>   218                     // On 32 bit platforms read two ints 
>>>>> instead of a single 64bit long
>>>>>
>>>>> When you're reading from byte[] using Unsafe (updateBytes), you 
>>>>> have the option of reading 64bit values on 64bit platforms. When 
>>>>> you're reading from DirectByteBuffer memory 
>>>>> (updateDirectByteBuffer), you're only using 32bit reads.
>>>> I will add a comment in the code for this decision. The reason is 
>>>> that read a long results in slightly worse performance in this 
>>>> case, in updateBytes it is faster. I was able to get it to run 
>>>> slightly faster by working directly with the address instead of 
>>>> always adding address + off, but this makes things worse in the 
>>>> 32bit case since all calculation will now be using long variables. 
>>>> So using the getInt as in the current code feels like the best 
>>>> solution as it strikes the best balance between 32 and 64bit. Below 
>>>> is how updateByteBuffer looked with the rewrite I mentioned.
>>>>
>>>>
>>>>  ong address = ((DirectBuffer) buffer).address();
>>>>  crc = updateDirectByteBuffer(crc, address + pos, address + limit);
>>>>
>>>>
>>>>      private static int updateDirectByteBuffer(int crc, long adr, 
>>>> long end) {
>>>>
>>>>         // Do only byte reads for arrays so short they can't be 
>>>> aligned
>>>>         if (end - adr >= 8) {
>>>>
>>>>             // align on 8 bytes
>>>>             int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
>>>>             for (long alignEnd = adr + alignLength; adr < alignEnd; 
>>>> adr++) {
>>>>                 crc = (crc >>> 8)
>>>>                         ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 
>>>> 0xFF];
>>>>             }
>>>>
>>>>             if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>>>>                 crc = Integer.reverseBytes(crc);
>>>>             }
>>>>
>>>>             // slicing-by-8
>>>>             for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
>>>>                 int firstHalf;
>>>>                 int secondHalf;
>>>>                 if (Unsafe.ADDRESS_SIZE == 4) {
>>>>                     // On 32 bit platforms read two ints instead of 
>>>> a single 64bit long
>>>>                     firstHalf = UNSAFE.getInt(adr);
>>>>                     secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
>>>>                 } else {
>>>>                     long value = UNSAFE.getLong(adr);
>>>>                     if (ByteOrder.nativeOrder() == 
>>>> ByteOrder.LITTLE_ENDIAN) {
>>>>                         firstHalf = (int) value;
>>>>                         secondHalf = (int) (value >>> 32);
>>>>                     } else { // ByteOrder.BIG_ENDIAN
>>>>                         firstHalf = (int) (value >>> 32);
>>>>                         secondHalf = (int) value;
>>>>                     }
>>>>                 }
>>>>                 crc ^= firstHalf;
>>>>                 if (ByteOrder.nativeOrder() == 
>>>> ByteOrder.LITTLE_ENDIAN) {
>>>>                     crc = byteTable7[crc & 0xFF]
>>>>                             ^ byteTable6[(crc >>> 8) & 0xFF]
>>>>                             ^ byteTable5[(crc >>> 16) & 0xFF]
>>>>                             ^ byteTable4[crc >>> 24]
>>>>                             ^ byteTable3[secondHalf & 0xFF]
>>>>                             ^ byteTable2[(secondHalf >>> 8) & 0xFF]
>>>>                             ^ byteTable1[(secondHalf >>> 16) & 0xFF]
>>>>                             ^ byteTable0[secondHalf >>> 24];
>>>>                 } else { // ByteOrder.BIG_ENDIAN
>>>>                     crc = byteTable0[secondHalf & 0xFF]
>>>>                             ^ byteTable1[(secondHalf >>> 8) & 0xFF]
>>>>                             ^ byteTable2[(secondHalf >>> 16) & 0xFF]
>>>>                             ^ byteTable3[secondHalf >>> 24]
>>>>                             ^ byteTable4[crc & 0xFF]
>>>>                             ^ byteTable5[(crc >>> 8) & 0xFF]
>>>>                             ^ byteTable6[(crc >>> 16) & 0xFF]
>>>>                             ^ byteTable7[crc >>> 24];
>>>>                 }
>>>>             }
>>>>
>>>>             if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>>>>                 crc = Integer.reverseBytes(crc);
>>>>             }
>>>>         }
>>>>
>>>>         // Tail
>>>>         for (; adr < end; adr++) {
>>>>             crc = (crc >>> 8)
>>>>                     ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>>>>         }
>>>>
>>>>         return crc;
>>>>     }
>>>>
>>>>
>>>>>
>>>>> Also, in updateBytes, the usage of 
>>>>> Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a 
>>>>> byte array sounds a little scary. To be ultra portable you could 
>>>>> check that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use 
>>>>> Unsafe for byte arrays if it is not 1. Then use 
>>>>> Integer.BYTES/Long.BYTES to manipulate 'offsets' instead. In 
>>>>> updateDirectByteBuffer it would be more appropriate to use 
>>>>> Integer.BYTES/Long.BYTES too.
>>>> Good idea. Added a check in the initial if statement and it will 
>>>> get automatically optimized away.
>>>>
>>>>> 225                         firstHalf = (int) (value & 0xFFFFFFFF);
>>>>>   226                         secondHalf = (int) (value >>> 32);
>>>>>   227                     } else { // ByteOrder.BIG_ENDIAN
>>>>>   228                         firstHalf = (int) (value >>> 32);
>>>>>   229                         secondHalf = (int) (value & 
>>>>> 0xFFFFFFFF);
>>>>>
>>>>> firstHalf = (int) value; // this is equivalent for line 225
>>>>> secondHalf = (int) value; // this is equivalent for line 229
>>>> Done.
>>>>
>>>> Here is the latest webrev, 
>>>> http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
>>>>
>>>> Cheers,
>>>> Staffan
>>>
>>
>