RFR JDK-6321472: Add CRC-32C API
Stanimir Simeonoff
stanimir at riflexo.com
Thu Oct 23 00:37:29 UTC 2014
Hi Staffan,
The readonly buffer (ByteBuffer.asReadOnlyBuffer()) don't have array()
"working".
You can use "int length = Math.min(buffer.remaining, b.length)" instead,
same with new byte[Math.min(4096, buffer.remaining)]. Using smaller chunks
will be more performance friendly than allocating/eating up a huge byte[].
If you feel like, add a test with a heap bytebuffer.asReadOnlyBuffer().
Stanimir
On Thu, Oct 23, 2014 at 3:06 AM, Staffan Friberg <staffan.friberg at oracle.com
> wrote:
> Hi,
>
> I was thinking about this earlier when I started writing the patch and
> then I forgot about it again. I haven't been able to figure out when the
> code will be executed. ByteBuffer is implemented in such a way that only
> the JDK can extend it and as far as I can tell you can only create 3 types
> of ByteBuffers (Direct, Mapped and Heap), all of which will be handled by
> the more efficient calls above.
>
> That said just to make the code a bit safer from OOM it is probably best
> to update the default method and all current implementations which all use
> the same pattern.
>
> A reasonable solution should be the following code
>
> byte[] b = new byte[(buffer.remaining() < 4096)
> ? buffer.remaining() : 4096];
> while (buffer.hasRemaining()) {
> int length = (buffer.remaining() < b.length)
> ? buffer.remaining() : b.length;
> buffer.get(b, 0, length);
> update(b, 0, length);
> }
>
> Xueming, do you have any further comment?
>
> Regards,
> Staffan
>
> On 10/22/2014 03:04 PM, Stanimir Simeonoff wrote:
>
>
>
> On Thu, Oct 23, 2014 at 12:10 AM, Bernd Eckenfels <ecki at zusammenkunft.net>
> wrote:
>
>> Hello,
>>
>> just a question in the default impl:
>>
>> + } else {
>> + byte[] b = new byte[rem];
>> + buffer.get(b);
>> + update(b, 0, b.length);
>> + }
>>
>> would it be a good idea to actually put a ceiling on the size of the
>> array which is processed at once?
>
> This is an excellent catch.
> Should not be too large, probably 4k or so.
>
> Stanimir
>
>
>> Am Tue, 21 Oct 2014 10:28:50 -0700
>> schrieb Staffan Friberg <staffan.friberg at oracle.com>:
>>
>> > Hi Peter,
>> >
>> > Thanks for the comments..
>> > >
>> > > 217 if (Unsafe.ADDRESS_SIZE == 4) {
>> > > 218 // On 32 bit platforms read two ints
>> > > instead of a single 64bit long
>> > >
>> > > When you're reading from byte[] using Unsafe (updateBytes), you
>> > > have the option of reading 64bit values on 64bit platforms. When
>> > > you're reading from DirectByteBuffer memory
>> > > (updateDirectByteBuffer), you're only using 32bit reads.
>> > I will add a comment in the code for this decision. The reason is
>> > that read a long results in slightly worse performance in this case,
>> > in updateBytes it is faster. I was able to get it to run slightly
>> > faster by working directly with the address instead of always adding
>> > address + off, but this makes things worse in the 32bit case since
>> > all calculation will now be using long variables. So using the getInt
>> > as in the current code feels like the best solution as it strikes the
>> > best balance between 32 and 64bit. Below is how updateByteBuffer
>> > looked with the rewrite I mentioned.
>> >
>> >
>> > ong address = ((DirectBuffer) buffer).address();
>> > crc = updateDirectByteBuffer(crc, address + pos, address + limit);
>> >
>> >
>> > private static int updateDirectByteBuffer(int crc, long adr,
>> > long end) {
>> >
>> > // Do only byte reads for arrays so short they can't be
>> > aligned if (end - adr >= 8) {
>> >
>> > // align on 8 bytes
>> > int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
>> > for (long alignEnd = adr + alignLength; adr < alignEnd;
>> > adr++) { crc = (crc >>> 8)
>> > ^ byteTable[(crc ^ UNSAFE.getByte(adr)) &
>> > 0xFF]; }
>> >
>> > if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>> > crc = Integer.reverseBytes(crc);
>> > }
>> >
>> > // slicing-by-8
>> > for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
>> > int firstHalf;
>> > int secondHalf;
>> > if (Unsafe.ADDRESS_SIZE == 4) {
>> > // On 32 bit platforms read two ints instead of
>> > a single 64bit long firstHalf = UNSAFE.getInt(adr);
>> > secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
>> > } else {
>> > long value = UNSAFE.getLong(adr);
>> > if (ByteOrder.nativeOrder() ==
>> > ByteOrder.LITTLE_ENDIAN) { firstHalf = (int) value;
>> > secondHalf = (int) (value >>> 32);
>> > } else { // ByteOrder.BIG_ENDIAN
>> > firstHalf = (int) (value >>> 32);
>> > secondHalf = (int) value;
>> > }
>> > }
>> > crc ^= firstHalf;
>> > if (ByteOrder.nativeOrder() ==
>> > ByteOrder.LITTLE_ENDIAN) { crc = byteTable7[crc & 0xFF]
>> > ^ byteTable6[(crc >>> 8) & 0xFF]
>> > ^ byteTable5[(crc >>> 16) & 0xFF]
>> > ^ byteTable4[crc >>> 24]
>> > ^ byteTable3[secondHalf & 0xFF]
>> > ^ byteTable2[(secondHalf >>> 8) & 0xFF]
>> > ^ byteTable1[(secondHalf >>> 16) & 0xFF]
>> > ^ byteTable0[secondHalf >>> 24];
>> > } else { // ByteOrder.BIG_ENDIAN
>> > crc = byteTable0[secondHalf & 0xFF]
>> > ^ byteTable1[(secondHalf >>> 8) & 0xFF]
>> > ^ byteTable2[(secondHalf >>> 16) & 0xFF]
>> > ^ byteTable3[secondHalf >>> 24]
>> > ^ byteTable4[crc & 0xFF]
>> > ^ byteTable5[(crc >>> 8) & 0xFF]
>> > ^ byteTable6[(crc >>> 16) & 0xFF]
>> > ^ byteTable7[crc >>> 24];
>> > }
>> > }
>> >
>> > if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>> > crc = Integer.reverseBytes(crc);
>> > }
>> > }
>> >
>> > // Tail
>> > for (; adr < end; adr++) {
>> > crc = (crc >>> 8)
>> > ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>> > }
>> >
>> > return crc;
>> > }
>> >
>> >
>> > >
>> > > Also, in updateBytes, the usage of
>> > > Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a byte
>> > > array sounds a little scary. To be ultra portable you could check
>> > > that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use Unsafe for
>> > > byte arrays if it is not 1. Then use Integer.BYTES/Long.BYTES to
>> > > manipulate 'offsets' instead. In updateDirectByteBuffer it would be
>> > > more appropriate to use Integer.BYTES/Long.BYTES too.
>> > Good idea. Added a check in the initial if statement and it will get
>> > automatically optimized away.
>> >
>> > > 225 firstHalf = (int) (value &
>> > > 0xFFFFFFFF); 226 secondHalf = (int) (value
>> > > >>> 32); 227 } else { // ByteOrder.BIG_ENDIAN
>> > > 228 firstHalf = (int) (value >>> 32);
>> > > 229 secondHalf = (int) (value &
>> > > 0xFFFFFFFF);
>> > >
>> > > firstHalf = (int) value; // this is equivalent for line 225
>> > > secondHalf = (int) value; // this is equivalent for line 229
>> > Done.
>> >
>> > Here is the latest webrev,
>> > http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
>> >
>> > Cheers,
>> > Staffan
>>
>>
>
>
More information about the core-libs-dev
mailing list