RFR (M): 8134553: CRC32C implementations for Nehalem x86/amd64 & Westmere+ x86/amd64

Vladimir Kozlov vladimir.kozlov at oracle.com
Mon Sep 14 18:17:46 UTC 2015


Tests found:

src/cpu/x86/vm/macroAssembler_x86.cpp:8686: Trailing whitespace
src/cpu/x86/vm/macroAssembler_x86.hpp:1330: Trailing whitespace
src/cpu/x86/vm/stubGenerator_x86_32.cpp:3001: Trailing whitespace
src/cpu/x86/vm/stubGenerator_x86_64.cpp:3968: Trailing whitespace
src/cpu/x86/vm/stubRoutines_x86.cpp:136: Trailing whitespace
src/share/vm/interpreter/interpreter.cpp:594: Carriage return (^M)
src/share/vm/interpreter/templateInterpreter.cpp:426: Carriage return (^M)
src/cpu/x86/vm/crc32c.h:31: Trailing whitespace

Please, fix.

Also build failed (make sure you build 32- and 64-bit):

hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp: In member function 'void MacroAssembler::crc32c_ipl_alg2_alt2(Register, 
Register, Register, Register, Register, Register, Register, Register, Register, XMMRegister, XMMRegister, XMMRegister, 
bool)':
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9033:42: error: 'CRC32C' is not a class or namespace
    uint32_t const_or_pre_comp_const_index[CRC32C::NUM_PRECOMPUTED_CONSTANTS];
                                           ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9040:5: error: 'const_or_pre_comp_const_index' was not declared in this scope
      const_or_pre_comp_const_index[1] = *(uint32_t *)StubRoutines::_crc32c_table_addr;
      ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9049:5: error: 'const_or_pre_comp_const_index' was not declared in this scope
      const_or_pre_comp_const_index[0] = 1;
      ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9058:21: error: 'CRC32C' is not a class or namespace
    crc32c_proc_chunk(CRC32C::HIGH, const_or_pre_comp_const_index[0], const_or_pre_comp_const_index[1], 
is_pclmulqdq_supported,
                      ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9058:35: error: 'const_or_pre_comp_const_index' was not declared in this scope
    crc32c_proc_chunk(CRC32C::HIGH, const_or_pre_comp_const_index[0], const_or_pre_comp_const_index[1], 
is_pclmulqdq_supported,
                                    ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9064:21: error: 'CRC32C' is not a class or namespace
    crc32c_proc_chunk(CRC32C::MIDDLE, const_or_pre_comp_const_index[2], const_or_pre_comp_const_index[3], 
is_pclmulqdq_supported,
                      ^
hotspot/src/cpu/x86/vm/macroAssembler_x86.cpp:9070:21: error: 'CRC32C' is not a class or namespace
    crc32c_proc_chunk(CRC32C::LOW, const_or_pre_comp_const_index[4], const_or_pre_comp_const_index[5], 
is_pclmulqdq_supported,

Thanks,
Vladimir

On 9/14/15 10:17 AM, Vladimir Kozlov wrote:
> Nice! Let me test it.
>
> I missed in first review - you should not assign to local when you passing parameter:
>
> +      bool supports_clmul;
> +      StubRoutines::x86::generate_CRC32C_table(supports_clmul = VM_Version::supports_clmul());
>
> I can fix it myself before push.
>
> Thanks,
> Vladimir
>
> On 9/8/15 11:16 AM, Wojtowicz, Tomasz wrote:
>> Updated review, including all of the changes you have pointed out in an original e-mail:
>> http://cr.openjdk.java.net/~mcberg/8134553/webrev.02/
>>
>> --
>> Thank you,
>>
>> Tomek
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, September 02, 2015 7:06 PM
>> To: Wojtowicz, Tomasz; 'hotspot-compiler-dev at openjdk.java.net'
>> Subject: Re: RFR (M): 8134553: CRC32C implementations for Nehalem x86/amd64 & Westmere+ x86/amd64
>>
>> Nice work Tomek.
>> Few things.
>>
>> Why there are no C1 changes?
>> TW>We were interested in speeding up an intrinsic path. I believe it is a similar approach to what ARM and SPARC choose.
>>
>> templateInterpreter_x86_64.cpp has strange symbols in comments:
>> Java� Virtual
>> type�long�and�double�have
>> TW>I have spotted it earlier - looks like I have uploaded an earlier revision of a file than I originally thought .
>> Fixed.
>>
>> Code style comments:
>> If possible can you not use namespace? You can have named enums and
>> local static methods instead.
>> TW>I got rid of all namespace CRC32C. Created named enum for chunk sizes.
>>
>> And macroAssembler_x86.cpp totally goes against our code style.
>> I would suggest to move IPL_Alg4 and other local methods into
>> MacroAssembler class - 'This->' looks strange here.
>> I would also rename all these local methods to have prefix crc32c_.
>> TW>Done.
>>
>> Also align parameters lines to (:
>>
>> +  void IPL_Alg4(INOUT(Register B), uint32_t n,
>> +                Scratch(Register C), Scratch(Register D),
>> Scratch(Register Z),
>> +                MacroAssembler * This) {
>>
>>
>> +  CRC32C::ProcChunk(CRC32C::HIGH, CONSTOrPreCompConstIndex[0],
>> CONSTOrPreCompConstIndex[1], IsPclmulqdqSupported,
>> +                   len, buf, crc,
>> +                   A, B, C,
>> +                   AXMM, BXMM, CXMM,
>> +                   D, E, F,
>> +                   this);
>> TW>Done.
>>
>> Instead of Scratch(Register C) use Register tmp1.
>> I am fine with IN, INOUT, OUT, Scratch macros but Nehalem() and
>> Westmere() is strange to see here - I do not understand their meaning.
>> TW>Nehalem and Westmere need different resources (Registers) to obtain carry less multiplication results. In order to
>> not generate separate wrappers I thought about using single functions and then just marked registers which are
>> specific to particular architecture with Westmere(), Nehalem() empty macros. In an effort to aid in understanding of a
>> code. But I could get rid of them easily. Already done.
>> I got rid of a IN/INOUT/OUT macros and modified Register names accordingly in/in_out/out
>>
>> rename CONSTOrPreCompConstInde* to use low case and _.
>> TW>Done.
>>
>> Argument name: IsPclmulqdqSupported --> is_pclmulqdq_supported
>> TW>Done
>>
>> stubRoutines_x86.cpp upper case name are used mostly for constants and
>> enums. You have array PCLMULQDQ[] (I would change PCLMULQDQ name too -
>> pclmulqdq_table) and next weird definition:
>> +  #undef CONST
>> +  static juint x;
>> +  #define CONST x
>> +      CONST = PowN[j];
>> Keep it local as X_CONST if you want to highlight that it is constant
>> for following expression. Simple 'CONST' name is confusing.
>>> Both done
>>
>> Rename  crc32c_IPL_Alg2Alt2Fast --> crc32c_IPL_alg2 (to you need alt fast?).
>>> I just need Alt2 since Gueron's Alg2 has two alternatives. Done.
>>
>> Rename GenerateCRC32CTable --> generate_CRC32C_table
>> TW>Done
>>
>> crc32c.h:
>>    // a) constants table generation
>> (C:\Java\jdk9hs-comp\hotspot\src\cpu\x86\vm\stubRoutines_x86.cpp)
>>     62
>> Use hotspot/src/cpu/x86/vm/stubRoutines_x86.cpp
>> TW>Done
>>
>> Thanks,
>> Vladimir
>>
>> On 8/27/15 11:47 AM, Wojtowicz, Tomasz wrote:
>>> I would like to contribute following change:
>>>
>>> Review details
>>>
>>> Review Title:      CRC32C implementations for Nehalem x86/amd64 &
>>> Westmere+ x86/amd64
>>>
>>> Review ID:          #8134553
>>>
>>> Diff: http://cr.openjdk.java.net/~mcberg/8134553/webrev.01/
>>> <http://cr.openjdk.java.net/%7Emcberg/8134553/webrev.01/>
>>>
>>> Description:        Efficient use of a crc32 hardware instruction by
>>> division of a problem to a predefined chunks of an increasing size and
>>> further by 3 to be computed hiding instruction latencies. x86 delivers
>>> up to 8x improvement vs. java library, amd64 stops at even more -> 16x.
>>>
>>>                                   Performance data are attached to this
>>> message for your convenience.
>>>
>>>                                   No regressions has been observed on
>>> hotspot/compiler x86_64.
>>>
>>> Link: https://bugs.openjdk.java.net/browse/JDK-8134553
>>>
>>> Author:                Tomasz, Wojtowicz
>>>
>>> --
>>>
>>> Thank you,
>>>
>>> Tomek
>>>


More information about the hotspot-compiler-dev mailing list