RFR (L): 7088419 : Use x86 Hardware CRC32 Instruction with java.util.zip.CRC32 and java.util.zip.Adler32

David Chase david.r.chase at oracle.com
Wed Jun 26 10:52:21 PDT 2013


Bug: There is a lovely instruction (carry-less multiply, PCLMULQDQ)
available on late-model Intel chips that can be used to accelerate
some CRC calculations.

Fix: Enhance the compiler to have intrinsics for the CRC methods
that use this instruction, where appropriate.
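
For readers who want the baseline in front of them, here is a minimal
sketch of the table-driven, byte-at-a-time CRC-32 that any accelerated
version has to agree with. It is not taken from the webrev. The function
names are made up for illustration, and the polynomial 0xEDB88320 is the
reflected CRC-32 polynomial used by zlib and java.util.zip.CRC32.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32 polynomial.
// (Illustrative helper; not a HotSpot function.)
inline std::array<std::uint32_t, 256> make_crc32_table() {
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t c = i;
        for (int bit = 0; bit < 8; ++bit) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        table[i] = c;
    }
    return table;
}

// One-byte step on the raw (already inverted) CRC state.
inline std::uint32_t crc32_update_byte(
        const std::array<std::uint32_t, 256>& table,
        std::uint32_t state, std::uint8_t b) {
    return table[(state ^ b) & 0xFFu] ^ (state >> 8);
}

// Same calling convention as zlib's crc32(): pass 0 to start, and pass
// the previous result back in to continue over more data.
inline std::uint32_t crc32_update(std::uint32_t crc,
                                  const std::uint8_t* data,
                                  std::size_t len) {
    static const std::array<std::uint32_t, 256> table = make_crc32_table();
    std::uint32_t state = ~crc;  // CRC-32 works on the inverted value
    for (std::size_t i = 0; i < len; ++i) {
        state = crc32_update_byte(table, state, data[i]);
    }
    return ~state;
}
```

Feeding the nine ASCII bytes "123456789" through crc32_update(0, ...)
should give the standard CRC-32 check value 0xCBF43926, whether the input
is passed in one call or split across several.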

Testing:
Hand testing on SPARC (to ensure no harm was done)
and x86 (to ensure no harm was done, and that it was faster),
plus JPRT runs across a range of x86 configurations
(32/64-bit, Windows/MacOS/Linux/Solaris) and other targets
to be sure all was well.

This is also all based on C code that had already been well-tested.

Webrev: http://cr.openjdk.java.net/~drchase/7088419/webrev.05/
No changes to JDK: this is all in the compiler.
It does not interfere with the possibility of parallelizing CRC32 and Adler32,
but that seemed like a tall order for a compiler intrinsic.

Guide to the webrev changes:

    - Add new instructions to assembler
        - src/cpu/x86/vm/assembler_x86.hpp (declare entrypoints:
          movdqa, pinsrd, pinsrq, pextrd, pextrq, vpclmulqdq)
        - src/cpu/x86/vm/assembler_x86.cpp (emit the bits)
    - Add new operations to macro-assembler
        - src/cpu/x86/vm/macroAssembler_x86.cpp (this includes the
          truly "macro" instructions update_byte_crc32,
          fold_128bit_crc32, fold_8bit_crc32, kernel_crc32)
        - src/cpu/x86/vm/macroAssembler_x86.hpp
    - Add flag and feature checks
        - src/cpu/x86/vm/globals_x86.hpp (arch-specific UseCLMUL)
        - src/share/vm/runtime/globals.hpp (UseCRC32Intrinsics flag
          and default value)
        - src/cpu/x86/vm/vm_version_x86.hpp (pull feature from cpu info)
        - src/cpu/x86/vm/vm_version_x86.cpp (process flag, print
          feature)
        - src/share/vm/prims/jvm.cpp (set property for jdk lib -- is
          this needed for crc?)
    - Register intrinsic
        - src/share/vm/classfile/vmSymbols.hpp (declare it with
          signature etc)
    - Write stub data (CRC constants and tables)
        - src/share/vm/runtime/stubRoutines.{c,h}pp (declaration of
          statics and accessors for crc32 generated method and
          constants).
        - src/cpu/x86/vm/stubRoutines_x86.{c,h}pp (what are
          _verify_mxcsr_entry and _key_shuffle_mask_addr ? -- answer,
          these are refactored from 32/64-specific files, see below.)
        - src/cpu/x86/vm/stubRoutines_x86_{32,64}.{c,h}pp (code
          refactored into common stubRoutines_x86.{c,h}pp)
        - src/cpu/x86/vm/stubGenerator_x86_{32,64}.cpp (write out
          generated stub procedure, initialize pointer to method and
          data)
    - Add stub call substitution to interpreter
        - src/share/vm/interpreter/templateInterpreter.cpp (declare
          method entrypoints)
        - src/cpu/x86/vm/interpreterGenerator_x86.hpp,
          src/cpu/x86/vm/templateInterpreter_x86_{32,64}.cpp
          (declaration and definition of 32/64-specific glue to call
          the intrinsic stub routine)
        - src/share/vm/interpreter/interpreter.cpp (enum lookup and
          debugging output)
        - src/share/vm/interpreter/abstractInterpreter.hpp (extend
          enumeration of method kinds)
    - Add to C1
        - src/share/vm/c1/c1_LIR.{c,h}pp (declare new node
          LIR_OpUpdateCRC32)
        - src/cpu/sparc/vm/c1_LIRGenerator_sparc.cpp (unimplemented
          stubs for unsupported arch)
        - src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp (unimplemented
          stubs for unsupported arch)
        - src/share/vm/c1/c1_GraphBuilder.cpp (try inline intrinsic)
        - src/share/vm/c1/c1_Runtime1.cpp (stub routine looker-upper)
        - src/share/vm/c1/c1_LIRGenerator.{c,h}pp (Add case to
          do_Intrinsic for one-byte update)
        - src/cpu/x86/vm/c1_LIRGenerator_x86.cpp (Add cases for x86 CRC
          updates -- byte, array, buffer)
        - src/cpu/x86/vm/c1_LIRAssembler_x86.cpp (Add emit_updatecrc32
          for the one-byte case from new node).
        - src/share/vm/c1/c1_LIRAssembler.hpp (Add decl for
          emit_updatecrc32)
    - Add to C2
        - src/share/vm/opto/runtime.{c,h}pp (new intrinsic function has
          a type; add calls to generate that type).
        - src/share/vm/opto/escape.cpp (add special case for escape
          analysis assertion)
        - src/share/vm/opto/library_call.cpp (substitution of inline
          glue code to intrinsic stub call).
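
As background for the assembler additions above: vpclmulqdq is the
VEX-encoded form of PCLMULQDQ, which multiplies two 64-bit operands as
polynomials over GF(2) (XOR instead of add, no carries) and produces a
128-bit result. The sketch below is a plain software model of that one
operation, for illustration only. The names Clmul128 and clmul64 are made
up here, and this is not the folding code from the webrev, which
presumably builds on the hardware instruction.

```cpp
#include <cstdint>

// 128-bit carry-less product, split into low and high 64-bit halves.
// (Illustrative type; not part of HotSpot.)
struct Clmul128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Software model of a 64x64 -> 128 bit carry-less multiply.
// For every set bit i of b, XOR (a << i) into the 128-bit accumulator.
inline Clmul128 clmul64(std::uint64_t a, std::uint64_t b) {
    Clmul128 r{0, 0};
    for (int i = 0; i < 64; ++i) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;
            if (i != 0) {
                // Bits shifted out of the low half land in the high half.
                // The i == 0 case is skipped to avoid a shift by 64.
                r.hi ^= a >> (64 - i);
            }
        }
    }
    return r;
}
```

As a small sanity check, clmul64(3, 3) gives lo = 5 and hi = 0, because
(x + 1)^2 = x^2 + 1 when coefficients are reduced mod 2; ordinary integer
multiplication would give 9.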


