RFR: 8257772: Vectorizing clear memory operation using AVX-512 masked operations [v2]

Jatin Bhateja jbhateja at openjdk.java.net
Mon Dec 7 12:25:15 UTC 2020


On Mon, 7 Dec 2020 08:40:02 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:

> Submitted some quick testing for this and there are failures with tests in `compiler/c2/cr6340864/`:
> 
> ```
> #  Internal Error (workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:8178), pid=27510, tid=27529
> #  assert(MaxVectorSize >= 32) failed: vector length should be >= 32
> 
> Current CompileTask:
> C2:    259   28    b        java.lang.StringCoding::encodeASCII (158 bytes)
> 
> Stack: [0x00007f2d144f8000,0x00007f2d145f9000],  sp=0x00007f2d145f3750,  free space=1005k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x13a326c]  MacroAssembler::fill64_avx(RegisterImpl*, int, XMMRegisterImpl*, bool)+0x11c
> V  [libjvm.so+0x13a3415]  MacroAssembler::xmm_clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*)+0x195
> V  [libjvm.so+0x13a458b]  MacroAssembler::clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*, bool)+0x19b
> V  [libjvm.so+0x395487]  rep_stosNode::emit(CodeBuffer&, PhaseRegAlloc*) const+0x167
> V  [libjvm.so+0x15b79da]  PhaseOutput::scratch_emit_size(Node const*)+0x3fa
> V  [libjvm.so+0x15ae88c]  PhaseOutput::shorten_branches(unsigned int*)+0x2ac
> V  [libjvm.so+0x15c045a]  PhaseOutput::Output()+0xcda
> V  [libjvm.so+0xa0a798]  Compile::Code_Gen()+0x438
> V  [libjvm.so+0xa13fe7]  Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, bool, DirectiveSet*)+0x1917
> V  [libjvm.so+0x8466ac]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1dc
> V  [libjvm.so+0xa24498]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xe08
> V  [libjvm.so+0xa24fe8]  CompileBroker::compiler_thread_loop()+0x5a8
> V  [libjvm.so+0x18ae756]  JavaThread::thread_main_inner()+0x256
> V  [libjvm.so+0x18b50e0]  Thread::call_run()+0x100
> V  [libjvm.so+0x1598346]  thread_native_entry(Thread*)+0x116
> ```
> 
> Tests are executed with `-XX:CompileThreshold=100 -XX:-TieredCompilation`.

Hi Tobi, thanks. I missed a safety check for MaxVectorSize >= 32 in xmm_clear_mem on platforms that support AVX. I have fixed this and am running the tests. Could you kindly run the patch with default options over your internal performance suite and confirm that there is no performance degradation?
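
To illustrate the kind of guard described above, here is a minimal standalone sketch. It is not the actual patch: `ClearStrategy`, `can_use_32_byte_stores`, `choose_clear_strategy`, and `bytes_per_store` are hypothetical names, and the flags are passed as plain parameters rather than read from the VM. It only shows the idea that the 32-byte store path should be selected when AVX is available and MaxVectorSize is at least 32, with a fallback to 16-byte stores otherwise.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not HotSpot types.
enum class ClearStrategy { Avx32ByteStores, Sse16ByteStores };

// The 32-byte path is only valid when AVX is available AND the maximum
// vector size is at least 32 bytes. Checking AVX alone is not enough,
// which is the situation the reported assert caught.
constexpr bool can_use_32_byte_stores(bool use_avx, int max_vector_size) {
  return use_avx && max_vector_size >= 32;
}

// Pick the store width up front so the 32-byte helper is never reached
// with a smaller MaxVectorSize.
constexpr ClearStrategy choose_clear_strategy(bool use_avx, int max_vector_size) {
  return can_use_32_byte_stores(use_avx, max_vector_size)
             ? ClearStrategy::Avx32ByteStores
             : ClearStrategy::Sse16ByteStores;
}

// Width in bytes of one store for the chosen strategy. The assert mirrors
// the precondition from the crash report for the 32-byte case.
inline int bytes_per_store(ClearStrategy s, int max_vector_size) {
  if (s == ClearStrategy::Avx32ByteStores) {
    assert(max_vector_size >= 32 && "vector length should be >= 32");
    return 32;
  }
  return 16;
}
```

With this shape, a configuration with AVX enabled but MaxVectorSize set to 16 selects the 16-byte path instead of tripping the assert.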

-------------

PR: https://git.openjdk.java.net/jdk/pull/1631

