RFR: 8257772: Vectorizing clear memory operation using AVX-512 masked operations [v2]
Jatin Bhateja
jbhateja at openjdk.java.net
Mon Dec 7 12:25:15 UTC 2020
On Mon, 7 Dec 2020 08:40:02 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
> Submitted some quick testing for this and there are failures with tests in `compiler/c2/cr6340864/`:
>
> ```
> # Internal Error (workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:8178), pid=27510, tid=27529
> # assert(MaxVectorSize >= 32) failed: vector length should be >= 32
>
> Current CompileTask:
> C2: 259 28 b java.lang.StringCoding::encodeASCII (158 bytes)
>
> Stack: [0x00007f2d144f8000,0x00007f2d145f9000], sp=0x00007f2d145f3750, free space=1005k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x13a326c] MacroAssembler::fill64_avx(RegisterImpl*, int, XMMRegisterImpl*, bool)+0x11c
> V [libjvm.so+0x13a3415] MacroAssembler::xmm_clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*)+0x195
> V [libjvm.so+0x13a458b] MacroAssembler::clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*, bool)+0x19b
> V [libjvm.so+0x395487] rep_stosNode::emit(CodeBuffer&, PhaseRegAlloc*) const+0x167
> V [libjvm.so+0x15b79da] PhaseOutput::scratch_emit_size(Node const*)+0x3fa
> V [libjvm.so+0x15ae88c] PhaseOutput::shorten_branches(unsigned int*)+0x2ac
> V [libjvm.so+0x15c045a] PhaseOutput::Output()+0xcda
> V [libjvm.so+0xa0a798] Compile::Code_Gen()+0x438
> V [libjvm.so+0xa13fe7] Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, bool, DirectiveSet*)+0x1917
> V [libjvm.so+0x8466ac] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1dc
> V [libjvm.so+0xa24498] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xe08
> V [libjvm.so+0xa24fe8] CompileBroker::compiler_thread_loop()+0x5a8
> V [libjvm.so+0x18ae756] JavaThread::thread_main_inner()+0x256
> V [libjvm.so+0x18b50e0] Thread::call_run()+0x100
> V [libjvm.so+0x1598346] thread_native_entry(Thread*)+0x116
> ```
>
> Tests are executed with `-XX:CompileThreshold=100 -XX:-TieredCompilation`.
Hi Tobi, thanks. I missed a safety check for MaxVectorSize >= 32 in xmm_clear_mem on platforms that support AVX. I have fixed this and am running the tests. Could you kindly run the patch with default options over your internal performance suite and confirm that there is no performance degradation?
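For readers following the thread, here is a minimal standalone sketch of the kind of guard described above. It is not the actual patch: `CpuConfig`, `ClearStrategy`, `choose_clear_strategy` and `clear_mem_sketch` are hypothetical names that only model the `UseAVX` and `MaxVectorSize` flags. The 32-byte path is assumed to be selected only when both the AVX level and the configured vector size allow it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the VM flags; these are not HotSpot's globals.
struct CpuConfig {
  int use_avx;          // models UseAVX (0 means SSE only)
  int max_vector_size;  // models MaxVectorSize, in bytes
};

enum class ClearStrategy { Sse16, Avx32 };

// Choose the store width. Checking max_vector_size in addition to the AVX
// level is the safety check the reply says was missing (assumed shape).
ClearStrategy choose_clear_strategy(const CpuConfig& cfg) {
  if (cfg.use_avx >= 1 && cfg.max_vector_size >= 32) {
    return ClearStrategy::Avx32;
  }
  return ClearStrategy::Sse16;
}

// Zero `bytes` bytes starting at `base` and return the chunk width used.
// memset stands in for the vector stores a real assembler would emit.
std::size_t clear_mem_sketch(unsigned char* base, std::size_t bytes,
                             const CpuConfig& cfg) {
  const bool wide = choose_clear_strategy(cfg) == ClearStrategy::Avx32;
  const std::size_t chunk = wide ? 32 : 16;
  if (wide) {
    // Mirrors the assertion from the crash log; with the guard above it
    // can no longer fire when MaxVectorSize is below 32.
    assert(cfg.max_vector_size >= 32 && "vector length should be >= 32");
  }
  std::size_t off = 0;
  for (; off + chunk <= bytes; off += chunk) {
    std::memset(base + off, 0, chunk);     // full-width chunks
  }
  std::memset(base + off, 0, bytes - off); // remaining tail bytes
  return chunk;
}
```

With `CpuConfig{2, 16}` (AVX available, but vectors capped at 16 bytes) the sketch falls back to the 16-byte path instead of reaching the assertion, which corresponds to the configuration that failed in the quoted test run.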
-------------
PR: https://git.openjdk.java.net/jdk/pull/1631