RFR: 8257772: Vectorizing clear memory operation using AVX-512 masked operations [v2]
Tobias Hartmann
thartmann at openjdk.java.net
Tue Dec 8 12:23:11 UTC 2020
On Tue, 8 Dec 2020 11:52:55 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
>>> Submitted some quick testing for this and there are failures with tests in `compiler/c2/cr6340864/`:
>>>
>>> ```
>>> # Internal Error (workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:8178), pid=27510, tid=27529
>>> # assert(MaxVectorSize >= 32) failed: vector length should be >= 32
>>>
>>> Current CompileTask:
>>> C2: 259 28 b java.lang.StringCoding::encodeASCII (158 bytes)
>>>
>>> Stack: [0x00007f2d144f8000,0x00007f2d145f9000], sp=0x00007f2d145f3750, free space=1005k
>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> V [libjvm.so+0x13a326c] MacroAssembler::fill64_avx(RegisterImpl*, int, XMMRegisterImpl*, bool)+0x11c
>>> V [libjvm.so+0x13a3415] MacroAssembler::xmm_clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*)+0x195
>>> V [libjvm.so+0x13a458b] MacroAssembler::clear_mem(RegisterImpl*, RegisterImpl*, RegisterImpl*, XMMRegisterImpl*, bool)+0x19b
>>> V [libjvm.so+0x395487] rep_stosNode::emit(CodeBuffer&, PhaseRegAlloc*) const+0x167
>>> V [libjvm.so+0x15b79da] PhaseOutput::scratch_emit_size(Node const*)+0x3fa
>>> V [libjvm.so+0x15ae88c] PhaseOutput::shorten_branches(unsigned int*)+0x2ac
>>> V [libjvm.so+0x15c045a] PhaseOutput::Output()+0xcda
>>> V [libjvm.so+0xa0a798] Compile::Code_Gen()+0x438
>>> V [libjvm.so+0xa13fe7] Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, bool, DirectiveSet*)+0x1917
>>> V [libjvm.so+0x8466ac] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1dc
>>> V [libjvm.so+0xa24498] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xe08
>>> V [libjvm.so+0xa24fe8] CompileBroker::compiler_thread_loop()+0x5a8
>>> V [libjvm.so+0x18ae756] JavaThread::thread_main_inner()+0x256
>>> V [libjvm.so+0x18b50e0] Thread::call_run()+0x100
>>> V [libjvm.so+0x1598346] thread_native_entry(Thread*)+0x116
>>> ```
>>>
>>> Tests are executed with `-XX:CompileThreshold=100 -XX:-TieredCompilation`.
>>
>> Hi Tobi, thanks. I missed a safety check for MaxVectorSize >= 32 in xmm_clear_mem on platforms that support AVX. I have fixed this and am running tests. Could you kindly run the patch with default options over your internal performance suite and confirm there is no performance degradation?
>
> Okay, will do and report back once it has finished.
I just noticed that you haven't updated the patch yet. Could you push the fix first?
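
For context, the kind of guard discussed above might look roughly like the standalone sketch below. This is not the HotSpot code: `max_vector_size`, `fill64_wide`, and `clear_mem_sketch` are hypothetical stand-ins for the `MaxVectorSize` flag and the `fill64_avx` / `xmm_clear_mem` paths in the stack trace, and plain `memset` stands in for the emitted vector stores. It only illustrates taking the 64-byte fill path when the vector-size precondition holds, and falling back otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the MaxVectorSize VM flag (in bytes).
static int max_vector_size = 16;

// Hypothetical stand-in for a 64-byte fill that needs vectors of at least
// 32 bytes. Like the failing path in the crash log, it asserts that
// precondition instead of checking it.
static void fill64_wide(uint8_t* dst) {
  assert(max_vector_size >= 32 && "vector length should be >= 32");
  std::memset(dst, 0, 64);
}

// Sketch of a guarded clear: the wide path is only entered when the
// precondition of fill64_wide holds; everything else goes through a
// plain fallback. Returns true if the wide path was eligible.
static bool clear_mem_sketch(uint8_t* dst, size_t bytes) {
  const bool wide = max_vector_size >= 32;  // the safety check
  size_t off = 0;
  if (wide) {
    for (; off + 64 <= bytes; off += 64) {
      fill64_wide(dst + off);
    }
  }
  // Fallback and tail: clear whatever the wide loop did not cover.
  std::memset(dst + off, 0, bytes - off);
  return wide;
}
```

With `max_vector_size` set to 16, `clear_mem_sketch` never reaches `fill64_wide`, so the assertion cannot fire; with 32 or 64, full 64-byte chunks go through the wide path and the remainder through the fallback.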
-------------
PR: https://git.openjdk.java.net/jdk/pull/1631