MFENCE vs. LOCK addl
Jiva, Azeem
Azeem.Jiva at amd.com
Wed Feb 25 05:53:54 PST 2009
I was looking at memory barrier performance and noticed that HotSpot
uses MFENCE as a memory barrier in 64-bit mode. MFENCE is significantly
slower than a LOCKed instruction, since MFENCE is serializing
(similar to CPUID). I'd like to recommend the following change:
assembler_x86.cpp
// Serializes memory.
void Assembler::mfence() {
  // Memory barriers are only needed on multiprocessors
  if (os::is_MP()) {
    // All usable chips support "locked" instructions which suffice
    // as barriers, and are much faster than the alternative of
    // using cpuid or mfence instructions. We use here a locked
    // add [esp],0.
    // This is conveniently otherwise a no-op except for blowing
    // flags (which we save and restore.)
    pushf();                    // Save eflags register
    lock();
    addl(Address(rsp, 0), 0);   // Assert the lock# signal here
    popf();                     // Restore eflags register
  }
}
Sorry it's not a diff, but I'm not set up with Mercurial yet. The only
application I've run is SPECjbb2005, and it shows no regressions or
gains, mostly because the code generated for SPECjbb2005 doesn't use
MFENCE in any significant amount.
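
For anyone who wants to compare the two barriers outside of HotSpot,
here is a minimal standalone sketch. It assumes GCC or Clang on x86-64,
and the names barrier_mfence, barrier_lock_add and ns_per_call are made
up for illustration; they are not HotSpot functions. It only times a
store followed by a barrier in a tight loop, so treat any numbers as a
rough indication rather than a measurement of JIT-generated code.

```cpp
// Sketch only: assumes GCC or Clang on x86-64. The names barrier_mfence,
// barrier_lock_add and ns_per_call are illustrative, not HotSpot APIs.
#include <cassert>
#include <chrono>
#include <cstdint>

#if !defined(__GNUC__) || !defined(__x86_64__)
#error "This sketch assumes GCC/Clang extended asm on x86-64."
#endif

// Full memory barrier using the MFENCE instruction.
static inline void barrier_mfence() {
    __asm__ __volatile__("mfence" ::: "memory");
}

// Full memory barrier using a LOCKed add of 0 to the word at the top of
// the stack. The value in memory is unchanged; only the flags are
// modified, which the "cc" clobber reports to the compiler (taking the
// place of the pushf/popf pair in the assembler version).
static inline void barrier_lock_add() {
    __asm__ __volatile__("lock; addl $0, (%%rsp)" ::: "memory", "cc");
}

// Average cost in nanoseconds of one store followed by one barrier.
template <typename Barrier>
double ns_per_call(Barrier barrier, std::uint64_t iterations) {
    assert(iterations > 0);
    volatile std::uint64_t sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sink = i;    // a store for the barrier to order
        barrier();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling ns_per_call(barrier_mfence, n) and ns_per_call(barrier_lock_add, n)
from main() with the same n and printing both values gives a per-call
cost for each barrier. The ratio between them will depend on the CPU.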
--
Azeem Jiva
AMD Java Labs
T 512.602.0907
More information about the hotspot-compiler-dev mailing list