<div dir="ltr"><div>Hello,</div><div><br></div><div>The Java platform team at Google has maintained a local patch to inline SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor it?</div><div><br></div><div>It is difficult to demonstrate a performance improvement in Java benchmarks, so this is best viewed as a refactoring that lets modern GCC inline the PAUSE instruction instead of calling an out-of-line assembly routine. It also partly addresses the comment about inlining SpinPause() above its declaration in os.hpp.</div><div>I found an interesting discussion about PAUSE, along with a microbenchmark, in:<br></div><div><a href="http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html" target="_blank">http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html</a><br></div><div>However, the microbenchmark showed large variance in our experiments, making it difficult to tell whether inlining PAUSE has any benefit. Inlining PAUSE does seem to reduce the variance a bit.</div><div><br></div><div>The patch is inlined and attached below:</div><div><br></div><div><div>diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s<br></div><div>--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s</div><div>+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s</div><div>@@ -63,15 +63,6 @@</div><div>         popl     %eax</div><div>         ret</div><div> </div><div>-        .globl  SYMBOL(SpinPause)</div><div>-        ELF_TYPE(SpinPause,@function)</div><div>-        .p2align 4,,15</div><div>-SYMBOL(SpinPause):</div><div>-        rep</div><div>-        nop</div><div>-        movl    $1, %eax</div><div>-        ret</div><div>-</div><div>         # Support for void Copy::conjoint_bytes(void* from,</div><div>         #                                       void* to,</div><div>         #                                       size_t count)</div><div>diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s 
b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s</div><div>--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s</div><div>+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s</div><div>@@ -46,15 +46,6 @@</div><div> </div><div> <span style="white-space:pre">       </span>.text</div><div> </div><div>-        .globl SYMBOL(SpinPause)</div><div>-        .p2align 4,,15</div><div>-        ELF_TYPE(SpinPause,@function)</div><div>-SYMBOL(SpinPause):</div><div>-        rep</div><div>-        nop</div><div>-        movq   $1, %rax</div><div>-        ret</div><div>-</div><div>         # Support for void Copy::arrayof_conjoint_bytes(void* from,</div><div>         #                                               void* to,</div><div>         #                                               size_t count)</div><div>diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s</div><div>--- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s</div><div>+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s</div><div>@@ -42,15 +42,6 @@</div><div> </div><div> <span style="white-space:pre">    </span>.text</div><div> </div><div>-        .globl  SpinPause</div><div>-<span style="white-space:pre"> </span>.type   SpinPause,@function</div><div>-        .p2align 4,,15</div><div>-SpinPause:</div><div>-        rep</div><div>-        nop</div><div>-        movl    $1, %eax</div><div>-        ret</div><div>-</div><div>         # Support for void Copy::conjoint_bytes(void* from,</div><div>         #                                       void* to,</div><div>         #                                       size_t count)</div><div>diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s</div><div>--- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s</div><div>+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s</div><div>@@ -38,15 +38,6 @@</div><div> </div><div> <span style="white-space:pre"> </span>.text</div><div> </div><div>-        .globl 
SpinPause</div><div>-        .align 16</div><div>-        .type  SpinPause,@function</div><div>-SpinPause:</div><div>-        rep</div><div>-        nop</div><div>-        movq   $1, %rax</div><div>-        ret</div><div>-</div><div>         # Support for void Copy::arrayof_conjoint_bytes(void* from,</div><div>         #                                               void* to,</div><div>         #                                               size_t count)</div><div>diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s</div><div>--- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s</div><div>+++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s</div><div>@@ -51,15 +51,6 @@</div><div>         movq %fs:0x0,%rax</div><div>         ret</div><div> </div><div>-        .globl  SpinPause</div><div>-        .align  16</div><div>-SpinPause:</div><div>-        rep</div><div>-        nop</div><div>-        movq    $1, %rax</div><div>-        ret</div><div>-</div><div>-</div><div>         / Support for void Copy::arrayof_conjoint_bytes(void* from,</div><div>         /                                               void* to,</div><div>         /                                               size_t count)</div><div>diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp</div><div>--- a/src/hotspot/share/runtime/os.hpp</div><div>+++ b/src/hotspot/share/runtime/os.hpp</div><div>@@ -1031,6 +1031,13 @@</div><div> // of the global SpinPause() with C linkage.</div><div> // It'd also be eligible for inlining on many platforms.</div><div> </div><div>+#if defined(X86) && !defined(_WINDOWS)</div><div>+extern "C" int inline SpinPause() {</div><div>+  __asm__ __volatile__ ("pause");</div><div>+  return 1;</div><div>+}</div><div>+#else</div><div> extern "C" int SpinPause();</div><div>+#endif</div><div> </div><div> #endif // SHARE_VM_RUNTIME_OS_HPP</div></div><div><br></div><div>-Man<br></div></div>