<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Hello<div><br></div><div>I just wanted to share some results of my analysis.</div><div><br></div><div>I changed the safepoint poll from reading the polling page (lwu to x0) to writing to the polling page (sw from x0).</div><div><br></div><div>Here are the results:</div><div><br></div><div><div><font face="PT Mono">polling page read</font></div><div><font face="PT Mono">Benchmark Mode Cnt Score Error Units</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall avgt 25 32.192 ± 0.364 ns/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-load-misses:u avgt 5 0.001 ± 0.001 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-loads:u avgt 5 9.036 ± 0.072 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-stores:u avgt 5 0.019 ± 0.025 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:LLC-loads:u avgt 5 0.005 ± 0.007 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:LLC-stores:u avgt 5 0.002 ± 0.002 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:branch-misses:u avgt 5 0.001 ± 0.001 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:branches:u avgt 5 4.014 ± 0.029 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:cycles:u avgt 5 38.777 ± 2.444 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:instructions:u avgt 5 15.121 ± 0.200 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:stalled-cycles-frontend:u avgt 5 0.001 ± 
0.002 #/op</font></div></div><div><font face="PT Mono"><br></font></div><div><div><font face="PT Mono">polling page write</font></div><div><font face="PT Mono">Benchmark Mode Cnt Score Error Units</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall avgt 25 30.326 ± 0.257 ns/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-load-misses:u avgt 5 0.001 ± 0.001 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-loads:u avgt 5 8.035 ± 0.016 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:L1-dcache-stores:u avgt 5 1.016 ± 0.012 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:LLC-loads:u avgt 5 0.004 ± 0.004 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:LLC-stores:u avgt 5 0.001 ± 0.001 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:branch-misses:u avgt 5 0.001 ± 0.001 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:branches:u avgt 5 4.014 ± 0.007 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:cycles:u avgt 5 36.552 ± 1.662 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:instructions:u avgt 5 15.110 ± 0.062 #/op</font></div><div><font face="PT Mono">InterfaceCalls.testInterfaceCastAndCall:stalled-cycles-frontend:u avgt 5 0.001 ± 0.001 #/op</font></div><div><br></div><div><br></div><div><br></div><div>The difference is as expected: one fewer L1 load and one more L1 store per operation (L1-dcache-loads 9.036 to 8.035, L1-dcache-stores 0.019 to 1.016). Since stores are cheaper (for the CPU core, not for the caches), total cycles went down, from 38.777 to 36.552 per operation.</div></div><div>On the other hand, a storm of stores should generate some traffic from L1D to the LLC and then to RAM, which may slow down other threads/apps.</div><div><br></div><div>Regards, Vladimir</div></body></html>