Performance issue with Nashorn and C2's global code motion

Thu Sep 10 12:17:14 UTC 2015

Hi,

we were running the Octane benchmark on Nashorn and noticed a very significant performance drop when JVMTI is in use.
VTune measurements showed that the JVM spent the majority of its CPU time in Node_Backward_Iterator::next during PhaseCFG::schedule_late when JvmtiExport::_can_access_local_variables is on
(see http://cr.openjdk.java.net/~mdoerr/OctaneVTune.jpg).

We were using OpenJDK 8 with and without the following option:
-agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n

This option activates the JVMTI capability can_access_local_variables, which prevents C2 from killing dead locals and therefore leads to a higher number of edges in the graph.
Without this option, PhaseCFG::schedule_late no longer plays a significant role in the CPU time.

Have you noticed this before? Is this of interest to you?
For us, this is a significant issue, as we have can_access_local_variables on by default.
As a solution, we could think of limiting the number of node iterations in schedule_late and generating a quicker, less optimized schedule in extreme cases.
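
To make the idea of a bounded schedule_late more concrete, here is a minimal, self-contained C++ sketch. It is not HotSpot code: Node, use_blocks, edge_budget and schedule_late_with_budget are hypothetical names chosen for illustration, and the sketch assumes a straight-line CFG in which a lower block index dominates a higher one, so the latest legal block of a node is simply the smallest block index among its uses. A node whose use list no longer fits in the remaining budget keeps its early block, which is always legal but usually less optimized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C2 node; it holds only what this sketch needs.
struct Node {
    int early_block;              // block from the early schedule (always legal)
    std::vector<int> use_blocks;  // blocks in which the node's users live
    int scheduled_block;          // filled in by the scheduler below
};

// Assumption: blocks form a straight line, so block i dominates block j
// whenever i <= j, and the common dominator of all uses is the minimum index.
// Returns the number of nodes that fell back to their early block.
std::size_t schedule_late_with_budget(std::vector<Node>& nodes,
                                      std::size_t edge_budget) {
    std::size_t edges_visited = 0;
    std::size_t fallbacks = 0;
    for (Node& n : nodes) {
        n.scheduled_block = n.early_block;   // safe default placement
        if (n.use_blocks.empty()) continue;  // no uses, nothing to sink towards
        // Skip the walk over the uses if it would exceed the budget.
        if (edges_visited + n.use_blocks.size() > edge_budget) {
            ++fallbacks;
            continue;
        }
        int latest = n.use_blocks.front();
        for (int b : n.use_blocks) {
            ++edges_visited;                 // one unit of work per use edge
            if (b < latest) latest = b;
        }
        assert(latest >= n.early_block);     // a definition must dominate its uses
        n.scheduled_block = latest;
    }
    return fallbacks;
}
```

With a generous budget every node is placed as late as possible; with a small budget the expensive walk over the use edges is skipped for the remaining large nodes, trading schedule quality for compile time.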

Best regards,
Martin
