RFR(S): 8037842: Failing to allocate MethodCounters and MDO causes a serious performance drop

Albert Noll albert.noll at oracle.com
Fri Oct 17 13:18:01 UTC 2014


Hi,

could I get reviews for this patch:

Bug:
https://bugs.openjdk.java.net/browse/JDK-8037842

Problem:
If the interpreter (or one of the compilers) fails to allocate from
metaspace (e.g., to allocate an MDO), the exception is cleared and, as a
result, not reported to the Java application. Not propagating the OOME to
the Java application can lead to a serious performance regression: once we
have run out of metaspace and a full GC cannot free memory, every further
attempt to allocate from metaspace triggers another full GC. Consequently,
the application continues to run and schedules full GCs until (1) a
critical allocation (one that throws an OOME) fails, or (2) the application
finishes normally. Note that the VM can continue to execute without
allocating MethodCounters or MDOs.
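
To make the cost concrete, here is a small self-contained toy model in C++.
It is not HotSpot code: FullMetaspace, allocate() and
run_swallowing_failures() are hypothetical names invented for this sketch,
and a "full GC" is modelled as nothing more than a counter. It only
illustrates how a swallowed allocation failure turns every later attempt
into another full GC.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for metaspace (hypothetical, not a HotSpot type).
struct FullMetaspace {
    std::size_t capacity;
    std::size_t used = 0;
    std::size_t full_gcs = 0;  // number of full GCs triggered so far

    explicit FullMetaspace(std::size_t cap) : capacity(cap) {}

    // Returns true on success. On failure a full GC is run first;
    // in this model the GC never frees anything.
    bool allocate(std::size_t bytes) {
        if (used + bytes <= capacity) {
            used += bytes;
            return true;
        }
        ++full_gcs;  // last-ditch collection that frees no memory
        return false;
    }
};

// Models the current behaviour: the failure is cleared and the method
// simply runs without an MDO, so the next attempt hits the same path.
std::size_t run_swallowing_failures(FullMetaspace& ms, int attempts) {
    for (int i = 0; i < attempts; ++i) {
        bool have_mdo = ms.allocate(64);
        (void)have_mdo;  // failure ignored, nothing is reported
    }
    return ms.full_gcs;
}
```

With a metaspace that is already exhausted (capacity 0), 1000 attempts
result in 1000 full GCs in this model.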

Solution 1:
Report the OOME to the Java application. This solution avoids handling the
problem (running a large number of full GCs) in the VM by passing the
problem over to the Java application. I.e., the performance regression is
solved by throwing an OOME. The only way to make the application run is
then to re-run it with a larger (yet unknown) metaspace size. However, the
application could have continued to run (with an undefined performance
drop).
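
As a rough illustration of this trade-off, the following toy C++ model
propagates the first failure instead of swallowing it. It is not taken from
the webrev: StrictMetaspace, allocate_or_throw() and run_until_oom() are
hypothetical names, and std::bad_alloc merely stands in for the OOME.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Toy stand-in for metaspace (hypothetical, not a HotSpot type).
struct StrictMetaspace {
    std::size_t capacity;
    std::size_t used = 0;
    std::size_t full_gcs = 0;

    explicit StrictMetaspace(std::size_t cap) : capacity(cap) {}

    // Throws on failure, after one full GC that frees nothing.
    void allocate_or_throw(std::size_t bytes) {
        if (used + bytes <= capacity) {
            used += bytes;
            return;
        }
        ++full_gcs;
        throw std::bad_alloc();  // stands in for the OOME
    }
};

// Returns how many allocations completed before the error reached the
// "application". The application stops at the first error.
int run_until_oom(StrictMetaspace& ms, int attempts) {
    int completed = 0;
    try {
        for (int i = 0; i < attempts; ++i) {
            ms.allocate_or_throw(64);
            ++completed;
        }
    } catch (const std::bad_alloc&) {
        // The error is visible to the application, which gives up here.
    }
    return completed;
}
```

In this model only one full GC is run, but the application also stops
after the first failed allocation even though it could have kept running.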

Note that the metaspace size in the failing test case is artificially small
(20m). Should we change the default behavior of HotSpot to fix such a
corner case?

Also, I am not sure whether throwing an OOME in such a case conforms to the
Java Language Specification. The Specification says:

"Asynchronous exceptions occur only as a result of:

An internal error or resource limitation in the Java Virtual Machine that
prevents it from implementing the semantics of the Java programming
language. In this case, the asynchronous exception that is thrown is an
instance of a subclass of VirtualMachineError"

An OOME is an asynchronous exception. As I understand the paragraph above,
we are only allowed to throw an asynchronous exception if we are not able
to "implement the semantics of the Java programming language". Not being
able to run the JIT compiler does not seem to constrain the semantics of
the Java language.

Solution 2:
If allocation from metaspace fails, we (1) report a warning to the user and
(2) stop trying to allocate MethodCounters and MDOs (as well as all other
non-critical metaspace allocations), thereby avoiding the overhead of
running full GCs. As a result, the application can continue to run. I have
not yet worked on such a solution; I am just bringing it up for discussion.
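
One possible shape for this idea, again as a toy C++ model rather than
HotSpot code, is a sticky flag that is set on the first failed non-critical
allocation. GuardedMetaspace, allocate_noncritical() and the warning text
are all hypothetical; no such flag or message exists in the VM as far as
this mail is concerned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Toy stand-in for metaspace (hypothetical, not a HotSpot type).
struct GuardedMetaspace {
    std::size_t capacity;
    std::size_t used = 0;
    std::size_t full_gcs = 0;
    bool noncritical_disabled = false;  // sticky: set on first failure

    explicit GuardedMetaspace(std::size_t cap) : capacity(cap) {}

    // Plain allocation: a failure costs one full GC that frees nothing.
    bool try_allocate(std::size_t bytes) {
        if (used + bytes <= capacity) {
            used += bytes;
            return true;
        }
        ++full_gcs;
        return false;
    }

    // Non-critical allocation (e.g. MethodCounters, MDO): after the first
    // failure, warn once and skip all later attempts without a GC.
    bool allocate_noncritical(std::size_t bytes) {
        if (noncritical_disabled) {
            return false;  // skipped: no allocation attempt, no full GC
        }
        if (try_allocate(bytes)) {
            return true;
        }
        noncritical_disabled = true;
        std::fprintf(stderr,
                     "warning: metaspace exhausted, profiling disabled\n");
        return false;
    }
};
```

In this model, 1000 non-critical requests against an exhausted metaspace
cost a single full GC and a single warning. Critical allocations would
still go through try_allocate() and fail as they do today.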

Testing:
JPRT

Webrev:
Here is the webrev for Solution 1. Please note that I am not familiar 
with this part of the code.

http://cr.openjdk.java.net/~anoll/8037842/webrev.00/

Many thanks in advance,
Albert



More information about the hotspot-dev mailing list