Request review: 4360113: Evict nmethods when code cache gets full

Vladimir Kozlov Vladimir.Kozlov at Sun.COM
Wed Dec 16 12:01:30 PST 2009


Eric Caspole wrote:
> 
> OK. I am running out of good names, how about 
> CodeCacheFlushingTriggerFreeSpace?

CodeCacheFlushingMinimumFreeSpace?

> 
>>
>> I am not sure the next code is correct.
>> What if method was compiled again and got new nmethod?
> 
> I am not sure what you mean by compiled again, but I think the method 
> would not be selected for flushing in that case.

methodOop::has_compiled_code() checks only the _code field, and
frequency_counter_overflow_inner() may trigger a recompilation.
As a result, the code cache may contain several nmethods corresponding
to one method. What I am getting at is that the check you use everywhere
(nm->method()->code() != nm) is not enough, I think. You may need
to check (nm->method()->saved_code() == nm) as well:

if (nm->method()->code() == nm) {
   ...
} else if (nm->method()->saved_code() == nm) {
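For illustration only, here is a minimal standalone sketch (simplified
stand-ins, not the real methodOop/nmethod classes) of why a check against
code() alone can miss an nmethod that the method still owns through
saved_code() after a recompilation:

// Standalone illustration, not HotSpot code: simplified stand-ins for
// methodOop and nmethod.
#include <cstdio>

struct Nmethod;

struct Method {                     // simplified stand-in for methodOop
  Nmethod* _code = nullptr;         // current compiled entry, if any
  Nmethod* _saved_code = nullptr;   // set when the nmethod was speculatively unloaded
  Nmethod* code() const       { return _code; }
  Nmethod* saved_code() const { return _saved_code; }
  bool has_compiled_code() const { return _code != nullptr; }  // checks _code only
};

struct Nmethod {                    // simplified stand-in for nmethod
  Method* _method;
  explicit Nmethod(Method* m) : _method(m) {}
  Method* method() const { return _method; }
};

void sweep_check(Nmethod* nm) {
  if (nm->method()->code() == nm) {
    std::puts("nm is the method's active code");
  } else if (nm->method()->saved_code() == nm) {
    std::puts("nm was speculatively unloaded but is still owned via saved_code()");
  } else {
    std::puts("nm is no longer referenced by its method");
  }
}

int main() {
  Method m;
  Nmethod old_nm(&m);
  Nmethod new_nm(&m);
  m._saved_code = &old_nm;   // old nmethod hidden by speculative unloading
  m._code = &new_nm;         // counter overflow triggered a recompilation
  sweep_check(&old_nm);      // only the saved_code() comparison catches this case
  sweep_check(&new_nm);
  return 0;
}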

>> +           // This method was previously considered for preemptive 
>> unloading and was not called since then
>> +           ((nmethod*)nm)->method()->set_saved_code(NULL);
>> +           ((nmethod*)nm)->method()->invocation_counter()->reset();
>> +           ((nmethod*)nm)->method()->backedge_counter()->reset();
>> +           ((nmethod*)nm)->method()->set_interpreter_invocation_count(0);
>> +           ((nmethod*)nm)->method()->set_method_data(NULL);
>> +           ((nmethod*)nm)->make_not_entrant();
>> +         }
>>
> Yes, I got carried away here erasing the mdo. I saw that other places 
> can reallocate the mdo if necessary, but not here. The effect I was 
> going for was to slow the stampede of recompiles. I will change this 
> and retest it.

I think you should only reset the counters. The execution phase could
change back, so I would prefer to keep the profiling information in the MDO:

+           ((nmethod*)nm)->method()->set_saved_code(NULL);
+           ((nmethod*)nm)->method()->invocation_counter()->reset();
+           ((nmethod*)nm)->method()->backedge_counter()->reset();
+           ((nmethod*)nm)->make_not_entrant();

> 
> The one below looks like an overly eager assert in 
> SimpleCompPolicy::method_invocation_event(), since the next if block 
> below the assert checks "&& UseCompiler)" again -- what do you think?

I think it is a race. method_invocation_event() is called only when UseCompiler == true:

     if (!method->has_compiled_code() && UseCompiler) {
       CompilationPolicy::policy()->method_invocation_event(method, CHECK_NULL);

It seems another thread set UseCompiler to false between the check and the assert.
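To make the window concrete, here is a small standalone sketch (not the
actual compilationPolicy.cpp code; the flag and the re-check are
illustrative only) showing how a concurrent thread can clear UseCompiler
between the caller's check and the event, and how re-checking the flag
instead of asserting it would simply skip the compile request:

// Standalone sketch, not HotSpot code: UseCompiler is read without
// synchronization, so a "disable compiler" decision on another thread
// can land between the caller's check and the event.
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<bool> UseCompiler{true};   // stand-in for the real flag

void method_invocation_event() {
  if (!UseCompiler.load()) {           // tolerate the race instead of asserting
    std::puts("compiler disabled concurrently; skipping compile request");
    return;
  }
  std::puts("requesting compilation");
}

void frequency_counter_overflow(bool has_compiled_code) {
  if (!has_compiled_code && UseCompiler.load()) {
    // window: another thread may clear UseCompiler right here
    method_invocation_event();
  }
}

int main() {
  std::thread flusher([] { UseCompiler.store(false); });  // e.g. code cache full
  frequency_counter_overflow(false);
  flusher.join();
  return 0;
}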

Vladimir

> 
> 
>> % java -XX:ReservedCodeCacheSize=6M -XX:+PrintCodeCacheExtension 
>> -XX:+UseCodeCacheFlushing -XX:MinCodeCacheFlushingInterval=300 
>> -XX:+PrintCompilation -XX:+PrintMethodFlushing -jar SPECjvm2008.jar 
>> -ikv compiler.sunflow compiler.compiler
>>
>> #  Internal Error 
>> (/net/irkutsk/export/home/kvn/work2/hg/4360113/src/share/vm/runtime/compilationPolicy.cpp:124), 
>> pid=26531, tid=51
>> #  Error: assert(UseCompiler || CompileTheWorld,"UseCompiler should be 
>> set by now.")
>>
>> Current thread (0x08c84400):  JavaThread "BenchmarkThread 
>> compiler.sunflow 6" [_thread_in_vm, id=51, stack(0xfa2c9000,0xfa319000)]
>>
>> Stack: [0xfa2c9000,0xfa319000],  sp=0xfa317ad8,  free space=13afa319000k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
>> C=native code)
>> V  [libjvm.so+0x170235a] void VMError::report(outputStream*) + 0x5be
>> V  [libjvm.so+0x1703336] void VMError::report_and_die() + 0x586
>> V  [libjvm.so+0x7bf95d] void report_assertion_failure(const 
>> char*,int,const char*) + 0x61
>> V  [libjvm.so+0x6ccd4e] void 
>> SimpleCompPolicy::method_invocation_event(methodHandle,Thread*) + 0x20a
>> V  [libjvm.so+0xa68949] 
>> nmethod*InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*,unsigned 
>> char*) + 0x1551
>>
>>
>> Eric Caspole wrote:
>>> 4360113: Evict nmethods when code cache gets full
>>> http://cr.openjdk.java.net/~ecaspole/4360113/
>>> In this change, under a flag that is off by default, when the 
>>> compilers notice the code cache is getting full, they call a vm op 
>>> that attempts to speculatively unload the oldest half of the 
>>> nmethods (based on the compile job id) by hiding the methodOop's 
>>> reference to the nmethod in the new _saved_code field. Then 
>>> execution resumes. After inline cache cleaning, callers have to go 
>>> back to the methodOop to get the verified entry point.
>>> At that point, the methodOop's _code field is restored from the 
>>> _saved_code field and the methodOop/nmethod go back to their normal 
>>> state.
>>> If a method so marked is not called by the second sweep cycle after 
>>> the one where the forced unloading happened, the nmethod will be 
>>> marked non-entrant and removed by normal sweeping. That gives the 
>>> app a few seconds to make progress and call its hottest methods.
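A minimal standalone sketch of the save/restore cycle described above
(hypothetical names, not the actual webrev) could look like this:

// Standalone sketch, not the actual webrev: speculative unload, restore
// on the next call, and make-not-entrant if the method stays unused.
#include <cstdio>

struct Nmethod {
  bool entrant = true;
  void make_not_entrant() { entrant = false; }
};

struct Method {
  Nmethod* _code = nullptr;
  Nmethod* _saved_code = nullptr;
  int invocation_counter = 0;

  void speculatively_unload() {        // code cache getting full
    _saved_code = _code;
    _code = nullptr;                   // callers now go back through the methodOop
  }
  void call() {                        // the method turned out to be needed again
    if (_code == nullptr && _saved_code != nullptr) {
      _code = _saved_code;             // restore the verified entry point
      _saved_code = nullptr;
    }
    invocation_counter++;
  }
  void sweep_if_still_unused() {       // two sweep cycles later, still not called
    if (_saved_code != nullptr) {
      invocation_counter = 0;          // reset counters, per the review comments
      _saved_code->make_not_entrant(); // normal sweeping reclaims it from here
      _saved_code = nullptr;
    }
  }
};

int main() {
  Nmethod hot_nm, cold_nm;
  Method hot, cold;
  hot._code = &hot_nm;
  cold._code = &cold_nm;
  hot.speculatively_unload();
  cold.speculatively_unload();
  hot.call();                          // hot method gets its nmethod back
  cold.sweep_if_still_unused();        // cold method's nmethod becomes not-entrant
  std::printf("hot restored: %s, cold nmethod entrant: %s\n",
              hot._code != nullptr ? "yes" : "no",
              cold_nm.entrant ? "yes" : "no");
  return 0;
}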
>>> We chose to target the oldest half of the nmethods because of a 
>>> customer experience with a long-running app server: despite multiple 
>>> redeployments of the same web app, something was preventing old 
>>> instances of the web app from ever getting unloaded. They eventually 
>>> ran into the code-cache-full problem, so the most recent 
>>> redeployment was running interpreter-only. We have also observed 
>>> that for many applications a lot of methods are compiled and used 
>>> during the startup phase and never used again.
>>> In this change there is also a timer-based backoff, with a default 
>>> of 30 seconds, so that if the normal state of the app constantly 
>>> triggers unloading, the unloading stops and we fall back to the 
>>> existing behavior of disabling the compiler.
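A minimal sketch of the described timer-based backoff (hypothetical
names, not the actual webrev) could look like:

// Standalone sketch: unloading is only attempted if enough time has
// passed since the last flush; otherwise fall back to disabling the
// compiler, as before this change.
#include <cstdio>
#include <ctime>

static const long MinFlushingIntervalSec = 30;   // default mentioned in the RFR
static std::time_t last_flush_time = 0;

enum class CodeCacheFullAction { SpeculativelyUnload, DisableCompiler };

CodeCacheFullAction on_code_cache_full(std::time_t now) {
  if (now - last_flush_time >= MinFlushingIntervalSec) {
    last_flush_time = now;
    return CodeCacheFullAction::SpeculativelyUnload;  // evict the oldest half
  }
  return CodeCacheFullAction::DisableCompiler;        // flushing keeps triggering
}

int main() {
  std::time_t t0 = std::time(nullptr);
  std::printf("first fill: %s\n",
              on_code_cache_full(t0) == CodeCacheFullAction::SpeculativelyUnload
                  ? "unload" : "disable");
  std::printf("refill 5s later: %s\n",
              on_code_cache_full(t0 + 5) == CodeCacheFullAction::SpeculativelyUnload
                  ? "unload" : "disable");
  return 0;
}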
>>> In my testing, this allows the program to quickly resume normal 
>>> operation with no noticeable performance degradation.
>>> Thanks for your comments,
>>> Eric
>>
> 
> 

