Request review: 4360113: Evict nmethods when code cache gets full

Thu Jan 7 14:06:31 PST 2010

On Jan 7, 2010, at 1:48 PM, Eric Caspole wrote:

> 
> On Jan 7, 2010, at 11:13 AM, Tom Rodriguez wrote:
> 
>> In the sweeper when you are traversing the live nmethods you can use alive_nmethod to skip non-nmethods like this:
>> 
>> nmethod* nm = CodeCache::alive_nmethod(CodeCache::first());
>> 
>> nm = CodeCache::alive_nmethod(CodeCache::nmethod(nm));
>> 
>> This also gets rid of all the casts to nmethod*.  You might introduce simple wrappers for first_nmethod() and next_nmethod().  I thought they actually existed but I can't find them.
> 
> This is good, thanks, I will look into it.
> 
>> 
>> I thought we weren't going to reset the counters when throwing away code?
>> 
> 
> I reset less counters than before but I was hoping to slow a stampede of recompiles after the cleaning, to reduce the chance of very soon ending up having to do another cleaning. If you don't think that is a good idea, that is OK with me and I'll remove it.

Maybe it should be decay() instead of reset().

> 
>> Is it really necessary for speculative_disconnect_nmethods to run in a safepoint?
> 
> Probably not but I had to start somewhere... that being said I hope frequent code cache cleaning never gets on somebody's "hot path."

True.

> 
>> 
>> Why is this cast needed?
>> 
>> ((nmethod*)nm)->is_locked_by_vm()
> 
> good eye.
> 
>> 
>> One thing I don't really like is the new field in methodOop since it makes methodOops larger.  I'd like to avoid doing that if we could, particularly since the field is really only used in extreme cases.  Would it be possible to simply manage a list of these nmethods in the code cache instead?
> 
> Yes, I see what you mean, but I think that would make it all a little more complicated. Such a list would have to be managed on the sweeper side too.   I'll have to think about it more.

I agree that it's a little more complicated because you have to manage the list but it will end up being localized more I think.

tom

> 
>> 
>> tom
>> 
>> 
>> On Jan 6, 2010, at 2:44 PM, Eric Caspole wrote:
>> 
>>> Hi Vladimir,
>>> I updated my webrev to .02: I added a new field into CompileBroker to use for toggling the compiler on/off when the code cache gets full or when enough space is later cleared, to avoid changing UseCompiler on the fly. Also I consolidated all the places that had some "turn off compiler" code into one place in CompileBroker as we discussed before the holidays.
>>> Regards,
>>> Eric
>>> 
>>> 
>>> On Dec 16, 2009, at 3:29 PM, Vladimir Kozlov wrote:
>>> 
>>>> 
>>>> Eric Caspole wrote:
>>>>> I see what you mean. If there are other already disconnected methods (nm->method()->code() != nm) due to normal operation, can I safely assume they are already marked not entrant and they will go away by normal sweeping? Or should I hurry them along here? I  see how not
>>>> 
>>>> The assert(nm->method()->code()->is_not_entrant() is enough, I think.
>>>> 
>>>>> having the extra check means I might reset the counters many times for no reason, I'll change that.
>>>> 
>>>>>> It seems other thread set UseCompiler to false between the check and assert.
>>>>> So that means this could have happened in the existing system if the broker shut off the compiler due to fullness, racing with the java thread making this call. Yuck.
>>>> 
>>>> Yes. With your changes it could happened more frequently.
>>>> I think, it is related to what we talked about today on meeting:
>>>> 
>>>> # Consolidate "shut off compiler" logic into one place, and reconsider logic to turn compilers on and off.  Currently this logic is spread across three different places even without the unloading change.
>>>> 
>>>> May be we should use separate flag SuspendCompilation when we think
>>>> codecache could be freed and switch of UseCompiler when we give up.
>>>> 
>>>> Thanks,
>>>> Vladimir
>>>> 
>>>>>> 
>>>>>> Vladimir
>>>>>> 
>>>>>>>> % java -XX:ReservedCodeCacheSize=6M -XX:+PrintCodeCacheExtension -XX:+UseCodeCacheFlushing -XX:MinCodeCacheFlushingInterval=300 -XX:+PrintCompilation -XX:+PrintMethodFlushing -jar SPECjvm2008.jar -ikv compiler.sunflow compiler.compiler
>>>>>>>> 
>>>>>>>> #  Internal Error (/net/irkutsk/export/home/kvn/work2/hg/4360113/src/share/vm/runtime/compilationPolicy.cpp:124), pid=26531, tid=51
>>>>>>>> #  Error: assert(UseCompiler || CompileTheWorld,"UseCompiler should be set by now.")
>>>>>>>> 
>>>>>>>> Current thread (0x08c84400):  JavaThread "BenchmarkThread compiler.sunflow 6" [_thread_in_vm, id=51, stack(0xfa2c9000,0xfa319000)]
>>>>>>>> 
>>>>>>>> Stack: [0xfa2c9000,0xfa319000],  sp=0xfa317ad8,  free space=13afa319000k
>>>>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>>>>>> V  [libjvm.so+0x170235a] void VMError::report(outputStream*) + 0x5be
>>>>>>>> V  [libjvm.so+0x1703336] void VMError::report_and_die() + 0x586
>>>>>>>> V  [libjvm.so+0x7bf95d] void report_assertion_failure(const char*,int,const char*) + 0x61
>>>>>>>> V  [libjvm.so+0x6ccd4e] void SimpleCompPolicy::method_invocation_event(methodHandle,Thread*) + 0x20a
>>>>>>>> V  [libjvm.so+0xa68949] nmethod*InterpreterRuntime::frequency_counter_overflow_inner(JavaThread*,unsigned char*) + 0x1551
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Eric Caspole wrote:
>>>>>>>>> 4360113: Evict nmethods when code cache gets full
>>>>>>>>> http://cr.openjdk.java.net/~ecaspole/4360113/
>>>>>>>>> In this change, under a flag and off by default, when compilers notice the code cache is getting full, they will call a vm op that calls new code that attempts to speculatively unload the oldest half of the nmethods (based on the compile job id) by hiding the methodOop's ref to the nmethod in the new _saved_code field. Then execution resumes. After inline cache cleaning, callers will have to go back to the methodOop to get the verified entry point.
>>>>>>>>> At that point, the methodOop's _code field is restored from the _saved_code field and the methodOop/nmethod go back to their normal state.
>>>>>>>>> If a method so marked is not called by the second sweep cycle after the one where forced unloading happened, the nmethod will be marked non-entrant and got rid of by normal sweeping. That gives the app a few seconds to make progress and call its hottest methods.
>>>>>>>>> We chose to target the oldest half of nmethods due to a customer experience with a long-running app server, and despite multiple redeployments of the same web app, something was preventing old instances of the web app from ever getting unloaded. In that case, they ran into the code cache full problem so the most recent redeployment was running interpreter only. We have also observed that for many applications a lot of methods get compiled and used during the startup phase that are never used again.
>>>>>>>>> In this change there is also a timer based backoff, default of 30 seconds, so that if the normal state of the app is constantly triggering unloading, the unloading will stop and it will fall back to the existing situation of disabling the compiler.
>>>>>>>>> In my testing, this allows the program to quickly resume normal operation with no noticeable performance degradation.
>>>>>>>>> Thanks for your comments,
>>>>>>>>> Eric
>>>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
>