RFR(S): 8020151: PSR:PERF Large performance regressions when code cache is filled

Vladimir Kozlov vladimir.kozlov at oracle.com
Mon Sep 16 11:46:37 PDT 2013


Yes, it is good.

Thanks,
Vladimir

On 9/16/13 6:20 AM, Albert Noll wrote:
> Hi,
>
> thanks again for the review. I addressed your comments.
> Here is the new webrev:
>
> http://cr.openjdk.java.net/~anoll/8020151/webrev.03/
>
> The version is currently in jprt. If you are fine with this version we
> can send the
> binaries to Scott so he can test it. Please let me know if you are fine
> with that.
>
> Best,
> Albert
>
>
>
> On 14.09.2013 00:00, Vladimir Kozlov wrote:
>> sweeper:
>>
>> Nice comment about nmethod removal!
>>
>> I thought you would remove the next check in mark_active_nmethods(); we
>> talked about that:
>>
>> if (CompiledIC_lock->is_locked() || Patching_lock->is_locked()) return;
>>
>> Swap next lines:
>>
>>  223   // Only set hotness counter
>>  224   } else {
>>
>> The next comment does not match the check (flushed_memory > 0); also
>> rename the variable to 'freed_memory':
>>
>>  367   // Note that typically several kB are released per sweep cycle (16MB).
>>
>> Next original comment was correct:
>>  397   // since the locks acquired below might safepoint.
>>
>> It means that a safepoint may happen while holding the lock; maybe:
>>
>>  397   // since a safepoint may happen while acquiring the locks below.
>>
>> Move the next comment below the 'if (nm->hotness_counter() < threshold)' check.
>>
>> I think you need to explain what the threshold formula means:
>>
>> +        double threshold = -reset_val + (CodeCache::reverse_free_ratio() * NmethodSweepActivity);
>>
>> The less space we have, the bigger reverse_free_ratio() gets; when it
>> exceeds reset_val/NmethodSweepActivity, the threshold becomes > 0 and we
>> start marking nmethods.
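>>
>> Maybe something along these lines (just a sketch of a possible comment, not
>> the actual change):
>>
>>   // The fuller the code cache, the larger CodeCache::reverse_free_ratio()
>>   // and hence the threshold, so the less cold an nmethod has to be before
>>   // it is marked for flushing. Once reverse_free_ratio() exceeds
>>   // reset_val / NmethodSweepActivity the threshold becomes positive and
>>   // nmethods are marked before their hotness counter even drops below zero.
>>   double threshold = -reset_val + (CodeCache::reverse_free_ratio() * NmethodSweepActivity);
>>   if (nm->hotness_counter() < threshold) {
>>     // candidate for flushing
>>   }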
>>
>> What happened to sort_nmethods_by_hotness()?
>>
>> arguments.cpp: add checks that the specified NmethodSweepActivity and
>> NmethodSweepFraction have reasonable values.
>>
>> Additional comments:
>>
>> I talked with Igor and we think that instead of setting and checking
>> traversal_count we need to add several additional states to the
>> nmethod and have a well-defined state transition machine.
>>
>> Thanks,
>> Vladimir
>>
>> On 9/13/13 1:11 PM, Albert Noll wrote:
>>> Hi Vladimir,
>>>
>>> thanks for looking at this. See comments inline:
>>>
>>> On 13.09.2013 21:16, Vladimir Kozlov wrote:
>>>> We need to ask PSR (Scott Oaks) to test these changes before we push
>>>> them. Prepare JPRT build bundles for them.
>>>>
>>> Started jprt (Stockholm) more than 5 hours ago - still in the queue.
>>>> I am glad that we can remove speculatively_disconnect without any
>>>> performance issues. But we need to ask Scott to test it.
>>> Yes, we have to test the patch on the original program which
>>> exposed the bug.
>>>>
>>>> You need to fix another bug (8022968), filed by Chris Pl.: when C2
>>>> fails to allocate a scratch buffer we do not report the code cache as
>>>> full. Otherwise we continue to compile and fail without sweeping. Currently
>>>> handle_full_code_cache() is called only from ciEnv::register_method().
>>>> See the calls to init_scratch_buffer_blob() and cb->initialize() in
>>>> Compile::init_buffer().
>>>>
>>> Yes, I'll look at this on Monday. I think it should be fairly
>>> independent of this change.
>>>> Also you need to coordinate your changes with Chris's changes for:
>>>>
>>>> 8016270: "CodeCache is full" message and compilation disabled, but
>>>> codecache is not actually full
>>>>
>>>> It switches off compilation gradually based on the size of the method
>>>> that failed to compile.
>>>>
>>> I have to look at this bug in more detail.
>>>> About changes:
>>>>
>>>> nmethod.cpp:
>>>>
>>>> Adjust the comment to say that the race happens during sweeping:
>>>>
>>>>     // transition its state from 'not_entrant' to 'zombie'
>>>>     // during CodeCache sweeping without having to wait
>>>>     // for stack scanning.
>>>>
>>>> mark_as_seen_on_stack() has an assert which you will hit. I am surprised
>>>> that you don't. You can relax the assert to check is_alive() instead.
>>>>
>>> Thanks for catching this. That was a last second change.
>>>> compileBroker.cpp:
>>>>
>>>> Why do you need this? And why the scaling?
>>>>
>>>> +        // Only one thread at a time can do sweeping. Scale the
>>>> +        // wait time according to the number of compiler threads.
>>>> +        wait_time = 100 * CICompilerCount;
>>>>
>>> I am not sure if we need this. The idea behind it is that currently only
>>> one thread can sweep at a time. The scaling has the effect that, on
>>> average, there is at least one sweep every 100ms if the code cache is full.
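>>> For example (illustrative numbers only): with CICompilerCount = 4, each
>>> compiler thread waits 400ms between sweep attempts, so with four threads
>>> taking turns there is still, on average, roughly one sweep attempt every
>>> 100ms, assuming the wait periods are spread out in time.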
>>>> Swap next lines:
>>>>
>>>> 1953         NMethodSweeper::possibly_sweep();
>>>> 1954         NMethodSweeper::log_sweep("disable_compiler");
>>>>
>>>>
>>> I'll change that.
>>>> I will review sweeper.cpp changes later.
>>>>
>>> Many thanks,
>>> Albert
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 9/13/13 8:48 AM, Albert Noll wrote:
>>>>> Hi,
>>>>>
>>>>> thanks again for the feedback. This is the updated version.
>>>>> http://cr.openjdk.java.net/~anoll/8020151/webrev.02/
>>>>>
>>>>> @Vladimir:
>>>>> I hope I could fix the code issues properly. I tried your suggested
>>>>> change in JavaCalls::call_helper() but it did not lead
>>>>> to more code reanimation.
>>>>>
>>>>> @Chris P.
>>>>> Thanks again for your feedback. You are right, the algorithm could be
>>>>> made more efficient.
>>>>> However, I did not consider that code is being re-animated, so it
>>>>> probably does not help
>>>>> to reduce fragmentation. We are working on a patch where we have
>>>>> multiple code heaps.
>>>>> This should reduce fragmentation, since we can put stubs, adapters, and
>>>>> code compiled with different tiers into different heaps. More about this
>>>>> will most likely follow next week.
>>>>>
>>>>> I did experiments that show the performance of the current version.
>>>>> Short version:
>>>>> The current patch performs, on the benchmarks used, significantly
>>>>> better than the default way of handling a full code cache. This version
>>>>> has no performance regression when the code cache does not fill up.
>>>>>
>>>>> More detailed data can be found here:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8020151
>>>>>
>>>>>
>>>>> Change since last version (webrev.01) :
>>>>>
>>>>> Since the standard way of handling a full code cache
>>>>> (speculatively_disconnect + re-animate) performs worse (and does not
>>>>> solve the problem), I see no reason to keep it. As a result, I removed
>>>>> that code.
>>>>>
>>>>> This is a large change to an important part of HotSpot; detailed and
>>>>> critical reviews are very welcome.
>>>>>
>>>>> Best,
>>>>> Albert
>>>>>
>>>>>
>>>>> On 04.09.2013 22:33, Vladimir Kozlov wrote:
>>>>>> On 9/4/13 4:07 AM, Albert Noll wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> here is an updated version of the patch. In summary, changes to the
>>>>>>> previous version are:
>>>>>>>
>>>>>>> 1) Adapt NmethodSweepFraction based on the code cache size:
>>>>>>>      Before the patch, NmethodSweepFraction was defined as a
>>>>>>> constant
>>>>>>> value.
>>>>>>>      I.e., NmethodSweepFraction was 16 for a code cache size of
>>>>>>> 32m as
>>>>>>> well as a
>>>>>>>      code cache size of 256m. This seems unreasonable, since
>>>>>>> NmethodSweepFraction
>>>>>>>      was originally introduced to keep the time spent in the sweeper
>>>>>>> (note that sweeping
>>>>>>>      was originally done at safepoints) at a reasonable level.
>>>>>>
>>>>>> Albert,
>>>>>>
>>>>>> General question first. Do we still need to sweep in fractions since
>>>>>> we don't do sweeping at safepoint and it stops for safepoints? The
>>>>>> only drawback is that one compiler thread is used for sweeping and
>>>>>> can't compile while it sweeps. But we will address that later by using a
>>>>>> special sweeping thread.
>>>>>>
>>>>>>>      The patch adapts NmethodSweepFraction to the
>>>>>>> ReservedCodeCacheSize.
>>>>>>> I.e.,
>>>>>>>      each sweep operation roughly  covers 16m of the code cache.
>>>>>>
>>>>>> Why do you stop at 16?
>>>>>>
>>>>>> +    } else if (ReservedCodeCacheSize > (256 * M)) {
>>>>>> +      FLAG_SET_DEFAULT(NmethodSweepFraction, 16);
>>>>>>
>>>>>> ReservedCodeCacheSize could be 2GB, so one sweep would have to process
>>>>>> 128MB.
>>>>>> Next should cover all cases, I think:
>>>>>>
>>>>>> FLAG_SET_DEFAULT(NmethodSweepFraction, 1 + ReservedCodeCacheSize / (16 * M))
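>>>>>>
>>>>>> For illustration: with ReservedCodeCacheSize = 256m that gives
>>>>>> 1 + 256/16 = 17 fractions, and with 2g it gives 1 + 128 = 129, so each
>>>>>> sweep still covers roughly 16m regardless of the code cache size.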
>>>>>>
>>>>>>>
>>>>>>> 2) The parameter NmethodSweepFraction is replaced by a function that
>>>>>>> computes the
>>>>>>>      NmethodSweepFraction based on the code cache size. See
>>>>>>> 'sweeper.cpp' for a detailed
>>>>>>>      description
>>>>>>>
>>>>>>> 3) The initial hotness counter depends on the code cache size. More
>>>>>>> specifically, the initial
>>>>>>>      hotness counter is 2*ReservedCodeCacheSize. As a result, a
>>>>>>> method
>>>>>>> stays longer in the
>>>>>>>      code cache if ReservedCodeCacheSize is larger.
>>>>>>
>>>>>> I don't like spreading '(ReservedCodeCacheSize / M) * 2' through the
>>>>>> code. Use NMethodSweeper::get_hotness_counter_reset_val() in nmethod
>>>>>> constructors.
>>>>>>
>>>>>> Also you need to check for min value if ReservedCodeCacheSize < 1Mb.
>>>>>>
>>>>>> Also drop 'get_' from the name. We usually don't use 'get_' in accessor
>>>>>> methods. The same for nmethod::get_hotness_counter().
>>>>>>
>>>>>> I already asked you to make hotness_counter_decay a member of nmethod
>>>>>> so that you don't need to pass it to dec_hotness_counter(), or remove it
>>>>>> altogether since it is always 1:
>>>>>>
>>>>>> void dec_hotness_counter() { _hotness_counter--;  }
>>>>>>
>>>>>> I would prefer that all code which modifies _hotness_counter be in the
>>>>>> nmethod class.
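>>>>>>
>>>>>> Something like this (just a sketch of the shape I have in mind; exact
>>>>>> names are illustrative):
>>>>>>
>>>>>>   // nmethod.hpp
>>>>>>   int  hotness_counter() const    { return _hotness_counter; }
>>>>>>   void set_hotness_counter(int c) { _hotness_counter = c; }
>>>>>>   void dec_hotness_counter()      { _hotness_counter--; }
>>>>>>
>>>>>>   // nmethod constructors
>>>>>>   _hotness_counter = NMethodSweeper::hotness_counter_reset_val();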
>>>>>>
>>>>>>>
>>>>>>> 4) Newly compiled methods are guaranteed to not be evicted by the
>>>>>>> sweeper for 10
>>>>>>>      sweep cycles. This ensures that newly compiled methods are not
>>>>>>>      immediately made not-entrant after compilation.
>>>>>>>
>>>>>>> 5) The hotness counter is reset EVERY TIME active methods are
>>>>>>> scanned.
>>>>>>> In the previous version
>>>>>>>      the hotness counter was only reset after a full sweep cycle of
>>>>>>> the
>>>>>>> code cache. Resetting the
>>>>>>>     hotness counter more frequently provides a better hotness
>>>>>>> coverage
>>>>>>> of methods.
>>>>>>>
>>>>>>> 6) Methods are flushed in blocks of 1m. The algorithm computes the
>>>>>>> average hotness of
>>>>>>>      an nmethod block and evicts the entire block. This should reduce
>>>>>>> fragmentation.
>>>>>>>
>>>>>>> Please let me know what you think about these changes. Performance
>>>>>>> results can be found at:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8020151
>>>>>>> I will continuously provide more results.
>>>>>>>
>>>>>>> Here is the new webrev:
>>>>>>> http://cr.openjdk.java.net/~anoll/8020151/webrev.01/
>>>>>>
>>>>>> Very nice comment in sweeper.hpp. Could you please extend it?
>>>>>>
>>>>>> 2) sweep nmethods
>>>>>>
>>>>>> Sweeping is currently done by a compiler thread between compilations,
>>>>>> or at least every 5 seconds (NmethodSweepCheckInterval) when the
>>>>>> CodeCache is full.
>>>>>>
>>>>>> The comment is not clear that nmethods can be marked as not-entrant
>>>>>> by different code (deoptimization, dependency invalidation, replacing an
>>>>>> old nmethod) and not just by the sweeper (which does that only for
>>>>>> flushed nmethods).
>>>>>>
>>>>>> Each nmethod state change happens during a separate sweep. It may
>>>>>> take at least 3 sweeps before an nmethod's space is freed.
>>>>>>
>>>>>> 3) code cache flushing
>>>>>> ... as a VM operation at safepoint.
>>>>>>
>>>>>> The flushing VM operation requires a safepoint.
>>>>>>
>>>>>> As you pointed out, the current code re-animates nmethods only if we
>>>>>> get a compilation request. It means that it depends on profiling
>>>>>> counters. It could be the case that a method has to be executed in the
>>>>>> interpreter for some time again to hit the compilation threshold.
>>>>>>
>>>>>> I think we need to re-animate the nmethod immediately during a Java call:
>>>>>>
>>>>>> JavaCalls::call_helper()
>>>>>>
>>>>>>   if (CompilationPolicy::must_be_compiled(method)) {
>>>>>>     CompileBroker::compile_method(method, InvocationEntryBci,
>>>>>>                                   CompilationPolicy::policy()->initial_compile_level(),
>>>>>>                                   methodHandle(), 0, "must_be_compiled", CHECK);
>>>>>> + } else if (UseCodeCacheFlushing) {
>>>>>> +   nmethod* saved = CodeCache::reanimate_saved_code(method());
>>>>>> +   if (saved != NULL) {
>>>>>> +     method->set_code(method, saved);
>>>>>> +   }
>>>>>>   }
>>>>>>
>>>>>> Can you verify how this works?
>>>>>>
>>>>>> I will send additional comments when I look through the rest of the code.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>> Albert
>>>>>>>
>>>>>>>
>>>>>>> On 22.08.2013 15:16, Albert Noll wrote:
>>>>>>>> Hi Igor,
>>>>>>>>
>>>>>>>> thanks again for your comments. You are right. I will run some
>>>>>>>> benchmarks to see if removing
>>>>>>>> the disconnect logic is feasible or not.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Albert
>>>>>>>>
>>>>>>>> On 22.08.2013 12:39, Igor Veresov wrote:
>>>>>>>>> It's worth a try. But the "hotness" logic is probabilistic and
>>>>>>>>> imprecise - it will notice only methods that are on stack during
>>>>>>>>> safepoints. Those are going to be pretty narrow snapshots of
>>>>>>>>> activity. I suspect that for large flat profiles (like enterprise
>>>>>>>>> apps and friends) you could be missing methods that are rather
>>>>>>>>> warm
>>>>>>>>> in reality, which will cause recompilation oscillations.
>>>>>>>>> Although if
>>>>>>>>> the statistics are allowed to accumulate enough, maybe it's going to
>>>>>>>>> work out; an experiment will tell. The patch is a good start; the
>>>>>>>>> stack sampling, IMO, is totally the right approach for filtering out
>>>>>>>>> the hot methods.
>>>>>>>>>
>>>>>>>>> The "disconnect" logic, on the other hand, is sort of precise.
>>>>>>>>> Although, looking at the code now, it's not quite clear to me how it
>>>>>>>>> works: there doesn't seem to be any patching going on to divert
>>>>>>>>> control for the cases when the nmethod is called directly or through
>>>>>>>>> an IC. So I guess it's not really a full disconnect? Anyway, in
>>>>>>>>> theory, with some work, we can make the disconnect logic precisely
>>>>>>>>> measure the time the method is inactive, which should provide precise
>>>>>>>>> information about warm/cold methods.
>>>>>>>>>
>>>>>>>>> Btw, I also just noticed a bunch of flaws in the interaction between
>>>>>>>>> the disconnect logic and tiered. The nmethod's "reconnection" happens
>>>>>>>>> in CompileBroker::compile_method(), which firstly will be called by
>>>>>>>>> tiered only after a rather substantial number of invocations in the
>>>>>>>>> interpreter (up to 128), secondly will be subject to all the
>>>>>>>>> prioritization rules (which it probably should not be), and also we
>>>>>>>>> don't check if the comp level of the reanimated nmethod matches the
>>>>>>>>> request. If the disconnect logic is to stay, the interpreter should
>>>>>>>>> be able to know if the method has saved code and be able to call into
>>>>>>>>> the runtime immediately to reanimate it.
>>>>>>>>>
>>>>>>>>> igor
>>>>>>>>>
>>>>>>>>> On Aug 21, 2013, at 10:27 PM, Albert Noll <albert.noll at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Igor,
>>>>>>>>>>
>>>>>>>>>> thanks for looking at the patch. Actually, I think - just as
>>>>>>>>>> Vladimir pointed out - that we can get rid of
>>>>>>>>>> the "disconnect" logic. We now have the hotness of a method, and if
>>>>>>>>>> the code cache fills up and we decide to schedule the method for
>>>>>>>>>> removal, we set it to not_entrant.
>>>>>>>>>> It seems that adding the method to the list of disconnected
>>>>>>>>>> methods
>>>>>>>>>> just buys a little more time until we decide to make the method
>>>>>>>>>> not-entrant. However, we can have the same effect by setting the
>>>>>>>>>> threshold differently.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Albert
>>>>>>>>>>
>>>>>>>>>> On 22.08.2013 10:02, Igor Veresov wrote:
>>>>>>>>>>> Maybe instead of "(_traversals > _last_flush_traversal_id + 2)" we
>>>>>>>>>>> should timestamp a method when it's disconnected, and then use a
>>>>>>>>>>> rule like: if a method has been disconnected for
>>>>>>>>>>> k * reverse_free_ratio() seconds, then it's ok to kill it. We can
>>>>>>>>>>> also sort the nmethods that pass that filter by the amount of time
>>>>>>>>>>> they were disconnected and select the most likely candidates for
>>>>>>>>>>> flushing. This should allow us to basically do disconnect/flush in
>>>>>>>>>>> every traversal, which should make things faster. Timestamps would
>>>>>>>>>>> be obtained only once per traversal or something like that. What do
>>>>>>>>>>> you think?
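>>>>>>>>>>>
>>>>>>>>>>> In (hypothetical) code the rule could look roughly like this, with
>>>>>>>>>>> disconnect_timestamp() recorded once when the nmethod is
>>>>>>>>>>> disconnected and traversal_time obtained once per traversal:
>>>>>>>>>>>
>>>>>>>>>>>   double idle = traversal_time - nm->disconnect_timestamp();
>>>>>>>>>>>   if (idle > k * CodeCache::reverse_free_ratio()) {
>>>>>>>>>>>     // passed the filter; sort these by idle time and flush the
>>>>>>>>>>>     // most likely candidates in this traversal
>>>>>>>>>>>   }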
>>>>>>>>>>>
>>>>>>>>>>> Pretty cool idea to reverse-prioritize disconnects on hotness.
>>>>>>>>>>>
>>>>>>>>>>> igor
>>>>>>>>>>>
>>>>>>>>>>> On Aug 21, 2013, at 4:42 AM, Albert Noll <albert.noll at oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> could I have reviews for this patch? Please note
>>>>>>>>>>>> that I do not yet feel very confident with the sweeper,
>>>>>>>>>>>> so please take a close look.
>>>>>>>>>>>>
>>>>>>>>>>>> jbs: https://jbs.oracle.com/bugs/browse/JDK-8020151
>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~anoll/8020151/webrev.00/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Many thanks in advance,
>>>>>>>>>>>> Albert
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Problem: There can be large performance regressions when the
>>>>>>>>>>>> code
>>>>>>>>>>>> cache fills up. There are
>>>>>>>>>>>> several reasons for the performance regression: First (1), when
>>>>>>>>>>>> the code cache is full and methods
>>>>>>>>>>>> are speculatively disconnected, the oldest methods (based on
>>>>>>>>>>>> compilation ID) are scheduled for
>>>>>>>>>>>> flushing. This can result in flushing hot methods. Second (2),
>>>>>>>>>>>> when compilation is disabled due to a full
>>>>>>>>>>>> code cache, the number of sweeps can go down. A lower number of
>>>>>>>>>>>> sweep operations results
>>>>>>>>>>>> in slower method flushing.
>>>>>>>>>>>>
>>>>>>>>>>>> Solution:
>>>>>>>>>>>> Introduce a hotness counter that is set to a particular value
>>>>>>>>>>>> (e.g., 100) when there is an activation
>>>>>>>>>>>> of the method during stack scanning. The counter is
>>>>>>>>>>>> decremented by
>>>>>>>>>>>> 1 every time the sweeper
>>>>>>>>>>>> is invoked.
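>>>>>>>>>>>>
>>>>>>>>>>>> In simplified pseudo-code (method names are illustrative), the two
>>>>>>>>>>>> places that touch the counter look roughly like this:
>>>>>>>>>>>>
>>>>>>>>>>>>   // during stack scanning, for every nmethod with an activation:
>>>>>>>>>>>>   nm->set_hotness_counter(100);
>>>>>>>>>>>>
>>>>>>>>>>>>   // in every sweeper invocation, for every nmethod visited:
>>>>>>>>>>>>   nm->dec_hotness_counter();  // decrement by 1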
>>>>>>>>>>>>
>>>>>>>>>>>> ad (1):
>>>>>>>>>>>>   A VM operation that speculatively disconnects nmethods selects
>>>>>>>>>>>>   the methods that should be flushed based on their hotness. For
>>>>>>>>>>>>   example, if 50% of the code
>>>>>>>>>>>> cache shall be flushed, we flush
>>>>>>>>>>>>   those methods that have not been active while stack scanning
>>>>>>>>>>>> for
>>>>>>>>>>>> the longest time. Note that
>>>>>>>>>>>>   while this strategy is more likely to flush cold methods,
>>>>>>>>>>>> it is
>>>>>>>>>>>> not clear to what extent the new
>>>>>>>>>>>>   strategy fragments the code cache.
>>>>>>>>>>>>
>>>>>>>>>>>>   Changes in
>>>>>>>>>>>> NMethodSweeper::speculative_disconnect_nmethods(bool
>>>>>>>>>>>> is_full)
>>>>>>>>>>>>
>>>>>>>>>>>> ad (2)
>>>>>>>>>>>>   Currently, methods are removed from the code cache if:
>>>>>>>>>>>>     a) code cache is full
>>>>>>>>>>>>     b) class is unloaded
>>>>>>>>>>>>     c) method is replaced by another version (i.e., compiled
>>>>>>>>>>>> with
>>>>>>>>>>>> a different tier)
>>>>>>>>>>>>     d) deopt
>>>>>>>>>>>>
>>>>>>>>>>>>    The current patch adds a fifth possibility to remove a method
>>>>>>>>>>>> from the code cache.
>>>>>>>>>>>>    In particular, if a method has not been active during stack
>>>>>>>>>>>> scanning for a long-enough
>>>>>>>>>>>>    amount of time, the method is removed from the code
>>>>>>>>>>>> cache. The
>>>>>>>>>>>> amount of time
>>>>>>>>>>>>    required to flush the method depends on the available
>>>>>>>>>>>> space in
>>>>>>>>>>>> the code cache.
>>>>>>>>>>>>
>>>>>>>>>>>>    Here is one example: If a method was seen on a stack, the
>>>>>>>>>>>>    hotness counter is set to 100. A sweep operation takes place
>>>>>>>>>>>>    roughly every 100ms. I.e., it takes
>>>>>>>>>>>>    100ms * 100 = 10s until the hotness counter reaches 0. The
>>>>>>>>>>>> threshold that determines
>>>>>>>>>>>>    if a method should be removed from the code cache is
>>>>>>>>>>>> calculated
>>>>>>>>>>>> as follows:
>>>>>>>>>>>>
>>>>>>>>>>>>    threshold = -100 + (CodeCache::reverse_free_ratio() * NMethodSweepActivity)
>>>>>>>>>>>>
>>>>>>>>>>>>     For example, if 25% of the code cache is free,
>>>>>>>>>>>> reverse_free_ratio returns 4.
>>>>>>>>>>>>     The default value of NMethodSweepActivity is 10. As a
>>>>>>>>>>>> result,
>>>>>>>>>>>> threshold = -60.
>>>>>>>>>>>>     Consequently, all methods that have a hotness value smaller
>>>>>>>>>>>> than -60 (which
>>>>>>>>>>>>     means they have not been seen on the stack for 16s) are
>>>>>>>>>>>> scheduled to be flushed
>>>>>>>>>>>>     from the code cache. See an illustration of the threshold
>>>>>>>>>>>> as a
>>>>>>>>>>>> function of the available
>>>>>>>>>>>>     code cache in threshold.pdf
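>>>>>>>>>>>>
>>>>>>>>>>>>     (The 16s follow from the counter having to drop from 100 to
>>>>>>>>>>>>     below -60, i.e. roughly 160 sweep cycles at ~100ms each.)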
>>>>>>>>>>>>
>>>>>>>>>>>>     Note that NMethodSweepActivity is a parameter that can be
>>>>>>>>>>>> specified via a -XX
>>>>>>>>>>>>     flag.
>>>>>>>>>>>>
>>>>>>>>>>>> Changes in NMethodSweeper::sweep_code_cache()
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> A very preliminary performance evaluation looks promising. I
>>>>>>>>>>>> used
>>>>>>>>>>>> the DaCapo
>>>>>>>>>>>> benchmarks where a series of benchmarks is executed in the
>>>>>>>>>>>> same VM
>>>>>>>>>>>> instance.
>>>>>>>>>>>> See performance.pdf . The x-axis shows the benchmarks.
>>>>>>>>>>>> Assume we
>>>>>>>>>>>> have 2 benchmarks
>>>>>>>>>>>> (BM). The execution sequence is as follows:
>>>>>>>>>>>>
>>>>>>>>>>>> BM1 (Run 1-1)
>>>>>>>>>>>> BM1 (Run 2-1)
>>>>>>>>>>>> BM2 (Run 1-1)
>>>>>>>>>>>> BM2 (Run 2-1)
>>>>>>>>>>>>
>>>>>>>>>>>> BM1 (Run 1-2)
>>>>>>>>>>>> BM1 (Run 2-2)
>>>>>>>>>>>> BM2 (Run 1-2)
>>>>>>>>>>>> BM2 (Run 2-2)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> A value larger than 0 on the y-axis indicates that the version
>>>>>>>>>>>> including the proposed patch is faster.
>>>>>>>>>>>> I.e., the values are calculated as follows: (T_original /
>>>>>>>>>>>> T_with_patch) - 1. T is the execution time
>>>>>>>>>>>> (wall clock time) of the benchmark. ReservedCodeCacheSize is
>>>>>>>>>>>> set
>>>>>>>>>>>> to 50m.  I used three runs and
>>>>>>>>>>>> the arithmetic average to compare the numbers. I know we need
>>>>>>>>>>>> much more data; however, I think we can see a trend.
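>>>>>>>>>>>>
>>>>>>>>>>>> (For example, a value of 0.1 for a benchmark means
>>>>>>>>>>>> T_original / T_with_patch = 1.1, i.e., the patched version is about
>>>>>>>>>>>> 10% faster.)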
>>>>>>>>>>>>
>>>>>>>>>>>> The current patch does not trigger a warning that the code
>>>>>>>>>>>> cache
>>>>>>>>>>>> is full and compilation has been
>>>>>>>>>>>> disabled.
>>>>>>>>>>>>
>>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>> <threshold.pdf><performance.pdf>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>
>

