RFR(S): 8020151: PSR:PERF Large performance regressions when code cache is filled
Albert Noll
albert.noll at oracle.com
Thu Aug 22 06:16:37 PDT 2013
Hi Igor,
thanks again for your comments. You are right. I will run some
benchmarks to see if removing
the disconnect logic is feasible or not.
Best,
Albert
On 22.08.2013 12:39, Igor Veresov wrote:
> It's worth a try. But the "hotness" logic is probabilistic and
> imprecise - it will notice only methods that are on stack during
> safepoints. Those are going to be pretty narrow snapshots of activity.
> I suspect that for large flat profiles (like enterprise apps and
> friends) you could be missing methods that are rather warm in reality,
> which will cause recompilation oscillations. Although if the
> statistics are allowed to accumulate enough may be it's going to work
> out, an experiment will tell. The patch is a good start, the stack
> sampling, IMO, is totally the right approach for filtering out the hot
> methods.
>
> The "disconnect" logic on the other hand is sort of precise. Although,
> looking now at the code it's not quite clear to me how it works, it
> doesn't seem to be any patching going on to divert the control for the
> cases when the nmethod is called directly or through an IC. So I guess
> it's not really a full disconnect? Anyways, in theory, with some
> work, we can make the disconnect logic to precisely measure the time
> the method is inactive. Which should provide precise information about
> the warm/cold methods.
>
> Btw, also just noticed a bunch of flaws in the interaction of the
> disconnect logic and tiered. The nmethod's "reconnection" happens in
> CompileBroker::compile_method(), which firstly will be called by
> tiered only after a rather substantial number of invocations in the
> interpreter (up to 128), and secondly will be subject to all the
> prioritization rules (it probably should not), and also we don't check
> if the comp level of the reanimated nmethod matches the request. If
> the disconnect logic is to stay, the interpreter should be able to
> know if the method has saved code and be able to call into the runtime
> immediately to reanimate it.
>
> igor
>
> On Aug 21, 2013, at 10:27 PM, Albert Noll <albert.noll at oracle.com
> <mailto:albert.noll at oracle.com>> wrote:
>
>> Hi Igor,
>>
>> thanks for looking at the patch. Actually, I think - just as Vladimir
>> pointed out - that we can get rid of
>> the "disconnect" logic. We now have the hotness of a method and if
>> the code cache fills up, we and
>> we decide to schedule the method for removal, we set it to not_entrant.
>> It seems that adding the method to the list of disconnected methods
>> just buys a little more time until we decide to make the method
>> not-entrant. However, we can have the same effect by setting the
>> threshold differently.
>>
>> What do you think?
>>
>> Best,
>> Albert
>>
>> On 22.08.2013 10:02, Igor Veresov wrote:
>>> May be instead of "(_traversals > _last_flush_traversal_id + 2)" we
>>> should timestamp a method when it's disconnected, and then use a
>>> rule like if a method has been disconnected for k *
>>> reverse_free_ratio() seconds then it's ok to kill it. We can also
>>> sort the nmethods that pass that filter by the amount of time they
>>> were disconnected and select most likely candidates for flushing.
>>> This should allow to basically do disconnect/flush in every
>>> traversal, which should make things faster. Timestamps would be
>>> obtained only once per traversal or something like that. What do you
>>> think?
>>>
>>> Pretty cool idea to reverse-prioritize disconnects on hotness.
>>>
>>> igor
>>>
>>> On Aug 21, 2013, at 4:42 AM, Albert Noll <albert.noll at oracle.com
>>> <mailto:albert.noll at oracle.com>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> could I have reviews for this patch? Please note
>>>> that I do not yet feel very confident with the sweeper,
>>>> so please take a close look.
>>>>
>>>> jbs: https://jbs.oracle.com/bugs/browse/JDK-8020151
>>>> webrev: http://cr.openjdk.java.net/~anoll/8020151/webrev.00/
>>>> <http://cr.openjdk.java.net/%7Eanoll/8020151/webrev.00/>
>>>>
>>>>
>>>> Many thanks in advance,
>>>> Albert
>>>>
>>>>
>>>> Problem: There can be large performance regressions when the code
>>>> cache fills up. There are
>>>> several reasons for the performance regression: First (1), when the
>>>> code cache is full and methods
>>>> are speculatively disconnected, the oldest methods (based on
>>>> compilation ID) are scheduled for
>>>> flushing. This can result in flushing hot methods. Second (2), when
>>>> compilation is disabled due to a full
>>>> code cache, the number of sweeps can go down. A lower number of
>>>> sweep operations results
>>>> in slower method flushing.
>>>>
>>>> Solution:
>>>> Introduce a hotness counter that is set to a particular value
>>>> (e.g., 100) when there is an activation
>>>> of the method during stack scanning. The counter is decremented by
>>>> 1 every time the sweeper
>>>> is invoked.
>>>>
>>>> ad (1):
>>>> A VM operation that speculatively disconnects nmethods, selects
>>>> the methods that should be
>>>> flushed based on the hotness. For example, if 50% of the code
>>>> cache shall be flushed, we flush
>>>> those methods that have not been active while stack scanning for
>>>> the longest time. Note that
>>>> while this strategy is more likely to flush cold methods, it is
>>>> not clear to what extent the new
>>>> strategy fragments the code cache.
>>>>
>>>> Changes in NMethodSweeper::speculative_disconnect_nmethods(bool
>>>> is_full)
>>>>
>>>> ad (2)
>>>> Currently, methods are removed from the code cache if:
>>>> a) code cache is full
>>>> b) class is unloaded
>>>> c) method is replaced by another version (i.e., compiled with a
>>>> different tier)
>>>> d) deopt
>>>>
>>>> The current patch adds a 5-th possibility to remove a method
>>>> from the code cache.
>>>> In particular, if a method has not been active during stack
>>>> scanning for a long-enough
>>>> amount of time, the method is removed from the code cache. The
>>>> amount of time
>>>> required to flush the method depends on the available space in
>>>> the code cache.
>>>>
>>>> Here is one example: If a method was seen on a stack the hotness
>>>> counter
>>>> is set to 100. A sweep operation takes roughly place every
>>>> 100ms. I.e., it takes
>>>> 100ms * 100 = 10s until the hotness counter reaches 0. The
>>>> threshold that determines
>>>> if a method should be removed from the code cache is calculated
>>>> as follows:
>>>>
>>>> threshold = -100 + (CodeCache::reverse_free_ratio() *
>>>> NMethodSweepActivity)
>>>>
>>>> For example, if 25% of the code cache is free,
>>>> reverse_free_ratio returns 4.
>>>> The default value of NMethodSweepActivity is 10. As a result,
>>>> threshold = -60.
>>>> Consequently, all methods that have a hotness value smaller
>>>> than -60 (which
>>>> means they have not been seen on the stack for 16s) are
>>>> scheduled to be flushed
>>>> from the code cache. See an illustration of the threshold as a
>>>> function of the available
>>>> code cache in threshold.pdf
>>>>
>>>> Note that NMethodSweepActivity is a parameter that can be
>>>> specified via a -XX
>>>> flag.
>>>>
>>>> Changes in NMethodSweeper::sweep_code_cache()
>>>>
>>>>
>>>> A very preliminary performance evaluation looks promising. I used
>>>> the DaCapo
>>>> benchmarks where a series of benchmarks is executed in the same VM
>>>> instance.
>>>> See performance.pdf . The x-axis shows the benchmarks. Assume we
>>>> have 2 benchmarks
>>>> (BM). The execution sequence is as follows:
>>>>
>>>> BM1 (Run 1-1)
>>>> BM1 (Run 2-1)
>>>> BM2 (Run 1-1)
>>>> BM2 (Run 2-1)
>>>>
>>>> BM1 (Run 1-2)
>>>> BM1 (Run 2-2)
>>>> BM2 (Run 1-2)
>>>> BM2 (Run 2-2)
>>>>
>>>>
>>>> A value larger than 0 on the x-axis indicates that the version
>>>> including the proposed patch is faster.
>>>> I.e., the values are calculated as follows: (T_original /
>>>> T_with_patch) - 1. T is the execution time
>>>> (wall clock time) of the benchmark. ReservedCodeCacheSize is set to
>>>> 50m. I used three runs and
>>>> the arithmetic average to compare the numbers. I know, we need much
>>>> more data, however,
>>>> I think we can see a trend.
>>>>
>>>> The current patch does not trigger a warning that the code cache is
>>>> full and compilation has been
>>>> disabled.
>>>>
>>>> Please let me know that you think.
>>>> <threshold.pdf><performance.pdf>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20130822/57d007c0/attachment-0001.html
More information about the hotspot-compiler-dev
mailing list