RFR(S): 8027593: performance drop with constrained codecache starting with hs25 b111
Albert Noll
albert.noll at oracle.com
Sun Nov 10 23:38:36 PST 2013
Hi Igor,
thanks for looking into this. My only concern with printing the warning
under -XX:+PrintCodeCache is
that we change existing behavior of PrintCodeCache. If this is not an
issue, I am fine with it.
I think that the cleanest solution is to introduce a new product flag,
(e.g., -XX:+PrintFullCodeCacheWarnings) and default that value to true.
I would set the default value to true,
since the code cache is not expected to become full with default code
cache sizes. If the code cache
becomes full nevertheless or the user sets a small code cache size we
could print a warning like this:
> Compiler was disabled because there is insufficient contiguous free
space in the code cache.
> Free space: 2k requested size: 4k
> Try to increase code cache with -XX:ReservedCodeCacheSize= or try to
increase code cache
> sweeper activity with -XX:NmethodSweepActivity= (default value is 10).
> Disable this warning with: -XX:-PrintFullCodeCacheWarnings
I would not print all the information that we print right now, because I
think it is too detailed.
E.g., I am not sure if it is helpful to print the bounds if the code
cache. Also, I think we should
substract CodeCacheMinimumFreeSpace from unallocated_capacity. It is
confusing that we
run out of code cache (and disable compilation) although there is still
500k left.
Please let me know what you think.
Best,
Albert
On 11/08/2013 11:44 PM, Igor Veresov wrote:
> Hi Albert,
>
> I talked with Vladimir and we have a suggestion about the warning. How about we print it only the first time by default and every time if PrintCodeCache is set? The fact that it is printed even once should suggest the user that the code cache size is something that needs attention, and that the VM is already operating with the constraint code cache space.
>
> Thanks!
> igor
>
> On Nov 8, 2013, at 7:32 AM, Albert Noll <albert.noll at oracle.com> wrote:
>
>> Hi Vladimir,
>>
>> thanks for looking at the patch. I hope that this version addresses all issues that
>> you brought up. Here is a high-level overview of the changes since the last version:
>>
>> * We keep track of code cache state changes that also happen outside the sweeper.
>> I re-installed notify, which is now called report_state_change() and is doing the job.
>> report_state_change() checks if enough state has changed and enables the sweeper
>> (it sets _should_sweep) to true. We reset _bytes_changed only once we have finished
>> a while sweep cycle. That seems to make sense to me.
>>
>> * I added code that prints out every 10th warning that the compiler has been disabled.
>> I also added a warning that compilation has been enabled again. I think if we print a message
>> that compilation has been disabled, we should also print a message (maybe not a warning)
>> that was enabled again.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~anoll/8027593/webrev.02/
>>
>> Best,
>> Albert
>>
>> On 11/07/2013 11:04 PM, Vladimir Kozlov wrote:
>>> On 11/7/13 1:36 PM, Vladimir Kozlov wrote:
>>>> Nice work, Albert
>>>>
>>>> One concern is transition "alive -> not_entrant" is counted only when
>>>> the nmethod needs to be flushed because you removed in notify() in
>>>> nmethod::make_not_entrant_or_zombie(). Also make_zombie() is called from
>>>> other places.
>>>> I think _bytes_changed should be updated by NMethodSweeper::notify() so
>>>> don't remove it from nmethod.cpp. _bytes_changed should be reset when we
>>>> finished sweep instead of before each sweep.
>>> May be reset in both places actually. One to check that there were updates between sweeps and an other during sweep (as you do currently).
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>> Can you keep comments in code which initialize static variables (first
>>>> changes in sweeper.cpp)?
>>>>
>>>> Please narrow your main comment, chars 90 chars per line.
>>>>
>>>> What is the second place? (initialization should not be count):
>>>>
>>>> + // of the two places where should_sweep can be set to true.
>>>>
>>>> You need to clear state in the comment conditions when we sweep. Like
>>>> you did in the replay:
>>>>
>>>> (1) The code cache is getting full
>>>> (2) There are sufficient state changes in the last sweep.
>>>> (3) We have not been sweeping for 'some time'
>>>>
>>>> Split into 2 lines:
>>>>
>>>> + int wait_until_next_sweep = (ReservedCodeCacheSize / (16 * M)) -
>>>> time_since_last_sweep - CodeCache::reverse_free_ratio();
>>>>
>>>> In the print format do not use %p, use PTR_FORMAT for 'nm'.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 11/7/13 3:27 AM, Albert Noll wrote:
>>>>> The previous mail contains an error. See inline.
>>>>>
>>>>> Albert
>>>>>
>>>>>
>>>>> On 11/07/2013 12:20 PM, Albert Noll wrote:
>>>>>> Vladimir, Igor, John, thanks for looking at the patch.
>>>>>> Here is the updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~anoll/8027593/webrev.01/
>>>>>>
>>>>>> Please see comments inline.
>>>>>>
>>>>>>
>>>>>> On 11/06/2013 02:52 AM, John Rose wrote:
>>>>>>> Good idea.
>>>>>>>
>>>>>>> -- John (on my iPhone)
>>>>>>>
>>>>>>> On Nov 5, 2013, at 3:11 PM, Igor Veresov <igor.veresov at oracle.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Looks good. It’s not related to the change, but could you please
>>>>>>>> consider adding some printing under PrintMethodFlushing && Verbose
>>>>>>>> for the case when the method is made not entrant and include hotness
>>>>>>>> and threshold values?
>>>>>>>>
>>>>>>>> igor
>>>>>> I also agree. I added the print.
>>>>>>>> On Nov 5, 2013, at 10:09 AM, Vladimir Kozlov
>>>>>>>> <vladimir.kozlov at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> On 11/5/13 6:44 AM, Albert Noll wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> could I get reviews for this small patch?
>>>>>>>>>>
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8027593
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~anoll/8027593/webrev.00/
>>>>>>>>>>
>>>>>>>>>> Problem: The implementation of the sweeper (8020151) causes a
>>>>>>>>>> performance regression for small code cache sizes. There
>>>>>>>>>> are two issues that cause this regression:
>>>>>>>>>> 1) NmethodSweepFraction is only adjusted according to the
>>>>>>>>>> ReservedCodecacheSize if TieredCompilation is enabled. As a
>>>>>>>>>> result, NmethodSweepFraction remains 16 (if TieredCompilation is
>>>>>>>>>> not used). This is way too large for small code cache
>>>>>>>>>> sizes (e.g., <5m).
>>>>>>>>>> 2) _request_mark_phase (sweeper.cpp) is initialized to false. As a
>>>>>>>>>> result, mark_active_nmethods() did not set
>>>>>>>>>> _invocations and _current, which results in not invoking the
>>>>>>>>>> sweeper (calling sweep_code_cache()) at all. When
>>>>>>>>>> TieredCompilation is enabled this was not an issue, since
>>>>>>>>>> NmethodSweeper::notify() (which sets _request_mark_phase) is
>>>>>>>>>> called much more frequently.
>>>>>>>>>>
>>>>>>>>>> Solution: 1) Move setting of NmethodSweepFraction so that it is
>>>>>>>>>> always executed.
>>>>>>>>> Good.
>>>>>>>>>
>>>>>> The place where I moved the adaption of NmethodSweepFraction is not
>>>>>> good, since the
>>>>>> the code cache size is adapted later for tiered. The current version
>>>>>> should be fine.
>>>>>>>>>> Solution: 2) Remove need_marking_phase(),
>>>>>>>>>> request_nmethod_marking(), and reset_nmetod_marking().
>>>>>>>>>> I think that these checks are not needed since
>>>>>>>>>> 8020151, since we do stack scanning of
>>>>>>>>>> active nmethods irrespective of the value of
>>>>>>>>>> what need_marking_phase() returns. Since
>>>>>>>>>> the patch removes need_marking_phase() printing
>>>>>>>>>> out the warning (line 327 in
>>>>>>>>>> sweeper.cpp) is incorrect, i.e., we continue to
>>>>>>>>>> invoke the sweeper. I removed the warning
>>>>>>>>>> and the associated code.
>>>>>>>>> Don't put mark_active_nmethods() under !UseCodeCacheFlushing
>>>>>>>>> condition. We always want to reclaim space in codecache.
>>>>>> Done.
>>>>>>>>> To do nmethod marking at each safepoint is fine (we have to do it
>>>>>>>>> to make sure we did not miss live nmethods). But with your changes
>>>>>>>>> each safepoint triggers sweep. Do we really need sweep so
>>>>>>>>> frequently? Why to sweep if there were no nmethods state change and
>>>>>>>>> there is enough space in CodeCache. So I am not sure about removing
>>>>>>>>> _request_mark_phase, init it with 'true' is okay.
>>>>>> I agree. The current patch contains a more sophisticated logic to
>>>>>> determine when we call the
>>>>>> sweeper. The bottom line is that we do not invoke the sweeper only if:
>>>>> !!!!We DO INVOKE the sweeper only if:
>>>>>> (1) The code cache is getting full and/or
>>>>>> (2) There are sufficient state changes in the last sweep.
>>>>> !!!!!(3) We have not been sweeping for 'some time'
>>>>>> The patch contains a detailed description + examples of the logic. I
>>>>>> tested the patch
>>>>>> with small code cache sizes (specjvm98 + <4m code cache), medium-sized
>>>>>> code cache
>>>>>> (128m + nashorn + octane), and large code cache (240m + nashorn +
>>>>>> octane). The measurements
>>>>>> indicate that with the current logic in place, we can reduce the
>>>>>> number of sweeps by 50% for
>>>>>> medium code cache sizes without increasing the maximum used code cache
>>>>>> size. The difference
>>>>>> is more or less not significant.
>>>>>>
>>>>>> Please let me know what you think about it. The main disadvantage I
>>>>>> see with this change is that
>>>>>> it makes reasoning about the sweeper harder than it was before.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>> The warning was useless so it is okay to remove it and
>>>>>>>>> corresponding code.
>>>>>>>>>
>>>>>>>>>> Also, I think that we can either remove -XX:MethodFlushing or
>>>>>>>>>> -XX:UseCodeCacheFlushing. Since 8020151, one of them is
>>>>>>>>>> redundant and can be removed. I am not quite sure if we should do
>>>>>>>>>> that now so it is not included in the patch.
>>>>>>>>> It is for separate change.
>>>>>>>>>
>>>>>>>>> MethodFlushing is obsolete because we can not run VM without
>>>>>>>>> codecache sweeping because we loose performance since we go into
>>>>>>>>> Interpreter after codecache filled. Did you tried to run with it
>>>>>>>>> OFF? I think it is good candidate to go.
>>>>>>>>>
>>>>>>>>> The problem with UseCodeCacheFlushing is it becomes famous so you
>>>>>>>>> can't remove it easily. But don't replace MethodFlushing with it. I
>>>>>>>>> think code which currently guarded by MethodFlushing should be
>>>>>>>>> executed unconditionally.
>>>>>>>>>
>>>>>>>>> In arguments.cpp there is table for obsolete flags:
>>>>>>>>> static ObsoleteFlag obsolete_jvm_flags[] = {
>>>>>>>>>
>>>>>>>>> jdk8 is major release so we can change flags. Add MethodFlushing
>>>>>>>>> there to remove it in jdk9:
>>>>>>>>> { "MethodFlushing", JDK_Version::jdk(8), JDK_Version::jdk(9) },
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>> I'll file a new bug to remove the flag. I guess this change will most
>>>>>> likely only make it into 8uXX.
>>>>>>>>>> Testing
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8027593 also shows a
>>>>>>>>>> performance evaluation.
>>>>>>>>>>
>>>>>>>>>> Many thanks for looking at the patch.
>>>>>>>>>> Best,
>>>>>>>>>> Albert
More information about the hotspot-compiler-dev
mailing list