RFR 8012371: Adjust Tiered compile threshold according to available space in codecache
Christian Thalinger
christian.thalinger at oracle.com
Tue May 14 11:41:03 PDT 2013
On May 14, 2013, at 11:21 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> http://cr.openjdk.java.net/~kvn/8012371/webrev
This seems to be a good change. Did you also do a refworkload run?
-- Chris
>
> On 5/14/13 6:23 AM, Albert Noll wrote:
>
> Hi,
>
> I think I found a solution to the code cache fill-up-problem.
>
> The problem with the previous solution was that the threshold for
> recompilation was increased equally for all tiers. As a result, peak
> performance was not reached if the code cache was "rather full", since
> frequently invoked methods that WERE compiled with C2 in a non-tiered
> version WERE NOT compiled with C2 when using tiered compilation.
>
> The proposed solution to the problem is that we start increasing the
> threshold rather early (e.g., if the code cache is filled up by 50%),
> and do not increase the threshold for C2 compilation. As a result, we
> have enough space for C2 code (we reach peak performance).
> The drawback of this solution, of course, is that tiered compilation
> potentially performs worse than if we provide more code cache. However,
> this solution should not perform worse compared to not using tiered
> compilation.
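
A minimal sketch of the policy proposed above, not the code from the webrev. The names threshold_scale, Tier and start_fraction are hypothetical, and the exponential growth curve is an assumption borrowed from the expressions discussed further down the thread. It only illustrates the idea: thresholds for the C1 tiers grow once the code cache is more than half full, while the C2 threshold is left alone.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical tier numbers, following the PrintCompilation convention
// mentioned in the thread (3 = C1 with profiling, 4 = C2).
enum Tier { kTierC1Profiled = 3, kTierC2 = 4 };

// Hypothetical helper: factor by which a tier's compile threshold is
// multiplied, given the fraction of the code cache already in use.
// start_fraction is the fill level at which scaling begins (50% here).
double threshold_scale(Tier target_tier, double used_fraction,
                       double start_fraction = 0.5) {
  assert(used_fraction >= 0.0 && used_fraction < 1.0);
  assert(start_fraction >= 0.0 && start_fraction < 1.0);

  // Compilations to C2 keep their original threshold, so hot methods
  // can still reach peak performance.
  if (target_tier == kTierC2) return 1.0;

  // Below the starting fill level nothing changes.
  if (used_fraction <= start_fraction) return 1.0;

  // Assumed curve: exp(1/free - 1/free_at_start), which is exactly 1.0
  // at the starting fill level and grows as free space shrinks.
  double free_reverse_ratio = 1.0 / (1.0 - used_fraction);
  double cutoff = 1.0 / (1.0 - start_fraction);
  return std::exp(free_reverse_ratio - cutoff);
}
```

With these assumptions, threshold_scale(kTierC1Profiled, 0.75) is exp(2), roughly 7.4, while threshold_scale(kTierC2, 0.75) stays at 1.0.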
>
> I evaluated the proposed changes using the nashorn benchmarks with
> ReservedCodeCacheSize=80m, letting all benchmarks run in the same JVM
> instance. We start increasing the threshold for recompilation (not for
> recompiling to C2) when the code cache is filled up by 50%. The result
> is that the warning that the code cache is full and compilation has
> stopped is not printed. Furthermore, we achieve similar peak performance
> compared to non-tiered, but with a faster startup time.
>
>
> Many thanks for your comments,
> Albert
>
> On 07/05/2013 22:38, Vladimir Kozlov wrote:
>> And add product flag for initial ratio value so people can adjust it
>> as they wish.
>>
>> Vladimir
>>
>> On 5/7/13 9:40 AM, Vladimir Kozlov wrote:
>>> Albert,
>>>
>>> You should start using Nashorn/octane for performance testing since
>>> TieredCompilation has a big effect on it. Roland can help you with it.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 5/7/13 9:08 AM, Vladimir Kozlov wrote:
>>>> On 5/7/13 6:49 AM, Albert Noll wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> I performed a preliminary evaluation of the effects on the size of
>>>>> generated code.
>>>>> I used the eclipse benchmark from the DaCapo benchmarks.
>>>>> In the test, I limited the ReservedCodeCacheSize to 32m
>>>>>
>>>>> With the changes in advancedThresholdPolicy, 2 runs generate 76mb
>>>>> code.
>>>>> Without the changes, 2 runs generate 116mb code.
>>>>
>>>> This is good, but I am mostly concerned about the effect on
>>>> performance, startup and peak. Also look at codecache usage with the
>>>> default size at the end of execution. Use -XX:+PrintCompilation, which
>>>> has time stamps (first number) in its output, to see how the behavior
>>>> changes. Note, the third number in the output is the compilation type:
>>>> 3 - C1 with profiling, 4 - C2.
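
A small sketch of how the tier could be pulled out of such output when comparing runs. It relies only on the description above (the third number on a line is the compilation type) and assumes whitespace-separated columns in which any non-numeric tokens between the numbers can be skipped. The function name compilation_tier is hypothetical.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Returns the third all-digit token on the line, which the thread
// describes as the compilation type (3 = C1 with profiling, 4 = C2).
// Returns -1 if the line has fewer than three numeric tokens.
int compilation_tier(const std::string& line) {
  std::istringstream in(line);
  std::string token;
  int numbers_seen = 0;
  while (in >> token) {
    bool all_digits = true;
    for (unsigned char c : token) {
      if (!std::isdigit(c)) {
        all_digits = false;
        break;
      }
    }
    // Tokens that are not plain numbers are skipped.
    if (all_digits && ++numbers_seen == 3) {
      return std::stoi(token);
    }
  }
  return -1;
}
```

Counting how many lines report 3 versus 4 in each run gives a rough picture of how much C1 and C2 code was generated with and without the change.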
>>>>
>>>> Sorry about the dexp() suggestion; yes, it needs to be called in the
>>>> correct thread state.
>>>>
>>>> And I made a mistake in my suggested expressions. If we want to scale
>>>> only when 25% or less space is free, we need:
>>>>
>>>> if (free_reverse_ratio > 4.) {
>>>>   k *= exp(free_reverse_ratio - 4.);
>>>> }
>>>>
>>>> But I will leave it to you to determine the best ratio value by
>>>> experiments: get the same startup and peak with less codecache.
>>>> Maybe your 50% will be a better value.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Best,
>>>>> Albert
>>>>>
>>>>> On 05/07/2013 02:13 PM, Albert Noll wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> thank you very much for your feedback. I made the changes as you
>>>>>> proposed.
>>>>>> I could not use SharedRuntime::dexp(d), since the VM crashed (see
>>>>>> below). Rick explained to me why: the current thread is in the
>>>>>> wrong state.
>>>>>>
>>>>>> What do you think of the current version? Do you think we need to
>>>>>> evaluate the
>>>>>> performance impact of that change?
>>>>>>
>>>>>> Best,
>>>>>> Albert
>>>>>>
>>>>>> P.S.: Is it OK if I ask you for early feedback, or should I just send
>>>>>> out an RFR?
>>>>>>
>>>>>> # To suppress the following error report, specify this argument
>>>>>> # after -XX: or in .hotspotrc: SuppressErrorAt=/gcLocker.cpp:223
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> # Internal Error
>>>>>> (/export/anoll/JDK-8012371/src/share/vm/memory/gcLocker.cpp:223),
>>>>>> pid=5663, tid=140406973806336
>>>>>> # Error: ShouldNotReachHere()
>>>>>> #
>>>>>> # JRE version: Java(TM) SE Runtime Environment (8.0-b86) (build
>>>>>> 1.8.0-ea-b86)
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM
>>>>>> (25.0-b32-internal-fastdebug mixed mode linux-amd64 compressed oops)
>>>>>> # Failed to write core dump. Core dumps have been disabled. To enable
>>>>>> core dumping, try "ulimit -c unlimited" before starting Java again
>>>>>> #
>>>>>> # An error report file with more information is saved as:
>>>>>> # /export/anoll/hs_err_pid5663.log
>>>>>> #
>>>>>> # If you would like to submit a bug report, please visit:
>>>>>> # http://bugreport.sun.com/bugreport/crash.jsp
>>>>>> #
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 05/06/2013 11:01 PM, Vladimir Kozlov wrote:
>>>>>>> I don't think it should be only C1 specific code. Before Tiered code
>>>>>>> we had decay counters code for all compilers. So I think it should
>>>>>>> be the same now.
>>>>>>>
>>>>>>> Make new method:
>>>>>>> double CodeCache::free_space_ratio()
>>>>>>>
>>>>>>> Also subtract CodeCacheMinimumFreeSpace to get correct size
>>>>>>> available
>>>>>>> for JIT code.
>>>>>>>
>>>>>>> I would prefer to have one scaling expression which starts with *1
>>>>>>> and ends with e**k. It would also be nice for the switch from one
>>>>>>> expression to another to be smooth (a graph without steps). For
>>>>>>> example, the following code will sharply increase the scale by 2,
>>>>>>> which is not good:
>>>>>>> + k += (free_ratio < 0.50) ? 1/free_ratio : 0;
>>>>>>>
>>>>>>> Also you use 2 divisions when you could use just one. And 50% empty
>>>>>>> is too early. If we start at 25% and use SharedRuntime::dexp(d) I
>>>>>>> think we can simplify code:
>>>>>>>
>>>>>>> double free_reverse_ratio = max_capacity / unallocated_capacity;
>>>>>>> if (free_reverse_ratio > 2.) {
>>>>>>> k *= SharedRuntime::dexp(free_reverse_ratio - 2.);
>>>>>>> }
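
The difference between the two expressions quoted above can be sketched in isolation. This is an illustration under assumptions, not HotSpot code: std::exp stands in for SharedRuntime::dexp, free_space_ratio is a hypothetical stand-in for the suggested CodeCache::free_space_ratio(), and subtracting the reserve only from the unallocated space is one possible reading of the CodeCacheMinimumFreeSpace suggestion.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical free-space ratio: space still usable for JIT code,
// after setting aside a reserve, as a fraction of the maximum capacity.
double free_space_ratio(std::size_t max_capacity,
                        std::size_t unallocated_capacity,
                        std::size_t minimum_free_space) {
  assert(max_capacity > 0);
  if (unallocated_capacity <= minimum_free_space) return 0.0;
  return static_cast<double>(unallocated_capacity - minimum_free_space) /
         static_cast<double>(max_capacity);
}

// The criticized form: adds nothing at 50% free, then suddenly adds
// about 2 as soon as the ratio drops below 0.50 (a step in the graph).
double step_increment(double free_ratio) {
  assert(free_ratio > 0.0);
  return (free_ratio < 0.50) ? 1.0 / free_ratio : 0.0;
}

// The suggested form: a multiplier that equals 1.0 at the cutoff and
// grows continuously from there, using a single division.
double smooth_multiplier(double free_ratio) {
  assert(free_ratio > 0.0);
  double free_reverse_ratio = 1.0 / free_ratio;
  return (free_reverse_ratio > 2.0) ? std::exp(free_reverse_ratio - 2.0)
                                    : 1.0;
}
```

Just below the cutoff, at a free ratio of 0.499, step_increment already returns slightly more than 2, whereas smooth_multiplier is still within one percent of 1.0.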
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 5/6/13 6:00 AM, Albert Noll wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>> I looked at: https://jbs.oracle.com/bugs/browse/JDK-8012371 .
>>>>>>>> I attached a possible solution to this mail. Could I get some
>>>>>>>> early feedback from you?
>>>>>>>>
>>>>>>>> Many thanks,
>>>>>>>> Albert
>>>>>>
>>>>>
>
>
>
>