RFR 8012371: Adjust Tiered compile threshold according to available space in codecache

Christian Thalinger christian.thalinger at oracle.com
Tue May 14 11:41:03 PDT 2013


On May 14, 2013, at 11:21 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:

> http://cr.openjdk.java.net/~kvn/8012371/webrev

This seems to be a good change.  Did you also do a refworkload run?

-- Chris

> 
> On 5/14/13 6:23 AM, Albert Noll wrote:
> 
> Hi,
> 
> I think I found a solution to the code cache fill-up-problem.
> 
> The problem with the previous solution was that the threshold for
> recompilation was increased equally for all tiers. As a result, peak
> performance was not reached if the code cache was "rather full", since
> frequently invoked methods that WERE compiled by C2 in a non-tiered
> configuration WERE NOT compiled by C2 when using tiered compilation.
> 
> The proposed solution to the problem is that we start increasing the
> threshold rather early (e.g., if the code cache is filled up by 50%),
> and do not increase the threshold for C2 compilation. As a result, we
> have enough space for C2 code (we reach peak performance).
> The drawback of this solution, of course, is that tiered compilation
> potentially performs worse than if we provide more code cache. However,
> this solution should not perform worse compared to not using tiered
> compilation.
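
A minimal sketch of the policy described above, not the code in the webrev. The threshold multiplier grows once the code cache is more than half full, but transitions to C2 are exempt. The names `Tier` and `threshold_scale`, and the exponential shape of the scaling, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical tier labels; 3 and 4 follow the PrintCompilation numbering
// quoted later in this thread (3 = C1 with profiling, 4 = C2).
enum Tier { kC1Profiled = 3, kC2 = 4 };

// Multiplier applied to the compile threshold of a transition into `target`.
// `used_fraction` is the fraction of the code cache in use, in [0, 1).
// `start_fraction` is where scaling begins (0.5 = "filled up by 50%").
double threshold_scale(Tier target, double used_fraction,
                       double start_fraction = 0.5) {
  assert(used_fraction >= 0.0 && used_fraction < 1.0);
  // C2 thresholds are left alone so hot methods can still reach C2.
  if (target == kC2) return 1.0;
  if (used_fraction <= start_fraction) return 1.0;
  // Assumed shape: exponential in the inverse free ratio. It equals 1.0
  // exactly at the start point, so the multiplier has no step.
  double inverse_free = 1.0 / (1.0 - used_fraction);
  double inverse_free_at_start = 1.0 / (1.0 - start_fraction);
  return std::exp(inverse_free - inverse_free_at_start);
}
```

With these assumptions, a C1 transition at 75% fill is scaled by e^2, while a C2 transition is never scaled.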
> 
> I evaluated the proposed changes using the nashorn benchmarks with
> ReservedCodeCacheSize=80m, letting all benchmarks run in the same JVM
> instance. We start increasing the threshold for recompilation (not for
> recompiling to C2) when the code cache is filled up by 50%. The result
> is that the warning that the code cache is full and compilation has
> stopped is not printed out. Furthermore, we achieve similar peak
> performance compared to non-tiered, but a faster startup time.
> 
> 
> Many thanks for your comments,
> Albert
> 
> On 07/05/2013 22:38, Vladimir Kozlov wrote:
>> And add a product flag for the initial ratio value so people can adjust
>> it as they wish.
>> 
>> Vladimir
>> 
>> On 5/7/13 9:40 AM, Vladimir Kozlov wrote:
>>> Albert,
>>> 
>>> You should start using Nashorn/octane for performance testing since
>>> TieredCompilation has a big effect on it. Roland can help you with it.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 5/7/13 9:08 AM, Vladimir Kozlov wrote:
>>>> On 5/7/13 6:49 AM, Albert Noll wrote:
>>>>> Hi Vladimir,
>>>>> 
>>>>> I performed a preliminary evaluation of the effects on the size of
>>>>> generated code.
>>>>> I used the eclipse benchmark from the DaCapo benchmarks.
>>>>> In the test, I limited the ReservedCodeCacheSize to 32m
>>>>> 
>>>>> With the changes in advancedThresholdPolicy, 2 runs generate 76mb
>>>>> code.
>>>>> Without the changes, 2 runs generate 116mb code.
>>>> 
>>>> This is good, but I am mostly concerned about the effect on
>>>> performance, startup and peak. Also look at codecache usage with the
>>>> default size at the end of execution. Use -XX:+PrintCompilation, which
>>>> has time stamps (first number) in its output, to see how the behavior
>>>> changes. Note, the third number in the output is the compilation type:
>>>> 3 - C1 with profiling, 4 - C2.
>>>> 
>>>> Sorry about the dexp() suggestion; yes, it needs to be called in the
>>>> correct thread state.
>>>> 
>>>> And I made a mistake with my suggested expressions. If we want to scale
>>>> only for 25% and less free space we need:
>>>> 
>>>>  if (free_reverse_ratio > 4.) {
>>>>    k *= exp(free_reverse_ratio - 4.);
>>>>  }
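
The arithmetic behind this correction can be checked with a tiny helper. `start_ratio_for` is a hypothetical name used only for this sketch; it relies on the definition `free_reverse_ratio = max_capacity / unallocated_capacity` from the earlier message.

```cpp
#include <cassert>

// free_reverse_ratio = max_capacity / unallocated_capacity = 1 / free_fraction,
// so "start scaling once less than `free_fraction` of the cache is free"
// corresponds to comparing free_reverse_ratio against the value returned here.
double start_ratio_for(double free_fraction) {
  assert(free_fraction > 0.0 && free_fraction <= 1.0);
  return 1.0 / free_fraction;
}
```

`start_ratio_for(0.25)` is 4 and `start_ratio_for(0.50)` is 2, which is why the earlier `> 2.` test would have started scaling at 50% free rather than at 25%.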
>>>> 
>>>> But I will leave it to you to determine the best ratio value by
>>>> experiments: get the same startup and peak with less codecache.
>>>> Maybe your 50% will be a better value.
>>>> 
>>>> Thanks,
>>>> Vladimir
>>>> 
>>>>> 
>>>>> Best,
>>>>> Albert
>>>>> 
>>>>> On 05/07/2013 02:13 PM, Albert Noll wrote:
>>>>>> Hi Vladimir,
>>>>>> 
>>>>>> thank you very much for your feedback. I made the changes as you
>>>>>> proposed.
>>>>>> I could not use SharedRuntime::dexp(d), since the VM crashed (see
>>>>>> below). Rick explained to me why: the current thread is in the
>>>>>> wrong state.
>>>>>> 
>>>>>> What do you think of the current version? Do you think we need to
>>>>>> evaluate the
>>>>>> performance impact of that change?
>>>>>> 
>>>>>> Best,
>>>>>> Albert
>>>>>> 
>>>>>> P.S.: Is it OK if I ask you for early feedback, or should I just send
>>>>>> out an RFR?
>>>>>> 
>>>>>> # To suppress the following error report, specify this argument
>>>>>> # after -XX: or in .hotspotrc: SuppressErrorAt=/gcLocker.cpp:223
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> #  Internal Error
>>>>>> (/export/anoll/JDK-8012371/src/share/vm/memory/gcLocker.cpp:223),
>>>>>> pid=5663, tid=140406973806336
>>>>>> #  Error: ShouldNotReachHere()
>>>>>> #
>>>>>> # JRE version: Java(TM) SE Runtime Environment (8.0-b86) (build
>>>>>> 1.8.0-ea-b86)
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM
>>>>>> (25.0-b32-internal-fastdebug mixed mode linux-amd64 compressed oops)
>>>>>> # Failed to write core dump. Core dumps have been disabled. To enable
>>>>>> core dumping, try "ulimit -c unlimited" before starting Java again
>>>>>> #
>>>>>> # An error report file with more information is saved as:
>>>>>> # /export/anoll/hs_err_pid5663.log
>>>>>> #
>>>>>> # If you would like to submit a bug report, please visit:
>>>>>> #   http://bugreport.sun.com/bugreport/crash.jsp
>>>>>> #
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 05/06/2013 11:01 PM, Vladimir Kozlov wrote:
>>>>>>> I don't think it should be only C1-specific code. Before Tiered
>>>>>>> we had counter decay code for all compilers. So I think it should
>>>>>>> be the same now.
>>>>>>> 
>>>>>>> Make new method:
>>>>>>>  double CodeCache::free_space_ratio()
>>>>>>> 
>>>>>>> Also subtract CodeCacheMinimumFreeSpace to get correct size
>>>>>>> available
>>>>>>> for JIT code.
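
One way to read this suggestion, as a standalone sketch rather than HotSpot code. `CodeCacheModel` and its fields are hypothetical stand-ins for the real `CodeCache` accessors. Removing the reserve from the total as well as from the free amount, and clamping at zero, are my assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the code cache; all sizes are in bytes.
struct CodeCacheModel {
  std::size_t max_capacity;          // e.g. ReservedCodeCacheSize
  std::size_t unallocated_capacity;  // bytes not yet handed out
  std::size_t minimum_free_space;    // e.g. CodeCacheMinimumFreeSpace

  // Fraction of the cache still usable for JIT code, in [0, 1].
  double free_space_ratio() const {
    assert(max_capacity > minimum_free_space);
    assert(unallocated_capacity <= max_capacity);
    // The reserve is not available to the JIT, so subtract it before
    // dividing (assumption: from both the free amount and the total).
    std::size_t free_bytes = unallocated_capacity > minimum_free_space
                                 ? unallocated_capacity - minimum_free_space
                                 : 0;
    return static_cast<double>(free_bytes) /
           static_cast<double>(max_capacity - minimum_free_space);
  }
};
```

For example, `CodeCacheModel{100, 60, 20}.free_space_ratio()` gives 40 / 80, and a cache whose unallocated space is already below the reserve reports 0.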
>>>>>>> 
>>>>>>> I would prefer to have one scaling expression which starts with *1
>>>>>>> and ends with e**k. But it would be nice if the switch from one
>>>>>>> expression to another were continuous (a graph without steps). For
>>>>>>> example, the following code will sharply increase the scale by 2,
>>>>>>> which is not good:
>>>>>>> +        k += (free_ratio < 0.50) ? 1/free_ratio : 0;
>>>>>>> 
>>>>>>> Also you use 2 divisions when you could use just one. And 50% empty
>>>>>>> is too early. If we start at 25% and use SharedRuntime::dexp(d) I
>>>>>>> think we can simplify code:
>>>>>>> 
>>>>>>> double free_reverse_ratio = max_capacity / unallocated_capacity;
>>>>>>> if (free_reverse_ratio > 2.) {
>>>>>>>  k *= SharedRuntime::dexp(free_reverse_ratio - 2.);
>>>>>>> }
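
To make the "graph without steps" point concrete, here is a small comparison. `step_scale` copies the expression criticized above, and `smooth_scale` copies the suggested one, with `std::exp` standing in for `SharedRuntime::dexp` (an assumption so the sketch builds outside the VM). The function names and the `start_ratio` parameter are illustrative.

```cpp
#include <cassert>
#include <cmath>

// The criticized form: k jumps by more than 2 as soon as free_ratio
// drops below 0.50, because 1 / free_ratio is then greater than 2.
double step_scale(double k, double free_ratio) {
  assert(free_ratio > 0.0);
  k += (free_ratio < 0.50) ? 1.0 / free_ratio : 0.0;
  return k;
}

// The suggested form: multiply by e^(r - start_ratio). The factor is
// exactly 1 at r == start_ratio, so the curve has no step there.
double smooth_scale(double k, double max_capacity,
                    double unallocated_capacity, double start_ratio) {
  assert(unallocated_capacity > 0.0);
  double free_reverse_ratio = max_capacity / unallocated_capacity;
  if (free_reverse_ratio > start_ratio) {
    k *= std::exp(free_reverse_ratio - start_ratio);
  }
  return k;
}
```

With `k = 1`, `step_scale` goes from 1 at 50% free to about 3.04 at 49% free, while `smooth_scale(1, 100, 49, 2.)` is only about 1.04.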
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Vladimir
>>>>>>> 
>>>>>>> On 5/6/13 6:00 AM, Albert Noll wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>> 
>>>>>>>> I looked at: https://jbs.oracle.com/bugs/browse/JDK-8012371 .
>>>>>>>> I attached a possible solution to this mail. Could I get some
>>>>>>>> early feedback from you?
>>>>>>>> 
>>>>>>>> Many thanks,
>>>>>>>> Albert
>>>>>> 
>>>>> 
> 
> 
> 
> 



More information about the hotspot-compiler-dev mailing list