[9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Jan 20 19:09:19 UTC 2015
Looks good.
Thanks,
Vladimir
On 1/20/15 10:46 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> thank you for the feedback!
>
> On 01/20/2015 06:48 PM, Vladimir Kozlov wrote:
>> We usually exit with error message if a flag's value is incorrect -
>> vm_exit_during_initialization().
>> We never overwrite incorrect value and hide what we did.
>
> I was not aware of that convention. I changed the check in arguments.cpp
> so that the VM exits if the *global* CompileThresholdScaling flag has a
> negative value (instead of overwriting it).
>
> Currently, CompileCommand=option does not accept negative values for
> flags of type double, so a *per-method* CompileThresholdScaling cannot
> have a negative value.
>
> But the behavior of CompileCommand might change in the future, so I
> changed the
>
> Arguments::scaled_freq_log(intx freq_log, scale)
> Arguments::scaled_compile_threshold(intx threshold, double scale)
>
> methods to return the unscaled value of parameters freq_log and
> threshold, respectively, if scale is negative.
>
> I hope this change is fine. Flags specified using CommandCommand=option
> are always listed on the output at startup, so maybe this does not count
> as hiding overwritten flag values.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~zmajo/8059606/webrev.05/
>
> Thank you and best regards,
>
>
> Zoltan
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/20/15 5:29 AM, Zoltán Majó wrote:
>>> Hi Vladimir,
>>>
>>>
>>> thanks for the review!
>>>
>>> On 01/19/2015 07:36 PM, Vladimir Kozlov wrote:
>>>> What was changed in globalDefinitions.hpp? Webrev shows nothing
>>>> (sometimes it does not show spacing changes).
>>>
>>> Yes, it was some whitespace I've removed around line 1150:
>>>
>>> - uintptr_t p = 1;
>>> + uintptr_t p = 1;
>>>
>>> The newest webrev does not show that either. In addition to that,
>>> I've updated two comments in the newest webrev.
>>>
>>>> In arguments.cpp, please, add check that CompileThresholdScaling is
>>>> not negative.
>>>
>>> I added the check. If CompileThresholdScaling is negative, we set its
>>> value to 1.0 (the default value).
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/
>>>
>>> The webrev link is the same as the one in my latest reply to John.
>>>
>>> All JPRT tests pass.
>>>
>>> Thank you!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> Otherwise this looks good to me.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 1/19/15 8:56 AM, Zoltán Majó wrote:
>>>>> Hi John,
>>>>>
>>>>>
>>>>> On 01/16/2015 08:59 PM, John Rose wrote:
>>>>>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó <zoltan.majo at oracle.com>
>>>>>> wrote:
>>>>>>> Here is the new webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/
>>>>>> Reviewed, with some comments.
>>>>>
>>>>> thank you for your review and for the feedback! Please see detailed
>>>>> comments below.
>>>>>
>>>>>> Overall, I like the cleanups along the way.
>>>>>>
>>>>>> The basic idea of replacing a hard-coded 'mask' with an
>>>>>> addressable variable is sound and nicely executed.
>>>>>>
>>>>>> I suppose that idea by itself is "S" small, but this really is a
>>>>>> "M" or even "L" change, as Vladimir says, especially
>>>>>> since the enhanced logic is spread all around many files.
>>>>>
>>>>> I agree that this is not a small change any more, but I would like
>>>>> to keep the subject of this discussion unchanged so
>>>>> that we don't move to a different thread (unless you or Vladimir
>>>>> think it is better to change it).
>>>>>
>>>>>> How have you regression tested this?
>>>>>
>>>>> I used our performance infrastructure. I collected performance data
>>>>> on 6 different architectures for 41 benchmark
>>>>> programs/suites. The data show that per-method compilation
>>>>> thresholds do not result in a statistically significant
>>>>> change of performance. One benchmark, Footprint3-Client, degrades
>>>>> ~0.5% on the X86 Client VM, but I think that is
>>>>> negligible.
>>>>>
>>>>>> Have you verified that the compilation sequence doesn't change for
>>>>>> a non-trivial use case? A slip in the assembly
>>>>>> code (for example) might cause a comparison against a garbage mask
>>>>>> or limit that could cause compilation decisions to
>>>>>> go crazy quietly. I didn't spot any such bug, but they are hard
>>>>>> to spot and sometimes quiet.
>>>>>
>>>>> I did extensive manual testing on all architectures targeted by the
>>>>> patch.
>>>>>
>>>>> I checked that a target method (a method for which per-method
>>>>> compilation thresholds have been specified) is indeed
>>>>> compiled sooner/later than methods for which global compilation
>>>>> thresholds are in effect. I ran tests for all
>>>>> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and
>>>>> +/-UseOnStackReplacement
>>>>> on each architecture that we support. While working on this issue
>>>>> I've discovered and reported two interpreter-related
>>>>> bugs, 8068652 and 8068505.
>>>>>
>>>>> I cannot, unfortunately, guarantee that my changes are error-free,
>>>>> but I did my best to catch any possible error that I
>>>>> can think of and that I can check.
>>>>>
>>>>>> In the sparc assembly code (more than one file), the live range of
>>>>>> Rcounters has increased, since it is used to supply
>>>>>> limits as well as to update the counter (which happens early in
>>>>>> the code).
>>>>>>
>>>>>> To make it easier to maintain the code, I suggest renaming
>>>>>> Rcounters to G3_method_counters.
>>>>>> (As you can see, when a register has a logical name but has a
>>>>>> complicated live range, we add the hardware name is to
>>>>>> the logical name, to make it easier to spot interfering uses, when
>>>>>> manually editing code.)
>>>>>
>>>>> Thanks for the suggestion, I've changed the register's name.
>>>>>
>>>>>> If scale==0.0 is a valid input checked specially in compileBroker,
>>>>>> perhaps the effect of a zero should be documented?
>>>>>> Suggest adding to globals.hpp:
>>>>>> "; values greater than one delay counter overflow; zero forces
>>>>>> counter overflow immediately"
>>>>>> "; can be set as a per-method option."
>>>>>
>>>>> If CompileThresholdScaling is set to 0.0, all methods are interpreted.
>>>>>
>>>>> The reason is that CompileThreshold==0 has been historically
>>>>> equivalent to setting -Xint to true. Setting
>>>>> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0
>>>>> and we wanted to keep VM behavior consistent when we've
>>>>> added support for the global CompileThresholdScaling flag in
>>>>> JDK-8059604.
>>>>>
>>>>> If you think it would be good if we changed the meaning of
>>>>> CompileThreshold==0, please let me know. I'll file an RFE for
>>>>> it and change it. As CompileThreshold is a product flag, we will
>>>>> need CCC approval for that change.
>>>>>
>>>>> As you've suggested, I added a comment to globals.hpp that
>>>>> precisely describes the behavior of CompileThresholdScaling.
>>>>>
>>>>>> Question: What if both a global scale and a method option for
>>>>>> scale are both set? Is the global one ignored? Do
>>>>>> they multiply? It's worth specifying it explicitly (and checking
>>>>>> that the logic DTRT).
>>>>>
>>>>> Global and per-method values multiply. That behavior is now
>>>>> described in the comment in globals.hpp.
>>>>>
>>>>>> Question: How are the log values (like Tier0InvokeNotifyFreqLog
>>>>>> or the result of get_scaled_freq_log) constrained to
>>>>>> fit in a reasonable range? (I would suppose that range is 0..31
>>>>>> or less.) Should we have range clipping code in the
>>>>>> scaler functions?
>>>>>
>>>>> Currently InvocationCounter::number_of_count_bits=29 bits are
>>>>> reserved for counting. As a result, the value of the log2
>>>>> of the notification frequency can be at most 30. I updated the
>>>>> source code accordingly.
>>>>>
>>>>>> It would give notably simpler code, in MethodData::init, to use a
>>>>>> branch-free setup of tier_0_invoke_notify_freq_log
>>>>>> etc. Set scale = 1.0 and then update it conditionally.
>>>>>> Special-case scale=1.0 in get_scaled_freq_log to taste.
>>>>>
>>>>> I did that.
>>>>>
>>>>>> Same comment about less-branchy code for methodCounters.hpp. (It's
>>>>>> better to have a one-way branch that sets up
>>>>>> 'scale' than a two-way branch with duplicate setups of one or more
>>>>>> variables.)
>>>>>
>>>>> I changed the code according to your suggestions.
>>>>>
>>>>>> In MethodCounters, I think the conditional scaling of
>>>>>> _interpreter_backward_branch_limit is going to confuse someone,
>>>>>> at some point. It should be scaled, unconditionally, like its
>>>>>> sibling variables. (That would remove another somewhat
>>>>>> verbose initialization branch!)
>>>>>
>>>>> The value of the *global* interpreter backward branch limit
>>>>> (InvocationCounter::InterpreterBackwardBranchLimit) is
>>>>> computed based on the value of CompileThreshold (in
>>>>> InvocationCounter::reinitialize()). The backward branch limit is
>>>>> assigned a different value depending on whether interpreter
>>>>> profiling is enabled or not.
>>>>>
>>>>> The logic of the *per-method* interpreter backward branch limit
>>>>> (MethodCounters::_interpreter_backward_branch_limit) is
>>>>> intended to be identical to that of the global branch limit.
>>>>> Therefore, I'm afraid that we have to keep the if-then-else
>>>>> construct initializing _interpreter_backward_branch_limit in
>>>>> methodCounters.hpp. I hope that I understood right what
>>>>> you've previously suggested.
>>>>>
>>>>>> Small style nit: The noise-word "get_" is discouraged in the
>>>>>> style doc:
>>>>>>
>>>>>>> • Getter accessor names are noun phrases, with no "get_" noise
>>>>>>> word. Boolean getters can also begin with "is_" or
>>>>>>> "has_".
>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>>>>> Arguments.cpp follows this rule partially (in older code maybe?).
>>>>>> It would be better to decrease counter-examples to
>>>>>> the rule instead of increase them.
>>>>>>
>>>>>> Bigger style nit: Since the functions are not getting a preset
>>>>>> value (from the arguments) but rather normalizing a
>>>>>> provided argument value, I suggest naming them
>>>>>> "scale_compile_threshold" (i.e., a verb phrase instead of a noun
>>>>>> phrase). Again from the style doc:
>>>>>>
>>>>>>> • Other method names are verb phrases, as if commands to the
>>>>>>> receiver.
>>>>>
>>>>> I did not know that, thanks for pointing it out to me. I changed
>>>>>
>>>>> get_scaled_compile_threshold -> scaled_compiled_threshold
>>>>> get_scaled_freq_log -> scaled_freq_log.
>>>>>
>>>>>> Since you are providing overloads of the scaling functions, the
>>>>>> header file should either contain inline code for the
>>>>>> convenience methods, or else document how the optional argument
>>>>>> ('scale') defaults. I'd prefer inline code, since it
>>>>>> is simple. It's as much text to document with a comment as just
>>>>>> to supply the inline overload.
>>>>>
>>>>> I inlined the convenience methods into the header file.
>>>>>
>>>>> Here is the new webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/
>>>>>
>>>>>>
>>>>>> As I said before, nice work!
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> — John
>>>>>
>>>
>
More information about the hotspot-compiler-dev
mailing list