[9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)

Tue Jan 20 17:48:42 UTC 2015

We usually exit with error message if a flag's value is incorrect - vm_exit_during_initialization().
We never overwrite incorrect value and hide what we did.

Thanks,
Vladimir

On 1/20/15 5:29 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> thanks for the review!
>
> On 01/19/2015 07:36 PM, Vladimir Kozlov wrote:
>> What was changed in globalDefinitions.hpp? Webrev shows nothing (sometimes it does not show spacing changes).
>
> Yes, it was some whitespace I've removed around line 1150:
>
> -  uintptr_t p =  1;
> +  uintptr_t p = 1;
>
> The newest webrev does not show that either. In addition to that, I've updated two comments in the newest webrev.
>
>> In arguments.cpp, please, add check that CompileThresholdScaling is not negative.
>
> I added the check. If CompileThresholdScaling is negative, we set its value to 1.0 (the default value).
>
> Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/
>
> The webrev link is the same as the one in my latest reply to John.
>
> All JPRT tests pass.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> Otherwise this looks good to me.
>>
>> Thanks,
>> Vladimir
>>
>> On 1/19/15 8:56 AM, Zoltán Majó wrote:
>>> Hi John,
>>>
>>>
>>> On 01/16/2015 08:59 PM, John Rose wrote:
>>>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó <zoltan.majo at oracle.com> wrote:
>>>>> Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/
>>>> Reviewed, with some comments.
>>>
>>> thank you for your review and for the feedback! Please see detailed comments below.
>>>
>>>> Overall, I like the cleanups along the way.
>>>>
>>>> The basic idea of replacing a hard-coded 'mask' with an addressable variable is sound and nicely executed.
>>>>
>>>> I suppose that idea by itself is "S" small, but this really is a "M" or even "L" change, as Vladimir says, especially
>>>> since the enhanced logic is spread all around many files.
>>>
>>> I agree that this is not a small change any more, but I would like to keep the subject of this discussion unchanged so
>>> that we don't move to a different thread (unless you or Vladimir think it is better to change it).
>>>
>>>> How have you regression tested this?
>>>
>>> I used our performance infrastructure. I collected performance data on 6 different architectures for 41 benchmark
>>> programs/suites. The data show that per-method compilation thresholds do not result in a statistically significant
>>> change of performance. One benchmark, Footprint3-Client, degrades ~0.5% on the X86 Client VM, but I think that is
>>> negligible.
>>>
>>>> Have you verified that the compilation sequence doesn't change for a non-trivial use case?  A slip in the assembly
>>>> code (for example) might cause a comparison against a garbage mask or limit that could cause compilation decisions to
>>>> go crazy quietly.  I didn't spot any such bug, but they are hard to spot and sometimes quiet.
>>>
>>> I did extensive manual testing on all architectures targeted by the patch.
>>>
>>> I checked that a target method (a method for which per-method compilation thresholds have been specified) is indeed
>>> compiled sooner/later than methods for which global compilation thresholds are in effect. I ran tests for all
>>> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and +/-UseOnStackReplacement
>>> on each architecture that we support. While working on this issue I've discovered and reported two interpreter-related
>>> bugs, 8068652 and 8068505.
>>>
>>> I cannot, unfortunately, guarantee that my changes are error-free, but I did my best to catch any possible error that I
>>> can think of and that I can check.
>>>
>>>> In the sparc assembly code (more than one file), the live range of Rcounters has increased, since it is used to supply
>>>> limits as well as to update the counter (which happens early in the code).
>>>>
>>>> To make it easier to maintain the code, I suggest renaming Rcounters to G3_method_counters.
>>>> (As you can see, when a register has a logical name but has a complicated live range, we add the hardware name is to
>>>> the logical name, to make it easier to spot interfering uses, when manually editing code.)
>>>
>>> Thanks for the suggestion, I've changed the register's name.
>>>
>>>> If scale==0.0 is a valid input checked specially in compileBroker, perhaps the effect of a zero should be documented?
>>>> Suggest adding to globals.hpp:
>>>>     "; values greater than one delay counter overflow; zero forces counter overflow immediately"
>>>>     "; can be set as a per-method  option."
>>>
>>> If CompileThresholdScaling is set to 0.0, all methods are interpreted.
>>>
>>> The reason is that CompileThreshold==0 has been historically equivalent to setting -Xint to true. Setting
>>> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0 and we wanted to keep VM behavior consistent when we've
>>> added support for the global CompileThresholdScaling flag in JDK-8059604.
>>>
>>> If you think it would be good if we changed the meaning of CompileThreshold==0, please let me know. I'll file an RFE for
>>> it and change it. As CompileThreshold is a product flag, we will need CCC approval for that change.
>>>
>>> As you've suggested, I added a comment to globals.hpp that precisely describes the behavior of CompileThresholdScaling.
>>>
>>>> Question:  What if both a global scale and a method option for scale are both set?  Is the global one ignored?  Do
>>>> they multiply?  It's worth specifying it explicitly (and checking that the logic DTRT).
>>>
>>> Global and per-method values multiply. That behavior is now described in the comment in globals.hpp.
>>>
>>>> Question:  How are the log values (like Tier0InvokeNotifyFreqLog or the result of get_scaled_freq_log) constrained to
>>>> fit in a reasonable range?  (I would suppose that range is 0..31 or less.)  Should we have range clipping code in the
>>>> scaler functions?
>>>
>>> Currently InvocationCounter::number_of_count_bits=29 bits are reserved for counting. As a result, the value of the log2
>>> of the notification frequency can be at most 30. I updated the source code accordingly.
>>>
>>>> It would give notably simpler code, in MethodData::init, to use a branch-free setup of tier_0_invoke_notify_freq_log
>>>> etc.  Set scale = 1.0 and then update it conditionally. Special-case scale=1.0 in get_scaled_freq_log to taste.
>>>
>>> I did that.
>>>
>>>> Same comment about less-branchy code for methodCounters.hpp.  (It's better to have a one-way branch that sets up
>>>> 'scale' than a two-way branch with duplicate setups of one or more variables.)
>>>
>>> I changed the code according to your suggestions.
>>>
>>>> In MethodCounters, I think the conditional scaling of _interpreter_backward_branch_limit is going to confuse someone,
>>>> at some point.  It should be scaled, unconditionally, like its sibling variables.  (That would remove another somewhat
>>>> verbose initialization branch!)
>>>
>>> The value of the *global* interpreter backward branch limit (InvocationCounter::InterpreterBackwardBranchLimit) is
>>> computed based on the value of CompileThreshold (in InvocationCounter::reinitialize()). The backward branch limit is
>>> assigned a different value depending on whether interpreter profiling is enabled or not.
>>>
>>> The logic of the *per-method* interpreter backward branch limit (MethodCounters::_interpreter_backward_branch_limit) is
>>> intended to be identical to that of the global branch limit. Therefore, I'm afraid that we have to keep the if-then-else
>>> construct initializing _interpreter_backward_branch_limit in methodCounters.hpp. I hope that I understood right what
>>> you've previously suggested.
>>>
>>>> Small style nit:  The noise-word "get_" is discouraged in the style doc:
>>>>
>>>>>   • Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or
>>>>> "has_".
>>>>>    https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>>> Arguments.cpp follows this rule partially (in older code maybe?). It would be better to decrease counter-examples to
>>>> the rule instead of increase them.
>>>>
>>>> Bigger style nit:  Since the functions are not getting a preset value (from the arguments) but rather normalizing a
>>>> provided argument value, I suggest naming them "scale_compile_threshold" (i.e., a verb phrase instead of a noun
>>>> phrase).  Again from the style doc:
>>>>
>>>>>   • Other method names are verb phrases, as if commands to the receiver.
>>>
>>> I did not know that, thanks for pointing it out to me. I changed
>>>
>>> get_scaled_compile_threshold -> scaled_compiled_threshold
>>> get_scaled_freq_log -> scaled_freq_log.
>>>
>>>> Since you are providing overloads of the scaling functions, the header file should either contain inline code for the
>>>> convenience methods, or else document how the optional argument ('scale') defaults.  I'd prefer inline code, since it
>>>> is simple.  It's as much text to document with a comment as just to supply the inline overload.
>>>
>>> I inlined the convenience methods into the header file.
>>>
>>> Here is the new webrev: http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/
>>>
>>>>
>>>> As I said before, nice work!
>>>
>>> Thank you!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> — John
>>>
>