[9] RFR(S): 8059606: Enable per-method usage of CompileThresholdScaling (per-method compilation thresholds)

Wed Jan 21 09:57:09 UTC 2015

Thank you, Vladimir, for the review!

Best regards,

Zoltan

On 01/20/2015 08:09 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 1/20/15 10:46 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback!
>>
>> On 01/20/2015 06:48 PM, Vladimir Kozlov wrote:
>>> We usually exit with error message if a flag's value is incorrect -
>>> vm_exit_during_initialization().
>>> We never overwrite incorrect value and hide what we did.
>>
>> I was not aware of that convention. I changed the check in arguments.cpp
>> so that the VM exits if the *global* CompileThresholdScaling flag has a
>> negative value (instead of overwriting it).
>>
>> Currently, CompileCommand=option does not accept negative values for
>> flags of type double, so a *per-method* CompileThresholdScaling cannot
>> have a negative value.
>>
>> But the behavior of CompileCommand might change in the future, so I
>> changed the
>>
>> Arguments::scaled_freq_log(intx freq_log, scale)
>> Arguments::scaled_compile_threshold(intx threshold, double scale)
>>
>> methods to return the unscaled value of parameters freq_log and
>> threshold, respectively, if scale is negative.
>>
>> I hope this change is fine. Flags specified using CommandCommand=option
>> are always listed on the output at startup, so maybe this does not count
>> as hiding overwritten flag values.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.05/
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/20/15 5:29 AM, Zoltán Majó wrote:
>>>> Hi Vladimir,
>>>>
>>>>
>>>> thanks for the review!
>>>>
>>>> On 01/19/2015 07:36 PM, Vladimir Kozlov wrote:
>>>>> What was changed in globalDefinitions.hpp? Webrev shows nothing
>>>>> (sometimes it does not show spacing changes).
>>>>
>>>> Yes, it was some whitespace I've removed around line 1150:
>>>>
>>>> -  uintptr_t p =  1;
>>>> +  uintptr_t p = 1;
>>>>
>>>> The newest webrev does not show that either. In addition to that,
>>>> I've updated two comments in the newest webrev.
>>>>
>>>>> In arguments.cpp, please, add check that CompileThresholdScaling is
>>>>> not negative.
>>>>
>>>> I added the check. If CompileThresholdScaling is negative, we set its
>>>> value to 1.0 (the default value).
>>>>
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.04/
>>>>
>>>> The webrev link is the same as the one in my latest reply to John.
>>>>
>>>> All JPRT tests pass.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>>
>>>>> Otherwise this looks good to me.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 1/19/15 8:56 AM, Zoltán Majó wrote:
>>>>>> Hi John,
>>>>>>
>>>>>>
>>>>>> On 01/16/2015 08:59 PM, John Rose wrote:
>>>>>>> On Jan 16, 2015, at 5:05 AM, Zoltán Majó <zoltan.majo at oracle.com>
>>>>>>> wrote:
>>>>>>>> Here is the new webrev:
>>>>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.02/
>>>>>>> Reviewed, with some comments.
>>>>>>
>>>>>> thank you for your review and for the feedback! Please see detailed
>>>>>> comments below.
>>>>>>
>>>>>>> Overall, I like the cleanups along the way.
>>>>>>>
>>>>>>> The basic idea of replacing a hard-coded 'mask' with an
>>>>>>> addressable variable is sound and nicely executed.
>>>>>>>
>>>>>>> I suppose that idea by itself is "S" small, but this really is a
>>>>>>> "M" or even "L" change, as Vladimir says, especially
>>>>>>> since the enhanced logic is spread all around many files.
>>>>>>
>>>>>> I agree that this is not a small change any more, but I would like
>>>>>> to keep the subject of this discussion unchanged so
>>>>>> that we don't move to a different thread (unless you or Vladimir
>>>>>> think it is better to change it).
>>>>>>
>>>>>>> How have you regression tested this?
>>>>>>
>>>>>> I used our performance infrastructure. I collected performance data
>>>>>> on 6 different architectures for 41 benchmark
>>>>>> programs/suites. The data show that per-method compilation
>>>>>> thresholds do not result in a statistically significant
>>>>>> change of performance. One benchmark, Footprint3-Client, degrades
>>>>>> ~0.5% on the X86 Client VM, but I think that is
>>>>>> negligible.
>>>>>>
>>>>>>> Have you verified that the compilation sequence doesn't change for
>>>>>>> a non-trivial use case?  A slip in the assembly
>>>>>>> code (for example) might cause a comparison against a garbage mask
>>>>>>> or limit that could cause compilation decisions to
>>>>>>> go crazy quietly.  I didn't spot any such bug, but they are hard
>>>>>>> to spot and sometimes quiet.
>>>>>>
>>>>>> I did extensive manual testing on all architectures targeted by the
>>>>>> patch.
>>>>>>
>>>>>> I checked that a target method (a method for which per-method
>>>>>> compilation thresholds have been specified) is indeed
>>>>>> compiled sooner/later than methods for which global compilation
>>>>>> thresholds are in effect. I ran tests for all
>>>>>> combinations of +/-TieredCompilation, +/-ProfileInterpreter, and
>>>>>> +/-UseOnStackReplacement
>>>>>> on each architecture that we support. While working on this issue
>>>>>> I've discovered and reported two interpreter-related
>>>>>> bugs, 8068652 and 8068505.
>>>>>>
>>>>>> I cannot, unfortunately, guarantee that my changes are error-free,
>>>>>> but I did my best to catch any possible error that I
>>>>>> can think of and that I can check.
>>>>>>
>>>>>>> In the sparc assembly code (more than one file), the live range of
>>>>>>> Rcounters has increased, since it is used to supply
>>>>>>> limits as well as to update the counter (which happens early in
>>>>>>> the code).
>>>>>>>
>>>>>>> To make it easier to maintain the code, I suggest renaming
>>>>>>> Rcounters to G3_method_counters.
>>>>>>> (As you can see, when a register has a logical name but has a
>>>>>>> complicated live range, we add the hardware name is to
>>>>>>> the logical name, to make it easier to spot interfering uses, when
>>>>>>> manually editing code.)
>>>>>>
>>>>>> Thanks for the suggestion, I've changed the register's name.
>>>>>>
>>>>>>> If scale==0.0 is a valid input checked specially in compileBroker,
>>>>>>> perhaps the effect of a zero should be documented?
>>>>>>> Suggest adding to globals.hpp:
>>>>>>>     "; values greater than one delay counter overflow; zero forces
>>>>>>> counter overflow immediately"
>>>>>>>     "; can be set as a per-method  option."
>>>>>>
>>>>>> If CompileThresholdScaling is set to 0.0, all methods are 
>>>>>> interpreted.
>>>>>>
>>>>>> The reason is that CompileThreshold==0 has been historically
>>>>>> equivalent to setting -Xint to true. Setting
>>>>>> CompileThresholdScaling to 0.0 scales down CompileThreshold to 0
>>>>>> and we wanted to keep VM behavior consistent when we've
>>>>>> added support for the global CompileThresholdScaling flag in
>>>>>> JDK-8059604.
>>>>>>
>>>>>> If you think it would be good if we changed the meaning of
>>>>>> CompileThreshold==0, please let me know. I'll file an RFE for
>>>>>> it and change it. As CompileThreshold is a product flag, we will
>>>>>> need CCC approval for that change.
>>>>>>
>>>>>> As you've suggested, I added a comment to globals.hpp that
>>>>>> precisely describes the behavior of CompileThresholdScaling.
>>>>>>
>>>>>>> Question:  What if both a global scale and a method option for
>>>>>>> scale are both set?  Is the global one ignored?  Do
>>>>>>> they multiply?  It's worth specifying it explicitly (and checking
>>>>>>> that the logic DTRT).
>>>>>>
>>>>>> Global and per-method values multiply. That behavior is now
>>>>>> described in the comment in globals.hpp.
>>>>>>
>>>>>>> Question:  How are the log values (like Tier0InvokeNotifyFreqLog
>>>>>>> or the result of get_scaled_freq_log) constrained to
>>>>>>> fit in a reasonable range?  (I would suppose that range is 0..31
>>>>>>> or less.)  Should we have range clipping code in the
>>>>>>> scaler functions?
>>>>>>
>>>>>> Currently InvocationCounter::number_of_count_bits=29 bits are
>>>>>> reserved for counting. As a result, the value of the log2
>>>>>> of the notification frequency can be at most 30. I updated the
>>>>>> source code accordingly.
>>>>>>
>>>>>>> It would give notably simpler code, in MethodData::init, to use a
>>>>>>> branch-free setup of tier_0_invoke_notify_freq_log
>>>>>>> etc.  Set scale = 1.0 and then update it conditionally.
>>>>>>> Special-case scale=1.0 in get_scaled_freq_log to taste.
>>>>>>
>>>>>> I did that.
>>>>>>
>>>>>>> Same comment about less-branchy code for methodCounters.hpp. (It's
>>>>>>> better to have a one-way branch that sets up
>>>>>>> 'scale' than a two-way branch with duplicate setups of one or more
>>>>>>> variables.)
>>>>>>
>>>>>> I changed the code according to your suggestions.
>>>>>>
>>>>>>> In MethodCounters, I think the conditional scaling of
>>>>>>> _interpreter_backward_branch_limit is going to confuse someone,
>>>>>>> at some point.  It should be scaled, unconditionally, like its
>>>>>>> sibling variables.  (That would remove another somewhat
>>>>>>> verbose initialization branch!)
>>>>>>
>>>>>> The value of the *global* interpreter backward branch limit
>>>>>> (InvocationCounter::InterpreterBackwardBranchLimit) is
>>>>>> computed based on the value of CompileThreshold (in
>>>>>> InvocationCounter::reinitialize()). The backward branch limit is
>>>>>> assigned a different value depending on whether interpreter
>>>>>> profiling is enabled or not.
>>>>>>
>>>>>> The logic of the *per-method* interpreter backward branch limit
>>>>>> (MethodCounters::_interpreter_backward_branch_limit) is
>>>>>> intended to be identical to that of the global branch limit.
>>>>>> Therefore, I'm afraid that we have to keep the if-then-else
>>>>>> construct initializing _interpreter_backward_branch_limit in
>>>>>> methodCounters.hpp. I hope that I understood right what
>>>>>> you've previously suggested.
>>>>>>
>>>>>>> Small style nit:  The noise-word "get_" is discouraged in the
>>>>>>> style doc:
>>>>>>>
>>>>>>>>   • Getter accessor names are noun phrases, with no "get_" noise
>>>>>>>> word. Boolean getters can also begin with "is_" or
>>>>>>>> "has_".
>>>>>>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>>>>>> Arguments.cpp follows this rule partially (in older code maybe?).
>>>>>>> It would be better to decrease counter-examples to
>>>>>>> the rule instead of increase them.
>>>>>>>
>>>>>>> Bigger style nit:  Since the functions are not getting a preset
>>>>>>> value (from the arguments) but rather normalizing a
>>>>>>> provided argument value, I suggest naming them
>>>>>>> "scale_compile_threshold" (i.e., a verb phrase instead of a noun
>>>>>>> phrase).  Again from the style doc:
>>>>>>>
>>>>>>>>   • Other method names are verb phrases, as if commands to the
>>>>>>>> receiver.
>>>>>>
>>>>>> I did not know that, thanks for pointing it out to me. I changed
>>>>>>
>>>>>> get_scaled_compile_threshold -> scaled_compiled_threshold
>>>>>> get_scaled_freq_log -> scaled_freq_log.
>>>>>>
>>>>>>> Since you are providing overloads of the scaling functions, the
>>>>>>> header file should either contain inline code for the
>>>>>>> convenience methods, or else document how the optional argument
>>>>>>> ('scale') defaults.  I'd prefer inline code, since it
>>>>>>> is simple.  It's as much text to document with a comment as just
>>>>>>> to supply the inline overload.
>>>>>>
>>>>>> I inlined the convenience methods into the header file.
>>>>>>
>>>>>> Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~zmajo/8059606/webrev.03/
>>>>>>
>>>>>>>
>>>>>>> As I said before, nice work!
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>>>
>>>>>>> — John
>>>>>>
>>>>
>>