RFR (M): 8074037: Refactor the G1GCPhaseTime logging to make it easier to add new phases
Bengt Rutisson
bengt.rutisson at oracle.com
Tue Mar 10 09:48:08 UTC 2015
Hi Erik,
Sorry for the late reply. I was on vacation for a couple of days.
On 2015-03-05 10:14, Erik Österlund wrote:
> Hi Kirk,
>
> Thank you for the feedback.
>
> I share your concern making small fixes at a time in general. However, as pointed out by Vitaly, this would not be an issue here. Because instead of picking a single universal time unit to be used everywhere (such as seconds) which would cause such problems, my approach is having a single unified time /type/, for which the time unit is a configurable parameter. The type knows the difference between units and can convert them (implicitly or explicitly as we prefer). Introducing such a time type is both backward compatible in logged output presented to external tools and internally in code.
>
> I think this approach is better than picking a single universal time unit, stored in "double" without further type information embedded. Because there is often a good reason why time units exist in the first place and not a single one is used. Whatever champion time unit is picked, is likely to be be awkward in some situations. :)
>
> Currently different units are passed around as primitive values ignorant of time units, instead annotated with a postfix in parameter/variable names implicitly annotating which time unit is used, and comments documenting which time unit is returned from functions. All I propose really is to strengthen the type for times to embed this knowledge explicitly and check these assumptions for us at compile time to reduce errors and improve readability/maintainability.
>
> If this is not relevant to the current change, I should perhaps make a new thread if anyone is interested in such a change? Don't want to clutter this thread too much. Thought I'd check if others drink my kind of tea first. :)
I like the idea of having the type system help out with managing the
time units. However, as suggested by you and others, I think that should
be handled as a separate issue. Please file a bug for it - or let me
know if I should file the bug.
To get this email thread back on track - I'm still looking for reviews
for the latest webrev:
http://cr.openjdk.java.net/~brutisso/8074037/webrev.03/
Thanks,
Bengt
>
> Thanks,
> Erik
>
> On 05/03/15 01:26, Kirk Pepperdine wrote:
>> Hi Erik,
>>
>> In principle I agree that times should be unified. However I am concerned that we maintain backwards compatibility and only making changes to the current format if there is a compelling reason to do My reason for this is that making little piece mean changes here and there really complicates the downstream tool chain. If you were to make one big sweeping change that fixed this every where once and for all, I’d be happy to take the downstream hit. But doing this small bits at a time would be worse than fixing this particular “problem”.
>>
>> Kind regards,
>> Kirk Pepperdine
>>
>> On Mar 4, 2015, at 10:16 PM, Erik Österlund <erik.osterlund at lnu.se> wrote:
>>
>>> Hi Bengt,
>>>
>>> I see there are changes concerning time units and conversions, milliseconds vs seconds. This has bugged me too so thought I'd just add my opinion in this matter.
>>>
>>> Whenever dealing with a logical units such as time (seconds, milliseconds, etc) or size (words, bytes) that are mixed all over the place (sometimes for good reasons), I find it less error prone to let the type system help out automatically by annotating the types with their logical unit. As motivation, in this review there have been some human errors regarding time units, and this easily happens when conversions are manually performed.
>>>
>>> So personally I would have defined a Time type, letting time have both a physical type (e.g. double, float) AND a logical unit annotation (e.g. seconds, milliseconds) looking something like this:
>>>
>>> enum TimeUnit {
>>> nanoseconds = 1000000000,
>>> microseconds = 1000000,
>>> milliseconds = 1000,
>>> seconds = 1
>>> };
>>>
>>> template <TimeUnit T, typename S>
>>> class Time {
>>> S _quantity;
>>>
>>> static inline S convert(const S quantity, TimeUnit from, TimeUnit to) {
>>> if (from == to) {
>>> return quantity;
>>> } else {
>>> return quantity * S(to) / S(from);
>>> }
>>> }
>>> public:
>>> Time(S quantity) : _quantity(quantity) {}
>>> template <TimeUnit U, typename V>
>>> Time(Time<U, V> other) : _quantity(convert(other.value(), U, T)) {}
>>> template <TimeUnit U>
>>> Time<U, S> operator =(Time<U, S> other) { _quantity = convert(other.value(), U, T); }
>>>
>>> S value() const { return _quantity; }
>>> operator S() const { return value(); }
>>> Time<T, S> operator +(Time<T, S> other) const { return value() + other.value(); }
>>> Time<T, S> operator -(Time<T, S> other) const { return value() - other.value(); }
>>> };
>>>
>>> ...and then with some convenient type aliases:
>>>
>>> typedef Time<seconds, double> time_default_t;
>>> typedef Time<seconds, double> time_secs_t;
>>> typedef Time<milliseconds, double> time_millis_t;
>>> typedef Time<microseconds, double> time_micros_t;
>>> typedef Time<microseconds, double> time_nanos_t;
>>>
>>> ...you could do stuff like this:
>>>
>>> time_millis_t time_ms = 100.0; // value is 100.0
>>> double time_prim_ms = time_ms; // value is 100.0
>>> Time<seconds, float> time_secs = time_ms; // value is 0.1f; automatically converts [milliseconds, double] to [seconds, float]
>>> time_default_t time_default = time_ms + time_secs; // value is 0.2; default unit is [seconds, double]
>>>
>>> ...and whenever there is a mismatch between time units that you send and the units expected in a function, it will automatically be converted to the appropriate expected type/unit. If you don't want to convert units automatically because you are scared of performance issues or something, then the type system can still at the very least help finding implicit conversions that you might want to eliminate automatically and reveal bugs related to such unit mismatches rapidly at compile time instead of after error prone manual verification.
>>>
>>> I don't know if this type of solution is interesting to you as maybe you want a single universal time unit to be used instead (such as seconds) in order to solve the problem, but thought I'd throw it out there anyway. Oh and even then I think it's clearer in a solution like this what the default time unit is supposed to be by just introspecting time_default_t instead of just "double". What do you think?
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 04/03/15 16:43, Bengt Rutisson wrote:
>>>> Hi again,
>>>>
>>>> Yet another update to this webrev. I realized that I can move the
>>>> WorkerDataArray class from the hpp file to the cpp file. I think that's
>>>> worth doing to show that it is just in internal class.
>>>>
>>>> Also, I renamed the GCPhase* enums and variables to GCParPhase* to make
>>>> it possible to use the GCPhase* names for a possible future extension to
>>>> store the "single time phases" in something similar to the _gc_phases array.
>>>>
>>>> Here's an updated webrev:
>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.03/
>>>>
>>>> And here's the diff compared to the last one:
>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.02-03.diff/
>>>>
>>>> Thanks,
>>>> Bengt
>>>>
>>>>
>>>> On 2015-03-03 17:10, Bengt Rutisson wrote:
>>>>> Hi all,
>>>>>
>>>>> After some testing I noticed that I had done all my testing with
>>>>> StringDedup enabled. The code did not work properly with it disabled
>>>>> since StringDedup reported times for full GCs and concurrent marking
>>>>> even though it was not logged anywhere. I changed the code to only
>>>>> report times when the times will be reported as part of a GC.
>>>>>
>>>>> Here's an updated webrev:
>>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.02/
>>>>>
>>>>> And a diff compared to the last review:
>>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.01-02.diff/
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>> On 2015-03-02 17:15, Bengt Rutisson wrote:
>>>>>> Hi again,
>>>>>>
>>>>>> Here is an updated webrev:
>>>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.01/
>>>>>>
>>>>>> And here's the diff compared to the last webrev:
>>>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.00-01.diff/
>>>>>>
>>>>>> I tried to clean up the seconds vs. milliseconds confusion by making
>>>>>> it G1GCPhaseTimes' responsibility to handle the conversion in all
>>>>>> cases. So, the interface is now that times are reported in seconds
>>>>>> (for example G1GCPhaseTimes::record_time_secs()) but returned in
>>>>>> milliseconds (for example G1GCPhaseTimes::average_time_ms()). I
>>>>>> suffixed the method names "_secs" and "_ms" to make it easier to
>>>>>> follow the code.
>>>>>>
>>>>>> I'm not very happy about this inconsistency, but I want it to be an
>>>>>> intermediate step on the way to getting rid of os::elapsedTime()
>>>>>> completely.
>>>>>>
>>>>>> StefanK also suggested to move the logging code out to its own class.
>>>>>> This simplified the logging quite a bit I think. The class,
>>>>>> G1GCPhasePrinter, has to know a lot about both G1GCPhaseTimes and
>>>>>> WorkerDataArray, but I still think it is nice to have the logging
>>>>>> separated out.
>>>>>>
>>>>>> I didn't read Thomas' email with comments until after I had prepared
>>>>>> this updated webrev. But I think I have addressed the issues brought
>>>>>> up there. Thanks for looking at this, Thomas!
>>>>>>
>>>>>> Bengt
>>>>>>
>>>>>> On 2015-03-02 09:23, Bengt Rutisson wrote:
>>>>>>> Hi Kim,
>>>>>>>
>>>>>>> Thanks for looking at this!
>>>>>>>
>>>>>>> On 2015-02-28 00:43, Kim Barrett wrote:
>>>>>>>> On Feb 27, 2015, at 10:38 AM, Bengt Rutisson
>>>>>>>> <bengt.rutisson at oracle.com> wrote:
>>>>>>>>> Can I have a couple of reviews for this cleanup change?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~brutisso/8074037/webrev.00/
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8074037
>>>>>>>>>
>>>>>>>>> […]
>>>>>>>>> The patch also introduces a new class called
>>>>>>>>> G1GCPhaseTimesTracker, which is a scoped object to simplify time
>>>>>>>>> measurements. I've also tried to move conversion of times down to
>>>>>>>>> when we print the log messages instead of when we store them. This
>>>>>>>>> removes several "* 1000.0" in the code. Eventually I would like to
>>>>>>>>> replace os::elapsedTime() with os::elapsed_counter(). But that
>>>>>>>>> should be done as a separate patch.
>>>>>>>> Before I proceed any further with this review, I feel like I'm getting
>>>>>>>> lost in unit changes. In some places values of msecs used to be used
>>>>>>>> but now we're using secs, right? But it seems like there might be some
>>>>>>>> confusion about that, and I'm not sure whether it's me or the code
>>>>>>>> that's confused.
>>>>>>>>
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp
>>>>>>>> 1079 cost_per_card_ms =
>>>>>>>> phase_times()->average_time(G1GCPhaseTimes::UpdateRS) / (double)
>>>>>>>> _pending_cards;
>>>>>>>> 1080 _cost_per_card_ms_seq->add(cost_per_card_ms);
>>>>>>>>
>>>>>>>> I think the values that have been recorded in the phase_times are
>>>>>>>> secs, and average_time appears to just use the values as is. So
>>>>>>>> shouldn't that now be cost_per_card_secs? And the value added to
>>>>>>>> _cost_per_card_ms_seq needs to be cost_per_card_secs * 1000.0?
>>>>>>> Right! Good catch! I was so happy to get rid of some of the "*
>>>>>>> 1000.0" statements sprinkled all over the G1 code that I forgot to
>>>>>>> add some of them back ;)
>>>>>>>
>>>>>>> I'll send out an updated webrev soon.
>>>>>>>
>>>>>>> Bengt
>>>>>>>
>>>>>>>> Oh how I wish for a units package!
>>>>>>>>
>>>>
More information about the hotspot-gc-dev
mailing list