RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms
David Holmes
david.holmes at oracle.com
Tue Jul 30 12:12:13 UTC 2019
Hi Matthias,
On 30/07/2019 9:25 pm, Baesken, Matthias wrote:
> Hello JC / David, here is a second webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/
>
> It moves the thread dump execution into a method
> executeThreadDumps(long) , and also adds while loops (but with a
> limitation for the number of thread dumps, really don’t
> want to cause timeouts etc.). I removed a check for
> MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE .
I don't think executeThreadDumps is worth factoring out like out.
The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it
remains a constant 100, and then you set a simple loop iteration count
limit. Further with the proposed code when you get here:
85 NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2;
you don't even know what value you may be starting with.
But I was thinking of simply:
long value = 0;
do {
Thread.getAllStackTraces();
value = mbean.getTotalSafepointTime();
} while (value == 0);
We'd only hit a timeout if something is completely broken - which is fine.
Overall tests like this are not very useful, yet very fragile.
Thanks,
David
> Hope you like this version better.
>
> Best regards, Matthias
>
> *From:*Jean Christophe Beyler <jcbeyler at google.com>
> *Sent:* Dienstag, 30. Juli 2019 05:39
> *To:* David Holmes <david.holmes at oracle.com>
> *Cc:* Baesken, Matthias <matthias.baesken at sap.com>;
> hotspot-dev at openjdk.java.net; serviceability-dev
> <serviceability-dev at openjdk.java.net>
> *Subject:* Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails
> on fast Linux machines with Total safepoint time 0 ms
>
> Hi Matthias,
>
> I wonder if you should not do what David is suggesting and then put that
> whole code (the while loop) in a helper method. Below you have a
> calculation again using value2 (which I wonder what the added value of
> it is though) but anyway, that value2 could also be 0 at some point, no?
>
> So would it not be best to just refactor the getAllStackTraces and
> calculate safepoint time in a helper method for both value / value2
> variables?
>
> Thanks,
>
> Jc
>
> On Mon, Jul 29, 2019 at 7:50 PM David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Hi Matthias,
>
> On 29/07/2019 8:20 pm, Baesken, Matthias wrote:
> > Hello , please review this small test fix .
> >
> > The test
> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java
> fails sometimes on fast Linux machines with this error message :
> >
> > java.lang.RuntimeException: Total safepoint time illegal value: 0
> ms (MIN = 1; MAX = 9223372036854775807)
> >
> > looks like the total safepoint time is too low currently on these
> machines, it is < 1 ms.
> >
> > There might be several ways to handle this :
> >
> > * Change the test in a way that it might generate nigher
> safepoint times
> > * Allow safepoint time == 0 ms
> > * Offer an additional interface that gives safepoint times
> with finer granularity ( currently the HS has safepoint time values
> in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp
> SafepointTracing::end
> >
> > But it is converted on ms in this code
> >
> > 114jlong RuntimeService::safepoint_time_ms() {
> > 115 return UsePerfData ?
> > 116
> Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1;
> > 117}
> >
> > 064jlong Management::ticks_to_ms(jlong ticks) {
> > 2065 assert(os::elapsed_frequency() > 0, "Must be non-zero");
> > 2066 return (jlong)(((double)ticks /
> (double)os::elapsed_frequency())
> > 2067 * (double)1000.0);
> > 2068}
> >
> >
> >
> > Currently I go for the first attempt (and try to generate
> higher safepoint times in my patch) .
>
> Yes that's probably best. Coarse-grained timing on very fast machines
> was bound to eventually lead to problems.
>
> But perhaps a more future-proof approach is to just add a do-while loop
> around the stack dumps and only exit when we have a non-zero safepoint
> time?
>
> Thanks,
> David
> -----
>
> > Bug/webrev :
> >
> > https://bugs.openjdk.java.net/browse/JDK-8228658
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/
> >
> >
> > Thanks, Matthias
> >
>
>
> --
>
> Thanks,
>
> Jc
>
More information about the serviceability-dev
mailing list