RFR [14] 8151678: com/sun/jndi/ldap/LdapTimeoutTest.java failed due to timeout on DeadServerNoTimeoutTest is incorrect
Martin Buchholz
martinrb at google.com
Fri Sep 6 19:02:38 UTC 2019
Martin's rules for test methods:
- never have more than 100 millisecond total expected runtime
- never have more than 12 millisecond blocking time for any single operation
- yet be able to survive a one-time 5 second thread suspension at any time
---
adjusting wait time looks like an anti-pattern to me. You can't stop a
rare 5-second thread suspension at line 387.
385 final long waitTime = hiMillis -
386 NANOSECONDS.toMillis(System.nanoTime() - startNanos);
// (2) adjust wait time
387
388 try {
389 T r = task.get(waitTime, MILLISECONDS); // (3) wait for
the task to complete
On Fri, Sep 6, 2019 at 9:33 AM Pavel Rappo <pavel.rappo at oracle.com> wrote:
> > On 6 Sep 2019, at 15:59, Martin Buchholz <martinrb at google.com> wrote:
> >
> > I took another look at LdapTimeoutTest.java.
>
> Many thanks!
>
> > I was surprised to see RIGHT_MARGIN. In jsr166 we succeed in having one
> timeout that is long enough to "never happen". I'm still advocating the 10
> second value.
> >
> > I was surprised to see LEFT_MARGIN. We just fixed Thread.sleep, so we
> have no known problems with JDK methods returning early - you can trust
> timed get!
> > You start measuring, by calling nanoTime, before you start the activity
> you are measuring, so there should be no need for LEFT_MARGIN.
>
> You raised many good points. Let me try to address them.
>
> 1. RIGHT_MARGIN is not used for checking that the activity has stuck
> infinitely (assertIncompletion). INFINITY_MILLIS is used for that.
> RIGHT_MARGIN is used for checking that the activity takes some predefined
> amount of time (roughly).
>
> 2. As for the numeric value of INFINITY_MILLIS, the reason I chose 20
> seconds is twofold. Firstly, the code under test is subject to different
> timeouts. Every timeout value should be distinctive. Otherwise, how would I
> differentiate between them? For example, if I chose INFINITY_MILLIS to be
> 10 seconds how would I know if the code is stuck due to the read timeout of
> 10 seconds or the "infinite timeout"? Secondly, I must take into account
> slow machines. Really slow virtual machines. Hence, minimal timeouts
> (read/connect) have a magnitude of seconds and tens of seconds and the
> "rightmost", infinite timeout, is 20 seconds.
>
> 3. LEFT_MARGIN might no longer be needed due to the fact that no timed
> methods return early (actually there is a comment about it inside those two
> assert methods).
>
> > We have some fresh thread-awaiting code here:
> >
> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/tck/JSR166TestCase.java?view=markup#l1443
>
> Interesting.
>
> > Instead of communicating startTime from the test thread back to the main
> thread, I would do my loMillis checking in the test thread, and hiMillis
> checking in the main thread, like e.g. compare with a fresh test method
> testTimedOffer
> >
> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/tck/ArrayBlockingQueueTest.java?view=markup#l394
>
> Understood. However, that might be a matter of taste. I prefer to have all
> the calculations and error handling in one place. Unless there's a good
> reason I wouldn't change it.
>
> > Timeouts should be adjusted via Utils.adjustTimeout
>
> That makes perfect sense. I never knew this method existed. Thanks!
>
> > On Fri, Sep 6, 2019 at 4:31 AM Pavel Rappo <pavel.rappo at oracle.com>
> wrote:
> > Martin, thanks for having a look at it.
> >
> > I'd appreciate if you could have a look at the timeout measuring
> mechanics in assertCompletion/assertIncompletion specifically, maybe to
> spot something that is grossly inadequate.
> >
> > I tried to accommodate some usual suspects of timeout measurements
> failures. I understand that since we're not working with real-time systems,
> my attempts to build bullet-proof measurement mechanics are futile.
> >
> > -Pavel
> >
> > > On 30 Aug 2019, at 18:19, Martin Buchholz <martinrb at google.com> wrote:
> > >
> > > Not really a review, but:
> > >
> > > For many years we've been using 10 seconds (scaled by timeout factor)
> as a duration long enough that a timeout is a real failure.
> > > Which is close to your own 20 seconds. Of course, no value is surely
> safe.
> > >
> > > Probably, parallel testing infrastructure for timeouts should be a
> test library method. I do something similar in JSR166TestCase
> > >
> > > /**
> > > * Runs all the given actions in parallel, failing if any fail.
> > > * Useful for running multiple variants of tests that are
> > > * necessarily individually slow because they must block.
> > > */
> > > void testInParallel(Action ... actions) {
> > > ExecutorService pool = Executors.newCachedThreadPool();
> > > try (PoolCleaner cleaner = cleaner(pool)) {
> > >
> > > On Fri, Aug 30, 2019 at 6:23 AM Daniel Fuchs <daniel.fuchs at oracle.com>
> wrote:
> > > On 30/08/2019 13:54, Pavel Rappo wrote:
> > > > Updated,
> > > >
> > > > http://cr.openjdk.java.net/~prappo/8151678/webrev.01/
> > > >
> > >
> > > Changes look good!
> > >
> > > best regards,
> > >
> > > -- daniel
> >
>
>
More information about the core-libs-dev
mailing list