RFR: 8273315: Parallelize and increase timeouts for java/foreign/TestMatrix.java test

Tue Sep 7 14:34:37 UTC 2021

On Fri, 3 Sep 2021 09:53:53 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This test runs a lot of configurations, and spends a lot of time serially. This is especially pronounced when run in prospective tier4 runs (JDK-8273314). There are reports of multi-hour runs (see JDK-8271613). We can parallelize the test configurations for this test to make it hurt less. Also, timeouts need to be increased for `TestUpcall` test configs, because some of them are very slow in fastdebug mode. 
> 
> Sample run:
> 
> 
> $ time CONF=linux-x86_64-server-fastdebug make run-test TEST=java/foreign/TestMatrix.java | ts -s
> 00:00:00 Building target 'run-test' in configuration 'linux-x86_64-server-fastdebug'
> 00:00:03 Test selection 'java/foreign/TestMatrix.java', will run:
> 00:00:03 * jtreg:test/jdk/java/foreign/TestMatrix.java
> 00:00:03 
> 00:00:03 Running test 'jtreg:test/jdk/java/foreign/TestMatrix.java'
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TTFT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FFFF
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FFTF
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TTTT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FTTT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FFTT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FFFT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TTTF
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FTTF
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TFFT
> 00:00:31 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TFTF
> 00:00:32 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TFFF
> 00:00:32 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FTFF
> 00:00:35 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TFTT
> 00:00:35 Passed: java/foreign/TestMatrix.java#UpcallHighArity-TTFF
> 00:00:38 Passed: java/foreign/TestMatrix.java#UpcallHighArity-FTFT
> 00:01:50 Passed: java/foreign/TestMatrix.java#Downcall-FF
> 00:02:27 Passed: java/foreign/TestMatrix.java#Downcall-TF
> 00:03:03 Passed: java/foreign/TestMatrix.java#Downcall-FT
> 00:03:47 Passed: java/foreign/TestMatrix.java#Downcall-TT
> 00:04:17 Passed: java/foreign/TestMatrix.java#Upcall-FTFF
> 00:04:23 Passed: java/foreign/TestMatrix.java#Upcall-TFFF
> 00:05:46 Passed: java/foreign/TestMatrix.java#Upcall-TTFF
> 00:06:03 Passed: java/foreign/TestMatrix.java#Upcall-TFFT
> 00:06:44 Passed: java/foreign/TestMatrix.java#Upcall-FTFT
> 00:08:24 Passed: java/foreign/TestMatrix.java#Upcall-TFTF
> 00:08:39 Passed: java/foreign/TestMatrix.java#Upcall-TTFT
> 00:09:16 Passed: java/foreign/TestMatrix.java#Upcall-FTTF
> 00:09:19 Passed: java/foreign/TestMatrix.java#Upcall-TFTT
> 00:10:01 Passed: java/foreign/TestMatrix.java#Upcall-FTTT
> 00:10:37 Passed: java/foreign/TestMatrix.java#Upcall-TTTF
> 00:12:16 Passed: java/foreign/TestMatrix.java#Upcall-TTTT
> 00:12:17 Test results: passed: 32
> 00:12:21 
> 00:12:21 ==============================
> 00:12:21 Test summary
> 00:12:21 ==============================
> 00:12:21    TEST                                              TOTAL  PASS  FAIL ERROR   
> 00:12:21    jtreg:test/jdk/java/foreign/TestMatrix.java          32    32     0     0   
> 00:12:21 ==============================
> 
> real	12m20.538s
> user	131m54.043s
> sys	0m59.944s
> 
> 
> If we don't parallelize, then those 130 minutes are spent serially.

So, what is the policy for defining developer tests that are not meant to be run on every build infra, but are meant to be run on a more casual basis by developers working in a particular area? When we added TestMatrix we made sure to exclude it from the relevant groups. I suspect other tests might have followed the same approach. But if we have a test group that automatically catches all excluded tests, and folks start running this "excluded tests" group by default, what are developers supposed to do?

I guess there's a reason why tests might not have been included in a tier. Defining a blanket rule that re-adds all excluded tests seems like a questionable move to me. With this in mind, developers will probably refrain in the future from pushing such tests to OpenJDK altogether, and store them somewhere private, which doesn't seem great.

So, I understand you have a fix that parallelizes the test execution (great!), but it seems like we're talking past each other a bit, in the sense that you (or anyone else) should never have picked up this test in an automatic test run in the first place.

Also, execution time is only part of the picture, albeit the most visible part, since it causes timeouts and failures. But what about CPU cycles? Sure, if we parallelize, we can get better execution time, but you still end up spending CPU cycles on a test that you are not meant to run in the first place. Is this the right thing to do?
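
To put rough numbers on the wall-clock versus CPU distinction, here is a small sketch that takes the `real`, `user` and `sys` figures from the sample run quoted above and compares them. The helper name `to_seconds` is made up for illustration, and "average busy cores" is just CPU time divided by wall-clock time, a back-of-the-envelope estimate and not a measurement of the machine.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '12m20.538s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the sample run in the quoted PR description.
real = to_seconds("12m20.538s")
user = to_seconds("131m54.043s")
sys_ = to_seconds("0m59.944s")

# Total CPU time consumed, regardless of how many cores shared the work.
cpu_total = user + sys_

# Rough estimate of how many cores were kept busy on average.
avg_busy_cores = cpu_total / real

print(f"wall clock: {real / 60:.1f} min")
print(f"CPU time:   {cpu_total / 60:.1f} min")
print(f"average busy cores: {avg_busy_cores:.1f}")
```

Parallelizing shrinks the first number; the second stays roughly the same either way, which is the cost I am asking about.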

I believe that, ultimately, this is a (test) policy issue, which should be discussed accordingly on an OpenJDK mailing list.

-------------

PR: https://git.openjdk.java.net/jdk/pull/5358