RFR 8184445: JShell tests: fail intermittently if tests are run in high concurrent mode.
Robert Field
robert.field at oracle.com
Thu Mar 1 06:12:27 UTC 2018
An updated version with only the launching tests in exclusiveAccess and
the common harness code moved up into lib:
http://cr.openjdk.java.net/~rfield/8184445v1.webrev/
Warning: this change takes the fix from tiny to massive.
Much effort has already been put into hardening JShell's launching
networking code. There is probably more that could be done, but I
wouldn't know what.
-Robert
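
For readers unfamiliar with the mechanism discussed in this thread: jtreg
reads an exclusiveAccess.dirs entry from the test suite's TEST.ROOT file, and
tests in a listed directory are not run concurrently with one another. The
fragment below is only a sketch. The directory name is hypothetical and is not
taken from the webrev; the actual layout is whatever the linked change uses.

```
# TEST.ROOT (fragment)
# Hypothetical directory holding only the launching tests (assumed name).
exclusiveAccess.dirs=jdk/jshell/launch
```

Tests outside the listed directory remain eligible for concurrent execution,
which is the point of segregating only the launching tests.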
On 02/27/18 18:14, Joseph D. Darcy wrote:
> Hi Robert,
>
> I'd prefer if only the launching tests and other known failures were
> segregated into a non-concurrent area. That would still let ~3/4 of
> the tests proceed normally.
>
> As a follow-up, can an RFE be filed to harden the intermittently
> failing tests against concurrent networking?
>
> Thanks,
>
> -Joe
>
> On 2/27/2018 12:21 PM, Robert Field wrote:
>> OK, I did a survey of all the JShell bugs. There are over a dozen
>> intermittent test failures, almost all of which are probably
>> network-related. But if we limit this to intermittent failures to
>> launch, then there are seven.
>>
>> There are 17 tests of launching configuration, and 75 'normal'
>> tests. So, the launching configuration tests do fail
>> disproportionately, 3 mentioned failures vs 5 mentioned failing files.
>>
>> The bug that highlighted the concurrent testing -- "JShell tests:
>> fail intermittently if tests are run in high concurrent mode":
>> https://bugs.openjdk.java.net/browse/JDK-8184445
>> mentioned 'several' issues; the two included JTR files are,
>> tellingly, normal tests.
>>
>> The non-launching intermittent failures are all normal tests.
>>
>> So, where does that leave us? I could reduce the failures a bit at
>> low time-cost by putting the launching configuration tests in the
>> exclusiveAccess.dirs. Or, I could, at considerable testing cost,
>> address the broad swath.
>>
>> -Robert
>>
>> On 02/26/18 17:28, joe darcy wrote:
>>> Hi Robert,
>>>
>>> On 2/26/2018 10:57 AM, Robert Field wrote:
>>>>
>>>>
>>>> On 02/26/18 10:23, joe darcy wrote:
>>>>> Hi Robert,
>>>>>
>>>>> The fix looks acceptable in terms of addressing the problem, but
>>>>> is there a sense of how this might impact running time of the test
>>>>> suite?
>>>>>
>>>>> Phrased differently, are there plans to make the tests more robust
>>>>> to concurrent runs in the future?
>>>>
>>>> Hi Joe,
>>>>
>>>> There is a lot of network connection happening in these tests, most
>>>> of which is in layers we don't control (JDI). We have been trying
>>>> to lower the risk and we don't see failures running the tests
>>>> ourselves, but intermittent failures scattered through the suite
>>>> during testing (e.g. mach5) have been a constant problem.
>>>>
>>>> We will see the impact on test duration. The default connection has
>>>> three-level fail-over; the tests of other connection modes see
>>>> failures far more frequently, so, if necessary, we can look at
>>>> tuning this.
>>>>
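
The fail-over idea mentioned above can be illustrated generically: try each
connection strategy in order and return the first one that succeeds. This is
a minimal sketch, not JShell's implementation. The class FailOverSketch, the
method firstSuccessful, and the three stand-in "levels" are all hypothetical
names introduced for illustration.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;

public class FailOverSketch {

    /**
     * Tries each attempt in order and returns the first successful result.
     * If every attempt fails, throws one exception carrying each failure
     * as a suppressed exception. Hypothetical helper, for illustration.
     */
    static <T> T firstSuccessful(List<Callable<T>> attempts) throws Exception {
        Exception failures = new Exception("all attempts failed");
        for (Callable<T> attempt : attempts) {
            try {
                return attempt.call();
            } catch (Exception e) {
                failures.addSuppressed(e); // remember why this level failed
            }
        }
        throw failures;
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for three connection strategies; the first two fail.
        List<Callable<String>> attempts = List.of(
            () -> { throw new IOException("level 1 failed"); },
            () -> { throw new IOException("level 2 failed"); },
            () -> "level 3 connected");
        System.out.println(firstSuccessful(attempts));
    }
}
```

A connection mode tested on its own has no later level to fall back on, which
is consistent with the observation that tests of the other connection modes
fail more often.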
>>>
>>> From some quick checking, there are about 80 tests in that
>>> directory. From one sample point on my laptop, the tests took a good
>>> long while to run. If some of the tests can be reliably run
>>> concurrently, I'd much prefer to see a subset of tests moved to a
>>> sheltered directory.
>>>
>>> Thanks,
>>>
>>> -Joe
>>
>
More information about the kulla-dev mailing list