RFR (XS) 8220688: [TESTBUG] runtime/NMT/MallocStressTest.java timed out
coleen.phillimore at oracle.com
Thu May 23 11:13:24 UTC 2019
I've downloaded the logs from all the recent hangs of this test and
couldn't find it doing anything that deadlocks; it is just doing lots of
GC. I've updated the bug with comments. I still think that taking out
the sleep, which lets the allocator threads run until they OOM, will
make this test more reliable. I don't see any evidence of another bug in
the logs we have.
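
For readers without the webrev at hand, here is a minimal sketch of the "iterate rather than sleep" idea. It is not the actual change to MallocStressTest.java: the class name, the runWorkers helper, the iteration count and the byte[] allocation are all hypothetical stand-ins, and only the figure of 50 threads comes from this thread. Each worker does a fixed amount of work and exits, and the main thread joins the workers instead of sleeping for a fixed period.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class BoundedStressSketch {
    // Hypothetical helper: every worker performs a fixed number of small
    // allocations and then exits, so the run ends when the work is done
    // rather than when a sleep expires.
    static long runWorkers(int threadCount, int iterationsPerThread)
            throws InterruptedException {
        AtomicLong completed = new AtomicLong();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < iterationsPerThread; j++) {
                    // Stand-in for the real test's native allocation (assumption).
                    byte[] chunk = new byte[1024];
                    chunk[0] = 1;
                    completed.incrementAndGet();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join(); // wait for the workers to finish instead of sleeping
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 50 threads as discussed in this thread; 1000 iterations is arbitrary.
        System.out.println(runWorkers(50, 1000));
    }
}
```

With a bounded loop the total work is fixed, so the running time depends on how fast the machine gets through it and not on how much the allocator threads manage to do while the main thread sleeps.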
thanks,
Coleen
On 5/21/19 7:51 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 5/20/19 10:25 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> This is fine to push, but some discussion below ...
>>
>> On 21/05/2019 6:50 am, coleen.phillimore at oracle.com wrote:
>>> Summary: reduce number of threads and iterate rather than sleep.
>>
>> Reducing the load of this test may be reasonable ... though perhaps
>> it should be configured based on available CPUs rather than fixed
>> numbers (just a thought).
>
> I'm thinking even the minimal configuration can deal with 50 threads,
> which is why I chose it. It's not worth doing anything more
> complicated here.
>>
>> But this test rarely times out (and some failures of the test have
>> been incorrectly attributed to this bug - the 10 second attach
>> timeout is not at all the same issue as timing out after 1h25m!), so
>> your testing is unlikely to have encountered the conditions in which
>> the timeout actually manifests. The test normally runs in a few
>> minutes on linux so the 1h25m timeout is extreme and seems unlikely
>> to be due simply to the load the test produces. So I would not be
>> surprised if we continue to see occasional timeouts.
>
> I don't know about the attach timeouts which I've seen in lots of
> tests. This test uses jcmd to read the NMT info. There are several
> tests that do that in the test suite, but this particular test times
> out *a lot*. Looking at the test, the load is very high and there's
> a long sleep, so a variable attach response will trigger a timeout for
> this particular test. That is why I want to reduce the load and
> eliminate the long sleep. You can always open a new bug for the
> attach mechanism variation, if one doesn't already exist.
>
> Thanks,
> Coleen
>>
>> Further, the huge variation in execution time for this test across
>> different platforms (e.g. macOS takes 45 minutes!) suggests there may
>> be something else at play here.
>>
>> Thanks,
>> David
>>
>>> Ran the test 20x without a timeout, and it is faster now. Left
>>> /timeout in the test because it doesn't hurt.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2019/8220688.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8220688
>>>
>>> Thanks,
>>> Coleen
>