RFR 8222671 : thread_large/thread_large.java times out on MacOSX
David Holmes
david.holmes at oracle.com
Thu Aug 27 02:08:48 UTC 2020
Hi Gerard,
I would have preferred the @bug to remain for traceability purposes
(without having to check repo history).
I would also have kept the main/othervm as this is something of a stress
test and I would be concerned that the hosting VM might be de-stabilised
in some way when a huge number of threads are created.
But I don't insist on any further changes and this test has already
consumed far too many resources IMO so feel free to push as-is.
Thanks,
David
On 20/08/2020 5:55 am, gerard ziemski wrote:
> hi Igor,
>
> I have responses to your questions inline:
>
> bug link at https://bugs.openjdk.java.net/browse/JDK-8222671
> open webrev at http://cr.openjdk.java.net/~gziemski/8222671_rev5
> testing: Mach5 hs-tier1,2,3,4,5 in progress...
>
>
> On 8/19/20 12:25 PM, Igor Ignatyev wrote:
>> Hi Gerard,
>>
>> please see my comments inline.
>>
>> -- Igor
>>
>>> On Aug 19, 2020, at 9:53 AM, gerard ziemski
>>> <gerard.ziemski at oracle.com> wrote:
>>>
>>> hi Igor,
>>>
>>> Thank you for the review. Here is the webrev, but please see my
>>> comments inline for more:
>>>
>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8222671
>>> open webrev at http://cr.openjdk.java.net/~gziemski/8222671_rev4
>>> testing: Mach5 hs-tier1,2,3,4,5 in progress...
>>>
>>>
>>>
>>> On 8/18/20 12:18 PM, Igor Ignatyev wrote:
>>>> Hi Gerard,
>>>>
>>>> looks be
>>>> 1.
>>>>> 26 * @bug 8222671
>>>> @bug tag should refer to a product bug which can be reproduced w/
>>>> this test, not the bug id used to push a test.
>>> The bug 8222671 is the issue here, i.e. timeouts in the existing
>>> test, which is what we are trying to fix, so we have a new test that
>>> replaces the old and less robust one. I did try to find an issue that
>>> originated the old test and I couldn’t, but if you know how to find,
>>> I can add a reference to it in the new test. To me, the old test
>>> looks more like a general functionality test though.
>> I'd just remove @bug tag, b/c from my point of view listing all test
>> bugs which caused modification of the test in @bug is just useless and
>> pollute metadata. others have other opinion though, so I don't insist
>> on removal.
> You are the expert here, removed.
>
>
>>>> 2.
>>>>> 28 * @run main/othervm ThreadCountLimit
>>>> what's the reason to run in othervm mode?
>>> Oh, I just copied that from the existing test, without giving it any
>>> thought - I will remove it.
>> in hotspot tests, we tend not to relay on implicit @run (as it's error
>> prone) and always have explicit `@run`, that's to say, please add
>> `@run main ThreadCountLimit`
> Is there a template for a new test, or a wiki page that explains all
> this? I learned quite a bit writing this test, but it would be nice to
> have some documentation I could use as a reference in the future.
>
>
>>>> 3.
>>>>> 34 public class ThreadCountLimit extends Object {
>>>> you don't need to explicitly extend Object.
>>> True, but I like the code explicit like that if that’s OK.
>> fine w/ me, but it's kinda against regular java code guidelines and
>> doesn't bring any information.
> Fixed.
>
>
>>>> 4. in ThreadCountLimit::run()
>>>>> 67 } catch (InterruptedException ex) {
>>>>> 68 }
>>>> shouldn't this mark the test as failed?
>>> Good question. In a perfect world we would want to know if a thread
>>> got unexpectedly interrupted, but realistically a process might
>>> (will?) exhibit spurious interruptions from the OS sooner or later,
>>> which are not the focus of this test, so instead of marking it a
>>> failure and making someone look into it, only later to determine that
>>> it was just a spurious interrupt, we simply ignore it and focus on
>>> the core goal of this test instead.
>> I'm not sure what you are refer here by 'spurious interruptions', java
>> threads must be interrupted only if someone interrupted them, all
>> other behaviors are JVM bugs and are to be analyzed. given the goal of
>> this test is to check how JVM behaves when lots of threads have been
>> created and started, I see catching unexpected InterruptedException as
>> a valid failure mode of this test.
> I think I might be mixing up native wait/notify on mutexes here, where a
> wait on a native mutex can be spuriously awaken.
>
> According to
> https://www.ibm.com/developerworks/library/j-jtp05236/index.html Thread
> interruption is a cooperative mechanism, and since we are not
> interrupting any of the blocking methods we use in the test ourselves,
> it would be really curious if we ever did get InterruptedException, so
> it might be worth it to report it.
>
> I tightened up the blocking code and its try/catch code block, to make
> it to clear what is the subject to interrupts and we will now throw an
> Error if we ever catch one (never?) Have we ever seen one of those
> unexpected interrupts in our testing so far?
>
>
> Thank you for the feedback.
>
>
> cheers
>
>>>
>>> I do see, however, that in one place I report it as Error and ignore
>>> it in the other instance, so I fixed that inconsistency. I added
>>> comments as well.
>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>>>> On Aug 18, 2020, at 7:38 AM, gerard ziemski
>>>>> <gerard.ziemski at oracle.com> wrote:
>>>>>
>>>>> Thank you David for the review!
>>>>>
>>>>> I incorporated all your feedback.
>>>>>
>>>>> I'm surprised that "assert" is not available by default in our
>>>>> tests though - what is a reason for that? Of all the places where
>>>>> asserts should be available, a testing suite would seem to be it to
>>>>> me.
>>>>>
>>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8222671
>>>>> open webrev at http://cr.openjdk.java.net/~gziemski/8222671_rev3
>>>>> testing: passes locally, Mach5 hs-tier1,2,3,4,5 in progress...
>>>>>
>>>>>
>>>>> cheers
>>>>>
>>>>> On 8/18/20 12:15 AM, David Holmes wrote:
>>>>>> Hi Gerard,
>>>>>>
>>>>>> On 18/08/2020 1:28 am, gerard ziemski wrote:
>>>>>>> Thank you David and Igor for the feedback!
>>>>>>>
>>>>>>> I rewrote the test again, this time basing it off
>>>>>>> test/hotspot/jtreg/runtime/Thread/* tests.
>>>>>>>
>>>>>>> For macOS and Linux we create threads until we hit a process
>>>>>>> limit thread count. Windows, however, allows unlimited (?) number
>>>>>>> of threads, though the process gradually slows down as the count
>>>>>>> grows, so I had to add time limit and I set it to 5 sec. I chose
>>>>>>> 5 sec because the value is well below jtreg time out (i.e. 120s),
>>>>>>> but still quite above how long my laptop needs to hit the process
>>>>>>> thread count limit (i.e. my Mac laptop needs about 1 sec to hit
>>>>>>> that limit).
>>>>>>>
>>>>>>> Otherwise there are no other hardwired values or assumptions (I
>>>>>>> hope). What do you think of the test now?
>>>>>> Not unreasonable :) It could still have tried for an upper limit
>>>>>> of 10000 instead of using the time-limit, but it makes no real
>>>>>> difference.
>>>>>>
>>>>>> A few nits/suggestions:
>>>>>>
>>>>>> Please add an @bug line to the test comment.
>>>>>>
>>>>>> 49 if ((this.index % 250) == 0) {
>>>>>>
>>>>>> Outside of the constructor it is poor style to use this.x to refer
>>>>>> to fields methods (unless needed due to local variable shadowing).
>>>>>> This applies throughout the test.
>>>>>>
>>>>>> 54 doWork();
>>>>>>
>>>>>> Not sure why you factored things out the way you did - everything
>>>>>> could go in run().
>>>>>>
>>>>>> 61 setName(""+this.index);
>>>>>>
>>>>>> This could go in constructor, but I presume it adds something to
>>>>>> the workload? Is the use of ""+index deliberate in that regard
>>>>>> (generate string concatenation work), rather than using
>>>>>> String.valueOf(index)?
>>>>>>
>>>>>> 65 assert(this.index == Integer.parseInt(getName()));
>>>>>>
>>>>>> We never run tests with Java asserts enabled so this is "dead" code.
>>>>>>
>>>>>> 87 // Windows path
>>>>>>
>>>>>> Windows path or system with very large ulimit?
>>>>>>
>>>>>> 92 } catch (OutOfMemoryError e) {
>>>>>>
>>>>>> I suggest checking this is actually the "unable to create native
>>>>>> thread" OOME rather than a general Java OOME (if run with a small
>>>>>> heap for example you might get OOME growing the ArrayList).
>>>>>>
>>>>>> 110 String cmd = System.getProperty("sun.java.command");
>>>>>> 111 if (cmd != null &&
>>>>>> !cmd.startsWith("com.sun.javatest.regtest.agent.MainWrapper")) {
>>>>>> 112 // Exit with success in a non-JavaTest environment:
>>>>>> 113 System.exit(0);
>>>>>> 114 }
>>>>>>
>>>>>> I never noticed when this was used in the little set of
>>>>>> runtime/Thread tests! I don't know what the rationale was but it
>>>>>> is completely unnecessary as a normal return from main would be
>>>>>> the equivalent of exit(0) - so the tests work fine without
>>>>>> System.exit when run standalone. Please don't emulate this part of
>>>>>> the existing tests. I'll file a RFE to remove it from the other
>>>>>> tests.
>>>>>>
>>>>>> I won't be around for any further follow up for a week, so feel
>>>>>> free to proceed once Igor is okay with things.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8222671
>>>>>>> open webrev at http://cr.openjdk.java.net/~gziemski/8222671_rev2
>>>>>>> testing: Mach5 hs-tier1,2,3,4,5 in progress...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/7/20 2:41 PM, Igor Ignatyev wrote:
>>>>>>>>> On Aug 6, 2020, at 12:33 AM, David Holmes
>>>>>>>>> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>> wrote:
>>>>>>>>>
>>>>>>>>> Basing the new test around the existing thread stress tests was
>>>>>>>>> not unreasonable but I don't think we're actively adding to
>>>>>>>>> those tests. I would check with Igor whether it is better to
>>>>>>>>> just add this under runtime/thread rather than vmTestBase.
>>>>>>>> Hi Gerard,
>>>>>>>>
>>>>>>>> David is right, we are trying to get rid of vmTestbase directory
>>>>>>>> and move all tests into corresponding component directories
>>>>>>>> cleaning them up at the same time. so runtime/thread directory
>>>>>>>> sound like a much better location for a new test.
>>>>>>>>
>>>>>>>> I don't think you need to cary all this baggage of 'VM testbase
>>>>>>>> keywords', 'converted from ...' and the rest, I'd suggest you
>>>>>>>> just use the text from 'DESCRIPTION' as @summary. I'd also
>>>>>>>> remove System.exit call and would just make thread009::run to
>>>>>>>> return bool. it's also unclear to me if someone is ever going to
>>>>>>>> change THREADS_EXPECTED , TIMEOUT, and DEBUG_MODE values, I'd
>>>>>>>> just hardcode them in the test and get rid of parsing. it seems
>>>>>>>> that debug output won't clutter output, so you might want to
>>>>>>>> always print them and remove DEBUG_MODE field.
>>>>>>>>
>>>>>>>> using `assert` in test code isn't a good idea as it will work
>>>>>>>> only if someone runs test w/ `-ea` which not all people do. it's
>>>>>>>> better to replace w/ `if (p) throw new Error(...)` or use
>>>>>>>> jdk.test.lib.Asserts class.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> -- Igor
>
More information about the hotspot-runtime-dev
mailing list