[13] RFR(xs) 8227646: [TESTBUG] appcds/SharedArchiveConsistency timed out
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Jul 17 13:41:59 UTC 2019
On 7/16/19 11:09 PM, Calvin Cheung wrote:
> Removing the fc.force(true) calls works and the test performs better
> on macosx. It takes about 20s for the test to finish without the call
> vs. up to 9 min. with the call.
>
> updated webrev:
>
> http://cr.openjdk.java.net/~ccheung/8227646/webrev.01/
test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
I'm assuming that the fc.force(true) calls were there out of an
abundance of caution/paranoia and that there is nothing in the
test that is relying on the data being properly flushed to disk.
Ioi states: "which I think is not necessary in our test"
and I don't see anything obvious in the test is relying on
complete data flushing.
Thumbs up.
Dan
>
> Ran tier1 and 2 tests successfully and repeated tier2 test 10 times on
> a Mac host from which the timeout was observed.
>
> thanks,
>
> Calvin
>
> On 7/16/19 9:09 AM, Calvin Cheung wrote:
>> Hi Ioi,
>>
>> I found 2 fc.force(true) calls in the test. I've removed both and
>> testing it without increasing the timeout value.
>>
>> I test it by running the hotspot_tier2_runtime test group 10 times on
>> mac hosts. Each iteration takes about 30 min. Will let you know about
>> the results.
>>
>> thanks,
>>
>> Calvin
>>
>> On 7/16/19 8:25 AM, Ioi Lam wrote:
>>> HI Calvin,
>>>
>>> Since the test is stuck at here at the timeout:
>>>
>>> at sun.nio.ch.FileDispatcherImpl.force0(java.base at 13-ea/Native Method)
>>> at
>>> sun.nio.ch.FileDispatcherImpl.force(java.base at 13-ea/FileDispatcherImpl.java:82)
>>> at
>>> sun.nio.ch.FileChannelImpl.force(java.base at 13-ea/FileChannelImpl.java:461)
>>> at
>>> SharedArchiveConsistency.writeData(SharedArchiveConsistency.java:166)
>>>
>>> Maybe we should remove the calls to FileChannel.force()? According
>>> to the javadoc, this call is for "ensuring that critical information
>>> is not lost in the event of a system crash", which I think is not
>>> necessary in our test.
>>>
>>> src/java.base/unix/native/libnio/ch/FileDispatcherImpl.c:
>>>
>>> JNIEXPORT jint JNICALL
>>> Java_sun_nio_ch_FileDispatcherImpl_force0(JNIEnv *env, jobject this,
>>> jobject fdo, jboolean md)
>>> {
>>> jint fd = fdval(env, fdo);
>>> int result = 0;
>>>
>>> #ifdef MACOSX
>>> result = fcntl(fd, F_FULLFSYNC);
>>> if (result == -1 && errno == ENOTSUP) {
>>> /* Try fsync() in case F_FULLSYUNC is not implemented on the
>>> file system. */
>>> result = fsync(fd);
>>> }
>>>
>>> https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fcntl.2.html
>>>
>>>
>>> F_FULLFSYNC Does the same thing as fsync(2) then asks
>>> the drive to
>>> flush all buffered data to the permanent
>>> storage
>>> device (arg is ignored). This is currently
>>> implemented
>>> on HFS, MS-DOS (FAT), and Universal Disk Format
>>> (UDF) file systems. The operation may take
>>> quite a
>>> while to complete.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> On 7/16/19 7:59 AM, Calvin Cheung wrote:
>>>> Dan,
>>>>
>>>> Thanks for your review!
>>>>
>>>> On 7/16/19 5:56 AM, Daniel D. Daugherty wrote:
>>>>> On 7/16/19 12:31 AM, Calvin Cheung wrote:
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8227646
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8227646/webrev.00/
>>>>>
>>>>> test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
>>>>> Does the test intentionally crash in one or more of the test
>>>>> cases?
>>>>> If not, then '-XX:-CreateCoredumpOnCrash' is not really needed.
>>>>> I don't think '-XX:-CreateCoredumpOnCrash' will prevent the
>>>>> timeout
>>>>> handling mechanism from trying to capture a core file in the case
>>>>> of a timeout.
>>>> No, the test does not crash intentionally. Thanks for clarifying
>>>> the -XX:-CreateCoredumpOnCrash. I will revert the change.
>>>>>
>>>>> The test currently timed out with a default total timeout
>>>>> value of
>>>>> 480 seconds; that 480 comes from the default timeout value of 120
>>>>> seconds and the default timeout factor of 4 (480 == 120 * 4).
>>>>>
>>>>> The 'timeout=1000' value will get you a total timeout value of
>>>>> 4000.
>>>>> I suspect that is not what you want.
>>>>>
>>>>> If you specify 'timeout=240', you'll get a total timeout value of
>>>>> 960 seconds (240 * 4).
>>>>
>>>> I've seen the total elapsed time for the test got very close to
>>>> 960s. So to be on the safe side, I would set the timeout=300 as
>>>> follows:
>>>>
>>>> diff --git
>>>> a/test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
>>>> b/test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
>>>> --- a/test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
>>>> +++ b/test/hotspot/jtreg/runtime/appcds/SharedArchiveConsistency.java
>>>> @@ -35,7 +35,7 @@
>>>> * @build sun.hotspot.WhiteBox
>>>> * @compile test-classes/Hello.java
>>>> * @run driver ClassFileInstaller sun.hotspot.WhiteBox
>>>> - * @run main/othervm -Xbootclasspath/a:.
>>>> -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI
>>>> SharedArchiveConsistency
>>>> + * @run main/othervm/timeout=300 -Xbootclasspath/a:.
>>>> -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI
>>>> SharedArchiveConsistency
>>>> */
>>>> import jdk.test.lib.process.OutputAnalyzer;
>>>> import jdk.test.lib.Utils;
>>>>
>>>> I will do more testing with the above timeout before pushing the
>>>> change.
>>>>
>>>> Let me know if you'd like to see another webrev.
>>>>
>>>> thanks,
>>>>
>>>> Calvin
>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Increase the timeout to 1000s and add the
>>>>>> -XX:-CreateCoredumpOnCrash option to disable coredump.
>>>>>>
>>>>>> Testing: on 2 macosx hosts on which the timeout was observed.
>>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Calvin
>>>>>>
>>>>>
>>>
More information about the hotspot-runtime-dev
mailing list