RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up
Coleen Phillimore
coleen.phillimore at oracle.com
Tue May 20 14:59:59 UTC 2014
On 5/20/14, 10:16 AM, Dmitry Samersoff wrote:
> Coleen,
>
>> Is this guaranteed? How many recursions of do_overflow will this
>> take? I still prefer my suggestion of limiting the recursions in the
>> second case of do_overfflow so the test doesn't rely on some
>> environmental setting that may not be true.
> It's guaranteed. We always have libjvm mapped after stack of initial thread.
We do? Really?
>
> Typically we crashes after 136000 recursive calls and it takes 1.071 sec.
>
> But it's worth to limit recursion in second case. I'll do it.
Yes, please do this!
Coleen
>
> -Dmitry
>
>
> On 2014-05-20 17:54, Coleen Phillimore wrote:
>> Dmitry, some comments below.
>>
>> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>>> Coleen,
>>>
>>> Thank you for the review!
>>>
>>> Please see below.
>>>
>>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>>> I have reviewed the test. I'm happy that you are fixing this test
>>>> because it's been really problematic, but I have a couple of concerns.
>>>> First the java program overflow doesn't allocate anything on the stack
>>>> so when it gets compiled it might take a really long time to run. Can
>>>> you pass -Xint in invoke.c in the option string?
>>> Sure. Will do.
>>>
>>>> Second, the second do_overflow() in run_native_overflow must hit some
>>>> sort of protected page causing it to terminate, why isn't it
>>>> SIG_ACCERR? Maybe the second do_overflow should only recurse the same
>>>> number of times as the first + some small number so it doesn't keep
>>>> running up the stack (until ???)
>>> Both first and second call to do_overflow can:
>>>
>>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>>> case, ERR in second)
>>>
>>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>>> SIG_ACCERR. (ERR in first case, OK in second)
>> Is this guaranteed? How many recursions of do_overflow will this
>> take? I still prefer my suggestion of limiting the recursions in the
>> second case of do_overfflow so the test doesn't rely on some
>> environmental setting that may not be true.
>>> There is a case that is not covered by this test explicitly - if test
>>> guard page was never been installed and stack starts immediately after
>>> non-stack page it's possible that native do_verflow overwrite whole
>>> non-stack page and hit another protected page and we get SIGSEGV with
>>> SIG_ACCERR.
>>>
>>> Most probably, hotspot would crash in this case.
>>>
>>> Let me know if you like me to check it explicitly. We can read
>>> /proc/self/map and check what page cause SIGSEGV.
>> This seems more complicated and more likely to be affected by the
>> environment, so I think you should not do this for now.
>>
>> thanks,
>> Coleen
>>
>>> -Dmitry
>>>
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>>> Dan,
>>>>>
>>>>> Fixed (in-place, press shift-reload)
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>
>>>>>> One thing I did notice is that some of the error messages are
>>>>>> identical which would make it more difficult for someone
>>>>>> investigating a future failure to know which failure happened:
>>>>> It shouldn't be an issue because a) test prints banner to stdout with
>>>>> the name of the function it's running b) test terminates after first
>>>>> error
>>>>>
>>>>> -Dmitry
>>>>>
>>>>>
>>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>>> Dan,
>>>>>>>
>>>>>>> Thank you for the review!
>>>>>>>
>>>>>>> 1. I use exit(7) to make the situation when the test fails because of
>>>>>>> some error condition clear visible.
>>>>>>>
>>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>>> fails, but
>>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>>
>>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>> I'm good with most of the changes.
>>>>>>
>>>>>> One thing I did notice is that some of the error messages are
>>>>>> identical which would make it more difficult for someone
>>>>>> investigating a future failure to know which failure happened:
>>>>>>
>>>>>> 102 fprintf(stderr, "Test ERROR. Can't load class
>>>>>> DoOverflow\n");
>>>>>> 144 printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>> You can make them unique by including the function
>>>>>> name that generated them
>>>>>>
>>>>>> I just noticed that some error messages still go to stdout,
>>>>>> e.g., line 144 above.
>>>>>> Looks like this comment from the first round was also missed
>>>>>> (the line number is now 243):
>>>>>>
>>>>>>> line 225: You'll end up here if no arguments are passed or
>>>>>>> if the
>>>>>>> wrong arguments are passed. You should have a better usage
>>>>>>> message, e.g.:
>>>>>>>
>>>>>>> void usage() {
>>>>>>> fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>> fprintf(stderr, " invoke test_native_overflow\n");
>>>>>>> exit(1);
>>>>>>> }
>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/6929067/T.java
>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>> test/runtime/6929067/invoke.c
>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>> No comments on the deleted files.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>>> Hi Everyone,
>>>>>>>>>
>>>>>>>>> Please, review test changes:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>> You need to check the command status after running gcc_cmd
>>>>>>>> and if it is not zero, then gratuitously pass the test.
>>>>>>>> Jerry made a similar test fix last week:
>>>>>>>>
>>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>>> + echo "Compile failed, Ignoring failed compilation and
>>>>>>>> forcing the
>>>>>>>> test to pass"
>>>>>>>> + exit 0
>>>>>>>> +fi
>>>>>>>>
>>>>>>>> This will catch the case where a 64-bit Linux test machine
>>>>>>>> has a compiler, but not the 32-bit build env.
>>>>>>>>
>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>> line 28: * than detaching from vm thread and overflow
>>>>>>>> stack once
>>>>>>>> again.
>>>>>>>> Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>>
>>>>>>>> line 92: (*_jvm)->AttachCurrentThread(_jvm, (void**)&env,
>>>>>>>> NULL);
>>>>>>>> line 125: (*_jvm)->AttachCurrentThread(_jvm, (void **)&env,
>>>>>>>> NULL);
>>>>>>>> What happens if the JNI AttachCurrentThread() call fails?
>>>>>>>>
>>>>>>>> line 96: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>> line 102: printf("Test ERROR. Can't find method
>>>>>>>> DoOverflow.printIt\n");
>>>>>>>> line 129: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>> line 135: printf("Test ERROR. Can't find method
>>>>>>>> DoOverflow.printAlive\n");
>>>>>>>> line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>> line 224: printf("Test ERROR. Unknown parameter should be
>>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>> Don't errors usually get printed on stderr?
>>>>>>>>
>>>>>>>> line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>>> method_id, NULL);
>>>>>>>> What happens if this call fails? Other than the
>>>>>>>> intentional
>>>>>>>> StackOverflowError, but wait, you catch and swallow that
>>>>>>>> error...
>>>>>>>>
>>>>>>>> line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>> line 150 (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>> What happens if this call fails?
>>>>>>>>
>>>>>>>> line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>>> method_id,
>>>>>>>> NULL);
>>>>>>>> What happens if this call fails?
>>>>>>>>
>>>>>>>> line 153: // For non-initial thread we doesn't unmap
>>>>>>>> Typo: "we doesn't unmap"
>>>>>>>> -> "we don't unmap"
>>>>>>>>
>>>>>>>> line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>> Typos, perhaps: same SEGV signal trying to access it
>>>>>>>>
>>>>>>>> line 156: // We have no ways to check this, so bail out,
>>>>>>>> marking
>>>>>>>> test as succeded
>>>>>>>> Typo: 'no ways' -> 'no way'
>>>>>>>> Typo: 'as succeded' -> 'as succeeded'
>>>>>>>>
>>>>>>>> line 157: printf("Test OK. ...
>>>>>>>> line 171: printf("Test OK. ...
>>>>>>>> Should "OK" be "PASSED"?
>>>>>>>>
>>>>>>>> line 168: printf("Test FAILED. Stack guard page is still
>>>>>>>> there\n");
>>>>>>>> Don't failures usually get printed on stderr?
>>>>>>>>
>>>>>>>> line 225: You'll end up here if no arguments are passed or
>>>>>>>> if the
>>>>>>>> wrong arguments are passed. You should have a better usage
>>>>>>>> message, e.g.:
>>>>>>>>
>>>>>>>> void usage() {
>>>>>>>> fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>> fprintf(stderr, " invoke test_native_overflow\n");
>>>>>>>> exit(1);
>>>>>>>> }
>>>>>>>>
>>>>>>>> I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>>
>>>>>>>> Also, why 'return 0' or 'return 1' instead of 'exit(0)' or
>>>>>>>> 'exit(1)'?
>>>>>>>>
>>>>>>>>
>>>>>>>> test/runtime/6929067/T.java
>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>> OK I can see why you want to combine the two tests.
>>>>>>>> I'm guessing that 'InitialThreadOverflow' was
>>>>>>>> derived from the much older '6929067'...
>>>>>>>>
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>>> sophisticated test.
>>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>
More information about the hotspot-runtime-dev
mailing list