RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up
Dmitry Samersoff
dmitry.samersoff at oracle.com
Wed May 21 17:41:25 UTC 2014
Thank you!
On 2014-05-21 20:09, Daniel D. Daugherty wrote:
> Thumbs up on the new version of test...
>
> Dan
>
>
> On 5/21/14 6:09 AM, Dmitry Samersoff wrote:
>> Coleen,
>>
>> Please, see updated webrev:
>>
>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.03/
>>
>> only invoke.c is changed (added -Xint and limited recursion depth for
>> second run).
>>
>>>> It's guaranteed. We always have libjvm mapped after stack of initial
>>>> thread.
>>> We do? Really?
>> Sure! Memory layout *of this test* is always the same. Unless we change
>> JNI_CreateJavaVM() implementation drastically.
>>
>> -Dmitry
>>
>>
>>
>> On 2014-05-20 18:59, Coleen Phillimore wrote:
>>> On 5/20/14, 10:16 AM, Dmitry Samersoff wrote:
>>>> Coleen,
>>>>
>>>>> Is this guaranteed? How many recursions of do_overflow will this
>>>>> take? I still prefer my suggestion of limiting the recursions in the
>>>>> second case of do_overfflow so the test doesn't rely on some
>>>>> environmental setting that may not be true.
>>>> It's guaranteed. We always have libjvm mapped after stack of initial
>>>> thread.
>>> We do? Really?
>>>> Typically we crashes after 136000 recursive calls and it takes 1.071
>>>> sec.
>>>>
>>>> But it's worth to limit recursion in second case. I'll do it.
>>> Yes, please do this!
>>>
>>> Coleen
>>>> -Dmitry
>>>>
>>>>
>>>> On 2014-05-20 17:54, Coleen Phillimore wrote:
>>>>> Dmitry, some comments below.
>>>>>
>>>>> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>>>>>> Coleen,
>>>>>>
>>>>>> Thank you for the review!
>>>>>>
>>>>>> Please see below.
>>>>>>
>>>>>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>>>>>> I have reviewed the test. I'm happy that you are fixing this test
>>>>>>> because it's been really problematic, but I have a couple of
>>>>>>> concerns.
>>>>>>> First the java program overflow doesn't allocate anything on the
>>>>>>> stack
>>>>>>> so when it gets compiled it might take a really long time to
>>>>>>> run. Can
>>>>>>> you pass -Xint in invoke.c in the option string?
>>>>>> Sure. Will do.
>>>>>>
>>>>>>> Second, the second do_overflow() in run_native_overflow must hit
>>>>>>> some
>>>>>>> sort of protected page causing it to terminate, why isn't it
>>>>>>> SIG_ACCERR? Maybe the second do_overflow should only recurse the
>>>>>>> same
>>>>>>> number of times as the first + some small number so it doesn't keep
>>>>>>> running up the stack (until ???)
>>>>>> Both first and second call to do_overflow can:
>>>>>>
>>>>>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>>>>>> case, ERR in second)
>>>>>>
>>>>>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>>>>>> SIG_ACCERR. (ERR in first case, OK in second)
>>>>> Is this guaranteed? How many recursions of do_overflow will this
>>>>> take? I still prefer my suggestion of limiting the recursions in the
>>>>> second case of do_overfflow so the test doesn't rely on some
>>>>> environmental setting that may not be true.
>>>>>> There is a case that is not covered by this test explicitly - if test
>>>>>> guard page was never been installed and stack starts immediately
>>>>>> after
>>>>>> non-stack page it's possible that native do_verflow overwrite whole
>>>>>> non-stack page and hit another protected page and we get SIGSEGV with
>>>>>> SIG_ACCERR.
>>>>>>
>>>>>> Most probably, hotspot would crash in this case.
>>>>>>
>>>>>> Let me know if you like me to check it explicitly. We can read
>>>>>> /proc/self/map and check what page cause SIGSEGV.
>>>>> This seems more complicated and more likely to be affected by the
>>>>> environment, so I think you should not do this for now.
>>>>>
>>>>> thanks,
>>>>> Coleen
>>>>>
>>>>>> -Dmitry
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>>>>>> Dan,
>>>>>>>>
>>>>>>>> Fixed (in-place, press shift-reload)
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>>>
>>>>>>>>> One thing I did notice is that some of the error
>>>>>>>>> messages are
>>>>>>>>> identical which would make it more difficult for someone
>>>>>>>>> investigating a future failure to know which failure
>>>>>>>>> happened:
>>>>>>>> It shouldn't be an issue because a) test prints banner to stdout
>>>>>>>> with
>>>>>>>> the name of the function it's running b) test terminates after
>>>>>>>> first
>>>>>>>> error
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>>>>>> Dan,
>>>>>>>>>>
>>>>>>>>>> Thank you for the review!
>>>>>>>>>>
>>>>>>>>>> 1. I use exit(7) to make the situation when the test fails
>>>>>>>>>> because of
>>>>>>>>>> some error condition clear visible.
>>>>>>>>>>
>>>>>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>>>>>> fails, but
>>>>>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>>>>>
>>>>>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>> I'm good with most of the changes.
>>>>>>>>>
>>>>>>>>> One thing I did notice is that some of the error
>>>>>>>>> messages are
>>>>>>>>> identical which would make it more difficult for someone
>>>>>>>>> investigating a future failure to know which failure
>>>>>>>>> happened:
>>>>>>>>>
>>>>>>>>> 102 fprintf(stderr, "Test ERROR. Can't load class
>>>>>>>>> DoOverflow\n");
>>>>>>>>> 144 printf("Test ERROR. Can't load class
>>>>>>>>> DoOverflow\n");
>>>>>>>>> You can make them unique by including the function
>>>>>>>>> name that generated them
>>>>>>>>>
>>>>>>>>> I just noticed that some error messages still go to
>>>>>>>>> stdout,
>>>>>>>>> e.g., line 144 above.
>>>>>>>>> Looks like this comment from the first round was also
>>>>>>>>> missed
>>>>>>>>> (the line number is now 243):
>>>>>>>>>
>>>>>>>>>> line 225: You'll end up here if no arguments are
>>>>>>>>>> passed or
>>>>>>>>>> if the
>>>>>>>>>> wrong arguments are passed. You should have a better
>>>>>>>>>> usage
>>>>>>>>>> message, e.g.:
>>>>>>>>>>
>>>>>>>>>> void usage() {
>>>>>>>>>> fprintf(stderr, "Usage: invoke
>>>>>>>>>> test_java_overflow\n");
>>>>>>>>>> fprintf(stderr, " invoke
>>>>>>>>>> test_native_overflow\n");
>>>>>>>>>> exit(1);
>>>>>>>>>> }
>>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> test/runtime/6929067/T.java
>>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>> No comments on the deleted files.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>> -Dmitry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> Please, review test changes:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>>>>> You need to check the command status after running
>>>>>>>>>>> gcc_cmd
>>>>>>>>>>> and if it is not zero, then gratuitously pass the test.
>>>>>>>>>>> Jerry made a similar test fix last week:
>>>>>>>>>>>
>>>>>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>>>>>> + echo "Compile failed, Ignoring failed compilation and
>>>>>>>>>>> forcing the
>>>>>>>>>>> test to pass"
>>>>>>>>>>> + exit 0
>>>>>>>>>>> +fi
>>>>>>>>>>>
>>>>>>>>>>> This will catch the case where a 64-bit Linux test
>>>>>>>>>>> machine
>>>>>>>>>>> has a compiler, but not the 32-bit build env.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>>>> line 28: * than detaching from vm thread and overflow
>>>>>>>>>>> stack once
>>>>>>>>>>> again.
>>>>>>>>>>> Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>>>>>
>>>>>>>>>>> line 92: (*_jvm)->AttachCurrentThread(_jvm,
>>>>>>>>>>> (void**)&env,
>>>>>>>>>>> NULL);
>>>>>>>>>>> line 125: (*_jvm)->AttachCurrentThread(_jvm, (void
>>>>>>>>>>> **)&env,
>>>>>>>>>>> NULL);
>>>>>>>>>>> What happens if the JNI AttachCurrentThread() call
>>>>>>>>>>> fails?
>>>>>>>>>>>
>>>>>>>>>>> line 96: printf("Test ERROR. Can't load class
>>>>>>>>>>> DoOverflow\n");
>>>>>>>>>>> line 102: printf("Test ERROR. Can't find method
>>>>>>>>>>> DoOverflow.printIt\n");
>>>>>>>>>>> line 129: printf("Test ERROR. Can't load class
>>>>>>>>>>> DoOverflow\n");
>>>>>>>>>>> line 135: printf("Test ERROR. Can't find method
>>>>>>>>>>> DoOverflow.printAlive\n");
>>>>>>>>>>> line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>>>>> line 224: printf("Test ERROR. Unknown parameter
>>>>>>>>>>> should be
>>>>>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>>>>> Don't errors usually get printed on stderr?
>>>>>>>>>>>
>>>>>>>>>>> line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>>>>>> method_id, NULL);
>>>>>>>>>>> What happens if this call fails? Other than the
>>>>>>>>>>> intentional
>>>>>>>>>>> StackOverflowError, but wait, you catch and swallow
>>>>>>>>>>> that
>>>>>>>>>>> error...
>>>>>>>>>>>
>>>>>>>>>>> line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>>> line 150 (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>>> What happens if this call fails?
>>>>>>>>>>>
>>>>>>>>>>> line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>>>>>> method_id,
>>>>>>>>>>> NULL);
>>>>>>>>>>> What happens if this call fails?
>>>>>>>>>>>
>>>>>>>>>>> line 153: // For non-initial thread we doesn't unmap
>>>>>>>>>>> Typo: "we doesn't unmap"
>>>>>>>>>>> -> "we don't unmap"
>>>>>>>>>>>
>>>>>>>>>>> line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>>>>> Typos, perhaps: same SEGV signal trying to access it
>>>>>>>>>>>
>>>>>>>>>>> line 156: // We have no ways to check this, so bail out,
>>>>>>>>>>> marking
>>>>>>>>>>> test as succeded
>>>>>>>>>>> Typo: 'no ways' -> 'no way'
>>>>>>>>>>> Typo: 'as succeded' -> 'as succeeded'
>>>>>>>>>>>
>>>>>>>>>>> line 157: printf("Test OK. ...
>>>>>>>>>>> line 171: printf("Test OK. ...
>>>>>>>>>>> Should "OK" be "PASSED"?
>>>>>>>>>>>
>>>>>>>>>>> line 168: printf("Test FAILED. Stack guard page is still
>>>>>>>>>>> there\n");
>>>>>>>>>>> Don't failures usually get printed on stderr?
>>>>>>>>>>>
>>>>>>>>>>> line 225: You'll end up here if no arguments are
>>>>>>>>>>> passed or
>>>>>>>>>>> if the
>>>>>>>>>>> wrong arguments are passed. You should have a better
>>>>>>>>>>> usage
>>>>>>>>>>> message, e.g.:
>>>>>>>>>>>
>>>>>>>>>>> void usage() {
>>>>>>>>>>> fprintf(stderr, "Usage: invoke
>>>>>>>>>>> test_java_overflow\n");
>>>>>>>>>>> fprintf(stderr, " invoke
>>>>>>>>>>> test_native_overflow\n");
>>>>>>>>>>> exit(1);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>>>>>
>>>>>>>>>>> Also, why 'return 0' or 'return 1' instead of
>>>>>>>>>>> 'exit(0)' or
>>>>>>>>>>> 'exit(1)'?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/6929067/T.java
>>>>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>>>> OK I can see why you want to combine the two tests.
>>>>>>>>>>> I'm guessing that 'InitialThreadOverflow' was
>>>>>>>>>>> derived from the much older '6929067'...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>>>>>> sophisticated test.
>>>>>>>>>>>>
>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>
>>
>
--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.
More information about the hotspot-runtime-dev
mailing list