RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
Alex Menkov
alexey.menkov at oracle.com
Wed Jul 25 18:00:01 UTC 2018
Looks good to me
--alex
On 07/24/2018 16:23, Chris Plummer wrote:
> Thanks, Serguei.
>
> I could use one more reviewer.
>
> thanks,
>
> Chris
>
> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote:
>> Chris,
>>
>> Thank you for the explanations.
>> I'm Okay with this webrev as it is.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/24/18 13:55, Chris Plummer wrote:
>>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html
>>>> - log.complain("Redefinition not started. Maybe running with -Xcomp.
>>>> Test ignored.");
>>>> + log.complain("Redefinition not started. May need more time for
>>>> -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>> return false;
>>>> }
>>>> . . .
>>>> - log.complain("Redefinition not completed.");
>>>> + log.complain("Redefinition not completed. May need more time for
>>>> -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>> + return false; The complain is not fully correct if this can happen
>>>> not only with the -Xcomp. Could this message be relaxed a little bit?
>>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm
>>> not sure how else to word it unless you want me to just say
>>> "Redefinition not completed".
>>>> Also, just a side comment: The changes above are not that harmless.
>>>> As the status now is set to TEST_FAILED there is a potential for the
>>>> tests to start failing where they were passed before.
>>> Yes, that was intentional. It's still the case that you only need the
>>> fail = 0 change to fix the bug, but having these methods properly
>>> cause the test to fail is necessary if something were to ever go
>>> wrong and the redef was not started or completed. Otherwise the test
>>> would either silently pass (if redef was not started) or just produce
>>> error messages like it has been when it checks for the proper redef
>>> (if the redef never completed).
>>>
>>> thanks,
>>>
>>> Chris
>>>> Otherwise, looks good.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/24/18 13:22, Chris Plummer wrote:
>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
>>>>>
>>>>> Since I was removed the "else", there was no need for the "if", so
>>>>> I removed it also. I had to re-indent the body of the "if" section
>>>>> because of that. The webrev seems to not call out the whitespace
>>>>> changes, although I also did a couple of other minor formatting
>>>>> changes in the code that do show up.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/24/18 12:42 PM, Chris Plummer wrote:
>>>>>> Yes. I'm just retesting first.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> You have my all my comments and I leave it up to you to decide
>>>>>>> what approach to pick.
>>>>>>> Could you send an updated webrev, please?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 7/24/18 09:27, Chris Plummer wrote:
>>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> I still feel, this fix adds more confusion and complexity.
>>>>>>>>> Let's look at some fragments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html
>>>>>>>>>
>>>>>>>>> 116 if ((strcmp(name, expHSMethod) == 0) &&
>>>>>>>>> 117 (strcmp(sig, expHSSignature) == 0)) {
>>>>>>>>> 118 NSK_DISPLAY0("CompiledMethodLoad: a tested hotspot method found\n");
>>>>>>>>> 119
>>>>>>>>> 120 // CR 6604375: check whether "hot" method was entered
>>>>>>>>> 121 if (enteredHotMethod) {
>>>>>>>>> 122 hsMethodID = method;
>>>>>>>>> 123 fire = 1;
>>>>>>>>> 124 } else {
>>>>>>>>> 125 NSK_DISPLAY0("Compilation occured before method execution\n");
>>>>>>>>> 126 fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>> 127 }
>>>>>>>>> 128 }
>>>>>>>>>
>>>>>>>>> I think, the line #126 is not needed.
>>>>>>>>> It just creates a confusion.
>>>>>>>>> The fire == 0 from beginning.
>>>>>>>>> Why do we need it to set to 0 again?
>>>>>>>> Yes, it can be removed. I just didn't give it much thought when
>>>>>>>> changing the code from -1 to 0.
>>>>>>>>> Is it because it can be already set to 1?
>>>>>>>>> Id so, I'm not sure I understand this code then.
>>>>>>>>>
>>>>>>>>> 187 } while(fire == 0);
>>>>>>>>> 188
>>>>>>>>> 189 NSK_DISPLAY0("agentProc: hotspot method compiled\n\n");
>>>>>>>>> 190
>>>>>>>>> 192 if (fire == 1) {
>>>>>>>>> . . .
>>>>>>>>> 224 } else {
>>>>>>>>> 225 // fire == -1
>>>>>>>>> 226 // NOTE: This isn't suppose to happen anymore. Hot method
>>>>>>>>> should always end up being entered.
>>>>>>>>> 227 NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed.
>>>>>>>>> Don't perform redefinition\n");
>>>>>>>>> 228 }
>>>>>>>>> I don't understand why do we need the check at the line #192.
>>>>>>>>> The variable fire can be only equal to 0 or 1.
>>>>>>>>> The only way out of the loop at the line #187 is if fire == 1.
>>>>>>>>>
>>>>>>>>> Then the else statement at the lines 224-228 confuses even more.
>>>>>>>> The else section can be removed. I left it in as sort of an
>>>>>>>> assert, but I see now that it just cause confusion.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote:
>>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> If the fix was complicated I would agree, but it really just
>>>>>>>>>>>> boils down to this one line change:
>>>>>>>>>>>>
>>>>>>>>>>>> - fire = -1;
>>>>>>>>>>>> + fire = 0; // Ignore this compilation. Wait for
>>>>>>>>>>>> next one.
>>>>>>>>>>>
>>>>>>>>>>> It is not obvious that this will completely fix the problem.
>>>>>>>>>>> Is it possible that there will not be next compilation with
>>>>>>>>>>> the -Xcomp?
>>>>>>>>>> It's only one method that we check for. I don't see why there
>>>>>>>>>> would be 2nd -Xcomp compilation for it, but even if there was,
>>>>>>>>>> the test will ignore it just like the first one. It will
>>>>>>>>>> ignore compilations of the method until the flag has been set
>>>>>>>>>> indicating the method has been executed once.
>>>>>>>>>
>>>>>>>>>> If for some reason the method is never compiled after being
>>>>>>>>>> executed once, the test will give up waiting for it (I think
>>>>>>>>>> after 30 seconds) and produce an error.
>>>>>>>>>
>>>>>>>>> I'm afraid that it is what will always happen with the -Xcomp.
>>>>>>>>> Then there is no point to waist this by waiting for timeout as
>>>>>>>>> the test will successfully complete without testing anything.
>>>>>>>>> It seems to be not worth this complexity.
>>>>>>>>>
>>>>>>>>> I guess, you would want some extra tracing though. :)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> If it is possible then it is better to explicitly exclude
>>>>>>>>>>> these tests for -Xcomp.
>>>>>>>>>>> Otherwise, consider this reviewed.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Given that, I see no reason not to increase our test
>>>>>>>>>>>> coverage by supporting this test during -Xcomp runs.
>>>>>>>>>>>
>>>>>>>>>>> I'd agree if it is going to be stable.
>>>>>>>>>>>
>>>>>>>>>> If problems turn up in the future, we can reconsider disabling
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would it be more simple to avoid running these tests with
>>>>>>>>>>>>> -Xcomp?
>>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review the following fix for JDK11:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It fixes the following 3 tests:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with
>>>>>>>>>>>>>> (followed by a bunch more errors):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Although lately we've only seen this with
>>>>>>>>>>>>>> redefclass030.java on macosx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after
>>>>>>>>>>>>>> triggering compilation for it. After the redef some
>>>>>>>>>>>>>> testing is done to ensure that the redef was done
>>>>>>>>>>>>>> correctly, but the issue these test have actually comes
>>>>>>>>>>>>>> before any redef is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test attempts to trigger compilation by calling a hot
>>>>>>>>>>>>>> method a lot. The agent detects compilation by receiving a
>>>>>>>>>>>>>> CompiledMethodLoad event. There was an issue discovered
>>>>>>>>>>>>>> long ago that when -Xcomp is used, the compilation happens
>>>>>>>>>>>>>> before the "hot" method is ever called. Then the redef
>>>>>>>>>>>>>> would happen before compilation, and this somehow messed
>>>>>>>>>>>>>> up the test (I'm not exactly sure how). The fix was to
>>>>>>>>>>>>>> basically abandon the redef attempt when this problem is
>>>>>>>>>>>>>> detected, and then supposedly just let the test run to
>>>>>>>>>>>>>> completion (skipping the actual testing of the redef).
>>>>>>>>>>>>>> After this change, if you ran with -Xcomp it would pass,
>>>>>>>>>>>>>> but if you looked in the log you would see:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, there was a bug in the logic to make the test run
>>>>>>>>>>>>>> to completion, and also causes the above message to not
>>>>>>>>>>>>>> appear. Instead the test would fail with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Followed by a bunch more error message during the part of
>>>>>>>>>>>>>> the test that checks if the redef was done properly.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot
>>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the
>>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is
>>>>>>>>>>>>>> set to 1. The setting of fire = -1 was added to fix the
>>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the
>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> do {
>>>>>>>>>>>>>> THREAD_sleep(1);
>>>>>>>>>>>>>> /* wait for compilation to happen */
>>>>>>>>>>>>>> } while(fire == 0);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> if (fire == 1) {
>>>>>>>>>>>>>> /* do the redef here */
>>>>>>>>>>>>>> NSK_DISPLAY0("agentProc: <<<<<<<<
>>>>>>>>>>>>>> RedefineClasses() is successfully done\n");
>>>>>>>>>>>>>> } else {
>>>>>>>>>>>>>> // fire == -1
>>>>>>>>>>>>>> NSK_DISPLAY0("agentProc: \"hot\" method wasn't
>>>>>>>>>>>>>> executed. Don't perform redefinition\n");
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it
>>>>>>>>>>>>>> finish up. What the test expects is that
>>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time out
>>>>>>>>>>>>>> after two seconds while waiting for fire == 1 (which it
>>>>>>>>>>>>>> thinks will will always happen because it was set to -1).
>>>>>>>>>>>>>> When it times out, the test does appear to exit properly
>>>>>>>>>>>>>> with, but with the following in the log, which is intended:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted()
>>>>>>>>>>>>>> times out, the hot method is called enough times to
>>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event
>>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of this,
>>>>>>>>>>>>>> waitForRedefinitionStarted() doesn't time out and returns
>>>>>>>>>>>>>> with an indication that the redef has started. After this
>>>>>>>>>>>>>> waitForRedefinitionCompleted() is executed. It waits for
>>>>>>>>>>>>>> the redef to complete, but it never does since the agent
>>>>>>>>>>>>>> decided not to do the redef when it saw fire == -1. So
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out after 10 seconds
>>>>>>>>>>>>>> and the test fails, with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually the above error is not really what causes the
>>>>>>>>>>>>>> failure. When the above error is detected, no error status
>>>>>>>>>>>>>> is set and the test continues as if the redef had been
>>>>>>>>>>>>>> done. So then the logic that detects if the redef was done
>>>>>>>>>>>>>> properly ends up failing, and that's where the test
>>>>>>>>>>>>>> actually indicates a failure status. You see a whole bunch
>>>>>>>>>>>>>> of other errors in the log because of all the checks that
>>>>>>>>>>>>>> fail.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The fix is to not abandon the test when the first
>>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was
>>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next
>>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the
>>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not
>>>>>>>>>>>>>> sure why it was not originally done this way. Possibly the
>>>>>>>>>>>>>> recompilation did not happen reliably, but I have not run
>>>>>>>>>>>>>> into this problem. The other changes in redefclass030.c
>>>>>>>>>>>>>> are just cleaning up debug tracing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another fix was to properly set the error status when
>>>>>>>>>>>>>> waitForRedefinitionStarted() or
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this is
>>>>>>>>>>>>>> just a safety net and I didn't run into any cases where
>>>>>>>>>>>>>> this happened after fixing the CompiledMethodLoad event
>>>>>>>>>>>>>> handling. So in general the changes in redefclass030.java
>>>>>>>>>>>>>> were not needed, but provide better error handling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the serviceability-dev
mailing list