RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp

Wed Jul 25 18:00:01 UTC 2018

Looks good to me

--alex

On 07/24/2018 16:23, Chris Plummer wrote:
> Thanks, Serguei.
> 
> I could use one more reviewer.
> 
> thanks,
> 
> Chris
> 
> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote:
>> Chris,
>>
>> Thank you for the explanations.
>> I'm Okay with this webrev as it is.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/24/18 13:55, Chris Plummer wrote:
>>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html
>>>> - log.complain("Redefinition not started. Maybe running with -Xcomp. 
>>>> Test ignored.");
>>>> + log.complain("Redefinition not started. May need more time for 
>>>> -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>>           return false;
>>>>       }
>>>> . . .
>>>> - log.complain("Redefinition not completed.");
>>>> + log.complain("Redefinition not completed. May need more time for 
>>>> -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>> + return false; The complain is not fully correct if this can happen 
>>>> not only with the -Xcomp. Could this message be relaxed a little bit?
>>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm 
>>> not sure how else to word it unless you want me to just say 
>>> "Redefinition not completed".
>>>> Also, just a side comment: The changes above are not that harmless. 
>>>> As the status now is set to TEST_FAILED there is a potential for the 
>>>> tests to start failing where they were passed before.
>>> Yes, that was intentional. It's still the case that you only need the 
>>> fail = 0 change to fix the bug, but having these methods properly 
>>> cause the test to fail is necessary if something were to ever go 
>>> wrong and the redef was not started or completed. Otherwise the test 
>>> would either silently pass (if redef was not started) or just produce 
>>> error messages like it has been when it checks for the proper redef 
>>> (if the redef never completed).
>>>
>>> thanks,
>>>
>>> Chris
>>>> Otherwise, looks good.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/24/18 13:22, Chris Plummer wrote:
>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
>>>>>
>>>>> Since I was removed the "else", there was no need for the "if", so 
>>>>> I removed it also. I had to re-indent the body of the "if" section 
>>>>> because of that. The webrev seems to not call out the whitespace 
>>>>> changes, although I also did a couple of other minor formatting 
>>>>> changes in the code that do show up.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/24/18 12:42 PM, Chris Plummer wrote:
>>>>>> Yes. I'm just retesting first.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> You have my all my comments and I leave it up to you to decide 
>>>>>>> what approach to pick.
>>>>>>> Could you send an updated webrev, please?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 7/24/18 09:27, Chris Plummer wrote:
>>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> I still feel, this fix adds more confusion and complexity.
>>>>>>>>> Let's look at some fragments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html
>>>>>>>>>
>>>>>>>>>   116     if ((strcmp(name, expHSMethod) == 0) &&
>>>>>>>>>   117             (strcmp(sig, expHSSignature) == 0)) {
>>>>>>>>>   118         NSK_DISPLAY0("CompiledMethodLoad: a tested hotspot method found\n");
>>>>>>>>>   119
>>>>>>>>>   120         // CR 6604375: check whether "hot" method was entered
>>>>>>>>>   121         if (enteredHotMethod) {
>>>>>>>>>   122             hsMethodID = method;
>>>>>>>>>   123             fire = 1;
>>>>>>>>>   124         } else {
>>>>>>>>> 125 NSK_DISPLAY0("Compilation occured before method execution\n");
>>>>>>>>> 126 fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>   127         }
>>>>>>>>>   128     }
>>>>>>>>>
>>>>>>>>> I think, the line #126 is not needed.
>>>>>>>>> It just creates a confusion.
>>>>>>>>> The fire == 0 from beginning.
>>>>>>>>> Why do we need it to set to 0 again?
>>>>>>>> Yes, it can be removed. I just didn't give it much thought when 
>>>>>>>> changing the code from -1 to 0.
>>>>>>>>> Is it because it can be already set to 1?
>>>>>>>>> Id so, I'm not sure I understand this code then.
>>>>>>>>>
>>>>>>>>>   187     } while(fire == 0);
>>>>>>>>>   188
>>>>>>>>>   189     NSK_DISPLAY0("agentProc: hotspot method compiled\n\n");
>>>>>>>>>   190
>>>>>>>>>   192     if (fire == 1) {
>>>>>>>>>   . . .
>>>>>>>>>   224     } else {
>>>>>>>>>   225         // fire == -1
>>>>>>>>> 226 // NOTE: This isn't suppose to happen anymore. Hot method 
>>>>>>>>> should always end up being entered.
>>>>>>>>> 227 NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed. 
>>>>>>>>> Don't perform redefinition\n");
>>>>>>>>>   228     }
>>>>>>>>> I don't understand why do we need the check at the line #192.
>>>>>>>>> The variable fire can be only equal to 0 or 1.
>>>>>>>>> The only way out of the loop at the line #187 is if fire == 1.
>>>>>>>>>
>>>>>>>>> Then the else statement at the lines 224-228 confuses even more.
>>>>>>>> The else section can be removed. I left it in as sort of an 
>>>>>>>> assert, but I see now that it just cause confusion.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote:
>>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> If the fix was complicated I would agree, but it really just 
>>>>>>>>>>>> boils down to this one line change:
>>>>>>>>>>>>
>>>>>>>>>>>> -            fire = -1;
>>>>>>>>>>>> +            fire = 0; // Ignore this compilation. Wait for 
>>>>>>>>>>>> next one.
>>>>>>>>>>>
>>>>>>>>>>> It is not obvious that this will completely fix the problem.
>>>>>>>>>>> Is it possible that there will not be next compilation with 
>>>>>>>>>>> the -Xcomp?
>>>>>>>>>> It's only one method that we check for. I don't see why there 
>>>>>>>>>> would be 2nd -Xcomp compilation for it, but even if there was, 
>>>>>>>>>> the test will ignore it just like the first one. It will 
>>>>>>>>>> ignore compilations of the method until the flag has been set 
>>>>>>>>>> indicating the method has been executed once.
>>>>>>>>>
>>>>>>>>>> If for some reason the method is never compiled after being 
>>>>>>>>>> executed once, the test will give up waiting for it (I think 
>>>>>>>>>> after 30 seconds) and produce an error.
>>>>>>>>>
>>>>>>>>> I'm afraid that it is what will always happen with the -Xcomp.
>>>>>>>>> Then there is no point to waist this by waiting for timeout as 
>>>>>>>>> the test will successfully complete without testing anything.
>>>>>>>>> It seems to be not worth this complexity.
>>>>>>>>>
>>>>>>>>> I guess, you would want some extra tracing though. :)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> If it is possible then it is better to explicitly exclude 
>>>>>>>>>>> these tests for -Xcomp.
>>>>>>>>>>> Otherwise, consider this reviewed.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Given that, I see no reason not to increase our test 
>>>>>>>>>>>> coverage by supporting this test during -Xcomp runs.
>>>>>>>>>>>
>>>>>>>>>>> I'd agree if it is going to be stable.
>>>>>>>>>>>
>>>>>>>>>> If problems turn up in the future, we can reconsider disabling 
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would it be more simple to avoid running these tests with 
>>>>>>>>>>>>> -Xcomp?
>>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review the following fix for JDK11:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It fixes the following 3 tests:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with 
>>>>>>>>>>>>>> (followed by a bunch more errors):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Although lately we've only seen this with 
>>>>>>>>>>>>>> redefclass030.java on macosx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after 
>>>>>>>>>>>>>> triggering compilation for it. After the redef some 
>>>>>>>>>>>>>> testing is done to ensure that the redef was done 
>>>>>>>>>>>>>> correctly, but the issue these test have actually comes 
>>>>>>>>>>>>>> before any redef is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test attempts to trigger compilation by calling a hot 
>>>>>>>>>>>>>> method a lot. The agent detects compilation by receiving a 
>>>>>>>>>>>>>> CompiledMethodLoad event. There was an issue discovered 
>>>>>>>>>>>>>> long ago that when -Xcomp is used, the compilation happens 
>>>>>>>>>>>>>> before the "hot" method is ever called. Then the redef 
>>>>>>>>>>>>>> would happen before compilation, and this somehow messed 
>>>>>>>>>>>>>> up the test (I'm not exactly sure how). The fix was to 
>>>>>>>>>>>>>> basically abandon the redef attempt when this problem is 
>>>>>>>>>>>>>> detected, and then supposedly just let the test run to 
>>>>>>>>>>>>>> completion (skipping the actual testing of the redef). 
>>>>>>>>>>>>>> After this change, if you ran with -Xcomp it would pass, 
>>>>>>>>>>>>>> but if you looked in the log you would see:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, there was a bug in the logic to make the test run 
>>>>>>>>>>>>>> to completion, and also causes the above message to not 
>>>>>>>>>>>>>> appear. Instead the test would fail with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Followed by a bunch more error message during the part of 
>>>>>>>>>>>>>> the test that checks if the redef was done properly.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot 
>>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the 
>>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is 
>>>>>>>>>>>>>> set to 1.  The setting of fire = -1 was added to fix the 
>>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the 
>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>>         THREAD_sleep(1);
>>>>>>>>>>>>>>         /* wait for compilation to happen */
>>>>>>>>>>>>>>     } while(fire == 0);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (fire == 1) {
>>>>>>>>>>>>>>         /* do the redef here */
>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< 
>>>>>>>>>>>>>> RedefineClasses() is successfully done\n");
>>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>>         // fire == -1
>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't 
>>>>>>>>>>>>>> executed. Don't perform redefinition\n");
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it 
>>>>>>>>>>>>>> finish up. What the test expects is that 
>>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time out 
>>>>>>>>>>>>>> after two seconds while waiting for fire == 1 (which it 
>>>>>>>>>>>>>> thinks will will always happen because it was set to -1). 
>>>>>>>>>>>>>> When it times out, the test does appear to exit properly 
>>>>>>>>>>>>>> with, but with the following in the log, which is intended:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted() 
>>>>>>>>>>>>>> times out, the hot method is called enough times to 
>>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event 
>>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of this, 
>>>>>>>>>>>>>> waitForRedefinitionStarted() doesn't time out and returns 
>>>>>>>>>>>>>> with an indication that the redef has started. After this 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() is executed. It waits for 
>>>>>>>>>>>>>> the redef to complete, but it never does since the agent 
>>>>>>>>>>>>>> decided not to do the redef when it saw fire == -1. So 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out after 10 seconds 
>>>>>>>>>>>>>> and the test fails, with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually the above error is not really what causes the 
>>>>>>>>>>>>>> failure. When the above error is detected, no error status 
>>>>>>>>>>>>>> is set and the test continues as if the redef had been 
>>>>>>>>>>>>>> done. So then the logic that detects if the redef was done 
>>>>>>>>>>>>>> properly ends up failing, and that's where the test 
>>>>>>>>>>>>>> actually indicates a failure status. You see a whole bunch 
>>>>>>>>>>>>>> of other errors in the log because of all the checks that 
>>>>>>>>>>>>>> fail.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The fix is to not abandon the test when the first 
>>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was 
>>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next 
>>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the 
>>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not 
>>>>>>>>>>>>>> sure why it was not originally done this way. Possibly the 
>>>>>>>>>>>>>> recompilation did not happen reliably, but I have not run 
>>>>>>>>>>>>>> into this problem. The other changes in redefclass030.c 
>>>>>>>>>>>>>> are just cleaning up debug tracing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another fix was to properly set the error status when 
>>>>>>>>>>>>>> waitForRedefinitionStarted() or 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this is 
>>>>>>>>>>>>>> just a safety net and I didn't run into any cases where 
>>>>>>>>>>>>>> this happened after fixing the CompiledMethodLoad event 
>>>>>>>>>>>>>> handling. So in general the changes in redefclass030.java 
>>>>>>>>>>>>>> were not needed, but provide better error handling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>