RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp

Wed Jul 25 18:13:09 UTC 2018

Thanks!

On 7/25/18 11:00 AM, Alex Menkov wrote:
> Looks good to me
>
> --alex
>
> On 07/24/2018 16:23, Chris Plummer wrote:
>> Thanks, Serguei.
>>
>> I could use one more reviewer.
>>
>> thanks,
>>
>> Chris
>>
>> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote:
>>> Chris,
>>>
>>> Thank you for the explanations.
>>> I'm Okay with this webrev as it is.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/24/18 13:55, Chris Plummer wrote:
>>>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote:
>>>>>
>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html 
>>>>>
>>>>> - log.complain("Redefinition not started. Maybe running with 
>>>>> -Xcomp. Test ignored.");
>>>>> + log.complain("Redefinition not started. May need more time for 
>>>>> -Xcomp.");
>>>>> + status = Consts.TEST_FAILED;
>>>>>           return false;
>>>>>       }
>>>>> . . .
>>>>> - log.complain("Redefinition not completed.");
>>>>> + log.complain("Redefinition not completed. May need more time for 
>>>>> -Xcomp.");
>>>>> + status = Consts.TEST_FAILED;
>>>>> + return false; The complain is not fully correct if this can 
>>>>> happen not only with the -Xcomp. Could this message be relaxed a 
>>>>> little bit?
>>>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm 
>>>> not sure how else to word it unless you want me to just say 
>>>> "Redefinition not completed".
>>>>> Also, just a side comment: The changes above are not that 
>>>>> harmless. As the status now is set to TEST_FAILED there is a 
>>>>> potential for the tests to start failing where they were passed 
>>>>> before.
>>>> Yes, that was intentional. It's still the case that you only need 
>>>> the fail = 0 change to fix the bug, but having these methods 
>>>> properly cause the test to fail is necessary if something were to 
>>>> ever go wrong and the redef was not started or completed. Otherwise 
>>>> the test would either silently pass (if redef was not started) or 
>>>> just produce error messages like it has been when it checks for the 
>>>> proper redef (if the redef never completed).
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>> Otherwise, looks good.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 7/24/18 13:22, Chris Plummer wrote:
>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
>>>>>>
>>>>>> Since I was removed the "else", there was no need for the "if", 
>>>>>> so I removed it also. I had to re-indent the body of the "if" 
>>>>>> section because of that. The webrev seems to not call out the 
>>>>>> whitespace changes, although I also did a couple of other minor 
>>>>>> formatting changes in the code that do show up.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/24/18 12:42 PM, Chris Plummer wrote:
>>>>>>> Yes. I'm just retesting first.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> You have my all my comments and I leave it up to you to decide 
>>>>>>>> what approach to pick.
>>>>>>>> Could you send an updated webrev, please?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/24/18 09:27, Chris Plummer wrote:
>>>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> I still feel, this fix adds more confusion and complexity.
>>>>>>>>>> Let's look at some fragments.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   116     if ((strcmp(name, expHSMethod) == 0) &&
>>>>>>>>>>   117             (strcmp(sig, expHSSignature) == 0)) {
>>>>>>>>>>   118         NSK_DISPLAY0("CompiledMethodLoad: a tested 
>>>>>>>>>> hotspot method found\n");
>>>>>>>>>>   119
>>>>>>>>>>   120         // CR 6604375: check whether "hot" method was 
>>>>>>>>>> entered
>>>>>>>>>>   121         if (enteredHotMethod) {
>>>>>>>>>>   122             hsMethodID = method;
>>>>>>>>>>   123             fire = 1;
>>>>>>>>>>   124         } else {
>>>>>>>>>> 125 NSK_DISPLAY0("Compilation occured before method 
>>>>>>>>>> execution\n");
>>>>>>>>>> 126 fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>>   127         }
>>>>>>>>>>   128     }
>>>>>>>>>>
>>>>>>>>>> I think, the line #126 is not needed.
>>>>>>>>>> It just creates a confusion.
>>>>>>>>>> The fire == 0 from beginning.
>>>>>>>>>> Why do we need it to set to 0 again?
>>>>>>>>> Yes, it can be removed. I just didn't give it much thought 
>>>>>>>>> when changing the code from -1 to 0.
>>>>>>>>>> Is it because it can be already set to 1?
>>>>>>>>>> Id so, I'm not sure I understand this code then.
>>>>>>>>>>
>>>>>>>>>>   187     } while(fire == 0);
>>>>>>>>>>   188
>>>>>>>>>>   189     NSK_DISPLAY0("agentProc: hotspot method 
>>>>>>>>>> compiled\n\n");
>>>>>>>>>>   190
>>>>>>>>>>   192     if (fire == 1) {
>>>>>>>>>>   . . .
>>>>>>>>>>   224     } else {
>>>>>>>>>>   225         // fire == -1
>>>>>>>>>> 226 // NOTE: This isn't suppose to happen anymore. Hot method 
>>>>>>>>>> should always end up being entered.
>>>>>>>>>> 227 NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed. 
>>>>>>>>>> Don't perform redefinition\n");
>>>>>>>>>>   228     }
>>>>>>>>>> I don't understand why do we need the check at the line #192.
>>>>>>>>>> The variable fire can be only equal to 0 or 1.
>>>>>>>>>> The only way out of the loop at the line #187 is if fire == 1.
>>>>>>>>>>
>>>>>>>>>> Then the else statement at the lines 224-228 confuses even more.
>>>>>>>>> The else section can be removed. I left it in as sort of an 
>>>>>>>>> assert, but I see now that it just cause confusion.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote:
>>>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote:
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the fix was complicated I would agree, but it really 
>>>>>>>>>>>>> just boils down to this one line change:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -            fire = -1;
>>>>>>>>>>>>> +            fire = 0; // Ignore this compilation. Wait 
>>>>>>>>>>>>> for next one.
>>>>>>>>>>>>
>>>>>>>>>>>> It is not obvious that this will completely fix the problem.
>>>>>>>>>>>> Is it possible that there will not be next compilation with 
>>>>>>>>>>>> the -Xcomp?
>>>>>>>>>>> It's only one method that we check for. I don't see why 
>>>>>>>>>>> there would be 2nd -Xcomp compilation for it, but even if 
>>>>>>>>>>> there was, the test will ignore it just like the first one. 
>>>>>>>>>>> It will ignore compilations of the method until the flag has 
>>>>>>>>>>> been set indicating the method has been executed once.
>>>>>>>>>>
>>>>>>>>>>> If for some reason the method is never compiled after being 
>>>>>>>>>>> executed once, the test will give up waiting for it (I think 
>>>>>>>>>>> after 30 seconds) and produce an error.
>>>>>>>>>>
>>>>>>>>>> I'm afraid that it is what will always happen with the -Xcomp.
>>>>>>>>>> Then there is no point to waist this by waiting for timeout 
>>>>>>>>>> as the test will successfully complete without testing anything.
>>>>>>>>>> It seems to be not worth this complexity.
>>>>>>>>>>
>>>>>>>>>> I guess, you would want some extra tracing though. :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> If it is possible then it is better to explicitly exclude 
>>>>>>>>>>>> these tests for -Xcomp.
>>>>>>>>>>>> Otherwise, consider this reviewed.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Given that, I see no reason not to increase our test 
>>>>>>>>>>>>> coverage by supporting this test during -Xcomp runs.
>>>>>>>>>>>>
>>>>>>>>>>>> I'd agree if it is going to be stable.
>>>>>>>>>>>>
>>>>>>>>>>> If problems turn up in the future, we can reconsider 
>>>>>>>>>>> disabling it.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would it be more simple to avoid running these tests with 
>>>>>>>>>>>>>> -Xcomp?
>>>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review the following fix for JDK11:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It fixes the following 3 tests:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with 
>>>>>>>>>>>>>>> (followed by a bunch more errors):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Although lately we've only seen this with 
>>>>>>>>>>>>>>> redefclass030.java on macosx.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after 
>>>>>>>>>>>>>>> triggering compilation for it. After the redef some 
>>>>>>>>>>>>>>> testing is done to ensure that the redef was done 
>>>>>>>>>>>>>>> correctly, but the issue these test have actually comes 
>>>>>>>>>>>>>>> before any redef is done.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The test attempts to trigger compilation by calling a 
>>>>>>>>>>>>>>> hot method a lot. The agent detects compilation by 
>>>>>>>>>>>>>>> receiving a CompiledMethodLoad event. There was an issue 
>>>>>>>>>>>>>>> discovered long ago that when -Xcomp is used, the 
>>>>>>>>>>>>>>> compilation happens before the "hot" method is ever 
>>>>>>>>>>>>>>> called. Then the redef would happen before compilation, 
>>>>>>>>>>>>>>> and this somehow messed up the test (I'm not exactly 
>>>>>>>>>>>>>>> sure how). The fix was to basically abandon the redef 
>>>>>>>>>>>>>>> attempt when this problem is detected, and then 
>>>>>>>>>>>>>>> supposedly just let the test run to completion (skipping 
>>>>>>>>>>>>>>> the actual testing of the redef). After this change, if 
>>>>>>>>>>>>>>> you ran with -Xcomp it would pass, but if you looked in 
>>>>>>>>>>>>>>> the log you would see:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, there was a bug in the logic to make the test 
>>>>>>>>>>>>>>> run to completion, and also causes the above message to 
>>>>>>>>>>>>>>> not appear. Instead the test would fail with:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Followed by a bunch more error message during the part 
>>>>>>>>>>>>>>> of the test that checks if the redef was done properly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot 
>>>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the 
>>>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is 
>>>>>>>>>>>>>>> set to 1. The setting of fire = -1 was added to fix the 
>>>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the 
>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>>>         THREAD_sleep(1);
>>>>>>>>>>>>>>>         /* wait for compilation to happen */
>>>>>>>>>>>>>>>     } while(fire == 0);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     if (fire == 1) {
>>>>>>>>>>>>>>>         /* do the redef here */
>>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< 
>>>>>>>>>>>>>>> RedefineClasses() is successfully done\n");
>>>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>>>         // fire == -1
>>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't 
>>>>>>>>>>>>>>> executed. Don't perform redefinition\n");
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it 
>>>>>>>>>>>>>>> finish up. What the test expects is that 
>>>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time 
>>>>>>>>>>>>>>> out after two seconds while waiting for fire == 1 (which 
>>>>>>>>>>>>>>> it thinks will will always happen because it was set to 
>>>>>>>>>>>>>>> -1). When it times out, the test does appear to exit 
>>>>>>>>>>>>>>> properly with, but with the following in the log, which 
>>>>>>>>>>>>>>> is intended:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted() 
>>>>>>>>>>>>>>> times out, the hot method is called enough times to 
>>>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event 
>>>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of 
>>>>>>>>>>>>>>> this, waitForRedefinitionStarted() doesn't time out and 
>>>>>>>>>>>>>>> returns with an indication that the redef has started. 
>>>>>>>>>>>>>>> After this waitForRedefinitionCompleted() is executed. 
>>>>>>>>>>>>>>> It waits for the redef to complete, but it never does 
>>>>>>>>>>>>>>> since the agent decided not to do the redef when it saw 
>>>>>>>>>>>>>>> fire == -1. So waitForRedefinitionCompleted() times out 
>>>>>>>>>>>>>>> after 10 seconds and the test fails, with:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Actually the above error is not really what causes the 
>>>>>>>>>>>>>>> failure. When the above error is detected, no error 
>>>>>>>>>>>>>>> status is set and the test continues as if the redef had 
>>>>>>>>>>>>>>> been done. So then the logic that detects if the redef 
>>>>>>>>>>>>>>> was done properly ends up failing, and that's where the 
>>>>>>>>>>>>>>> test actually indicates a failure status. You see a 
>>>>>>>>>>>>>>> whole bunch of other errors in the log because of all 
>>>>>>>>>>>>>>> the checks that fail.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The fix is to not abandon the test when the first 
>>>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was 
>>>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next 
>>>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the 
>>>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not 
>>>>>>>>>>>>>>> sure why it was not originally done this way. Possibly 
>>>>>>>>>>>>>>> the recompilation did not happen reliably, but I have 
>>>>>>>>>>>>>>> not run into this problem. The other changes in 
>>>>>>>>>>>>>>> redefclass030.c are just cleaning up debug tracing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another fix was to properly set the error status when 
>>>>>>>>>>>>>>> waitForRedefinitionStarted() or 
>>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this 
>>>>>>>>>>>>>>> is just a safety net and I didn't run into any cases 
>>>>>>>>>>>>>>> where this happened after fixing the CompiledMethodLoad 
>>>>>>>>>>>>>>> event handling. So in general the changes in 
>>>>>>>>>>>>>>> redefclass030.java were not needed, but provide better 
>>>>>>>>>>>>>>> error handling.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>