RFR(XS): 8165881 - Backout JDK-8164913
Dmitry Samersoff
dmitry.samersoff at oracle.com
Tue Sep 13 08:06:45 UTC 2016
David,
Thank you for the evaluation.
> With that in mind I suspect the real fix for the original issue is to
> have Dcmd recognize that it also has to read two "results". Though I'm
> not sure how exactly.
This logic seems completely broken for me. But I don't see anything we
can do right now - for jdk 9.
It requires careful changes of both - code and tests.
I can help with this task but not today.
-Dmitry
On 2016-09-13 08:08, David Holmes wrote:
> On 13/09/2016 1:53 PM, David Holmes wrote:
>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> I could reproduce and I added a comment to JBS:
>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>
>>>
>>>
>>>
>>> In case of BasicTests.java, I think it is a test bug.
>>
>> I don't agree as yet. I have not been able to isolate the exact
>> difference between what happens with your fix and what happens without.
>> I know it relates to the presence of the error code, but also how we get
>> AgentInitializationException instead of AgentLoadException. I need to be
>> able to run the test outside of jtreg but that is proving to be very
>> difficult to arrange. :(
>
> I finally managed to connect all the pieces on this.
>
> The basic problem is that with the proposed change
> VirtualMachineImpl.execute() will throw AgentLoadException, which then
> leads to the InternalError. Without the change we reach this block in
> HotspotVirtualMachine.loadAgentLibrary:
>
> int result = readInt(in);
> if (result != 0) {
> throw new AgentInitializationException("Agent_OnAttach failed", result);
> }
>
> and so get the expected AgentInitializationException.
>
> Looking at the proposed change in jvmtiExport.cpp if we have the original:
>
> st->print_cr("%d", result);
> result = JNI_OK;
>
> then execute() manages to read a zero completion status from the stream,
> and then loadAgentLibrary is able to read the actual "result" - whether
> zero or not - from the stream. This is because the AttachListener code
> calls op->complete(result, &st) where st is the stream where we wrote
> the result value above in print_cr. Then if we look at, for example,
> LinuxAttachOperation::complete, we will write "result" to the socket
> first, followed by the contents of st. Hence on a successful operation
> we can read 0 for execute, and then 0 for loadAgent. On error we read 0
> for execute and the actual error code for loadAgent. With the proposed
> change execute() sees the real result (instead of JNI_OK) and so throws
> AgentLoadException.
>
> So in summary there are two different error indicators written into the
> stream, and we need the first to be zero unless some major error with
> the actual attach functionality occurred - otherwise even if the "load"
> failed in some other way, it is treated as a secondary error.
>
> With that in mind I suspect the real fix for the original issue is to
> have Dcmd recognize that it also has to read two "results". Though I'm
> not sure how exactly.
>
> David
> -----
>
>> David
>>
>>
>>
>>> If it is accepted, I will upload webrev for this redo task.
>>> However, I cannot access (and fix) Oracle internal test. Can someone
>>> help me?
>>> (I cannot access JPRT. So I need a sponsor.)
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2016/09/13 9:00, David Holmes wrote:
>>>> I think I see the problem. The attach framework tries to load the
>>>> "instrument" agent library. Prior to 8164913 if this fails it actually
>>>> appears to succeed. Now it fails, and that error propagates through.
>>>> I'm guessing, at the moment, that the failure is due to a missing
>>>> module related option.
>>>>
>>>> David
>>>>
>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Hmm, it has been backouted...
>>>>>>
>>>>>> I agree to David. This error seems to be a test bug.
>>>>>> Can you share the test? I want to evaluate it.
>>>>>
>>>>> Sorry we can't share the tests. I took a quick look and it seems it
>>>>> may
>>>>> be related to the result code parsing (that I thought we ended up not
>>>>> touching!).
>>>>>
>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null) and see
>>>>> what
>>>>> happens? We see AgentLoadException from the code that internally tries
>>>>> to load the "instrument" agent:
>>>>>
>>>>> Exception in thread "main" Failure: Unexpected exception during test
>>>>> execution: java.lang.InternalError: instrument library is missing in
>>>>> target VM
>>>>> ...
>>>>> Caused by: java.lang.InternalError: instrument library is missing in
>>>>> target VM
>>>>> ...
>>>>> Caused by: com.sun.tools.attach.AgentLoadException: Failed to load
>>>>> agent
>>>>> library
>>>>>
>>>>>
>>>>>> I do not agree to fix this bug in JDK 10 because VM might lie when
>>>>>> the
>>>>>> JVMTI agent could not be attached. IMHO this bug should be fixed in 9
>>>>>> GA, and we should fix testcase(s).
>>>>>
>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>> appropriate
>>>>> tests before pushing this, which would have discovered the problem -
>>>>> assuming it is actually a problem with the change and not an unrelated
>>>>> issue in our testing environment.
>>>>>
>>>>> Unfortunately I have some high priority tasks to handle right now,
>>>>> so I
>>>>> can't go into this in any more depth at the moment.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>> For the record it looks like the tests involved are broken, in that
>>>>>>> because the VM used to lie about the success of attaching the
>>>>>>> agent, the
>>>>>>> tests expected different exceptions to occur.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>> Looks good.
>>>>>>>>
>>>>>>>> Harold
>>>>>>>>
>>>>>>>>
>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>> JDK-8164913, it
>>>>>>>>> failed
>>>>>>>>> several tests in the nightly testing. The failures are tracked in:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Bug:
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Christian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.
More information about the hotspot-dev
mailing list