RFR(XS): 8165881 - Backout JDK-8164913
Dmitry Samersoff
dmitry.samersoff at oracle.com
Thu Sep 15 11:08:11 UTC 2016
Yasumasa, David,
I'm looking into this issue and come back as soon as I have a clear
picture.
It takes some time. Sorry!
-Dmitry
On 2016-09-15 14:01, Yasumasa Suenaga wrote:
> Hi David,
>
> I agree the call sequence which you write in the case of "jcmd" operation.
> However, we should think about "load" operation in AttachListener.
>
> We can access JvmtiExport::load_agent_library() through two routes:
>
> Route "load": AttachListener -> JvmtiExport::load_agent_library()
> Route "jcmd": AttachListener -> jcmd() in attachListener.cpp ->
> DCmd::parse_and_execute()
>
> "load" sets error code (it means first data in the stream) when
> JvmtiExport::load_agent_library() returns JNI_ERR.
> OTOH "jcmd" sets error code if exception is occurred ONLY.
>
> If we try to load invalid JVMTI agent, "load" writes JNI_ERR to stream,
> however "jcmd" writes JNI_OK because pending exception is not available
> at this point.
> Currently, JCmd.java shows "Command executed successfully" when it
> cannot read the message from the socket. In case of this issue,
> JVMTIAgentLoadDCmd does not write any message because it cannot attach
> the agent.
> If return code which is in the top of stream should mean whether
> communication with AttachListener is succeeded, I think HotSpot should
> write error message or error code to the stream for JCmd.java .
>
> I'm not sure this email is the answer for you.
> Sorry for my English.
>
>
> Yasumasa
>
>
> On 2016/09/15 18:43, David Holmes wrote:
>> Hi Yasumasa,
>>
>> On 15/09/2016 1:56 PM, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>>> If this is done through jcmd then jcmd needs to know how to "unwrap"
>>>> the information it gets back from the Dcmd.
>>>
>>> In case of JVMTI.agent_load dcmd, I think we should be able to get the
>>> response as below:
>>>
>>> 1. Invalid JVMTI agent
>>> 2. Agent_OnAttach() did not return JNI_OK
>>> 3. Succeeded
>>>
>>> Currently, JvmtiExport::load_agent_library() returns JNI_OK to caller,
>>> and jcmd() in attachListener.cpp returns JNI_ERR if exception is
>>> occurred (in Agent_OnAttach()).
>>
>> Can you respond to my comment in the bug report about the chain of
>> calls and explain exactly where you think things are going wrong in
>> this scenario. I'm having a lot of trouble trying to see how the
>> information flows back through the layers.
>>
>>> IMHO, HotSpot should return error code (in top of the stream: written by
>>> AttachListener at first). So we should be able to get some error code in
>>> case 1. and 2. Then the client (jcmd or other methods) can decide to
>>> parse text information in the stream.
>>
>> It returns the high-level response code first "did communication with
>> the target VM succeed", then the actual action response code. That
>> seems the right thing to do to me. It is up to the callers reading the
>> stream to understand how to read it. To me the issue lies somewhere on
>> the jcmd/dcmd side.
>>
>> Thanks,
>> David
>>
>>>
>>> Sorry for my English.
>>>
>>> Yasumasa
>>>
>>>
>>> On 2016/09/14 7:03, David Holmes wrote:
>>>> On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote:
>>>>> Thanks David!
>>>>> If we should not change the meaning of return code from
>>>>> JvmtiExport::load_agent_library(), we should print return code as
>>>>> below:
>>>>> -------------------
>>>>> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp
>>>>> --- a/src/share/vm/prims/jvmtiExport.cpp Mon Sep 12 18:59:13 2016
>>>>> +0000
>>>>> +++ b/src/share/vm/prims/jvmtiExport.cpp Tue Sep 13 21:12:14 2016
>>>>> +0900
>>>>> @@ -2412,6 +2412,10 @@
>>>>> result = JNI_OK;
>>>>> }
>>>>> }
>>>>> + // Print error code if Agent_OnAttach cannot be executed
>>>>> + if (result != JNI_OK) {
>>>>> + st->print_cr("%d", result);
>>>>> + }
>>>>> return result;
>>>>> }
>>>>> -------------------
>>>>
>>>> Not sure I see the point. "return result" will put the error code into
>>>> the socket stream and that error will be seen and responded to. I
>>>> don't expect anything to then read further into the stream to see the
>>>> "result" repeated a second time. In other words if execute() doesn't
>>>> see a zero result then you go no further; if execute sees a zero
>>>> result then the actual action may have had an issue so you proceed to
>>>> read the "action's result".
>>>>
>>>>> It can pass com/sun/tools/attach/BasicTests.java, and we can get -1 if
>>>>> we try to attach invalid agent.
>>>>
>>>> Can you please outline your test case for this again. If this is done
>>>> through jcmd then jcmd needs to know how to "unwrap" the information
>>>> it gets back from the Dcmd.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2016/09/13 17:06, Dmitry Samersoff wrote:
>>>>>> David,
>>>>>>
>>>>>> Thank you for the evaluation.
>>>>>>
>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>> is to
>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>> Though I'm
>>>>>>> not sure how exactly.
>>>>>>
>>>>>> This logic seems completely broken for me. But I don't see
>>>>>> anything we
>>>>>> can do right now - for jdk 9.
>>>>>>
>>>>>> It requires careful changes of both - code and tests.
>>>>>>
>>>>>> I can help with this task but not today.
>>>>>>
>>>>>> -Dmitry
>>>>>>
>>>>>> On 2016-09-13 08:08, David Holmes wrote:
>>>>>>> On 13/09/2016 1:53 PM, David Holmes wrote:
>>>>>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I could reproduce and I added a comment to JBS:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In case of BasicTests.java, I think it is a test bug.
>>>>>>>>
>>>>>>>> I don't agree as yet. I have not been able to isolate the exact
>>>>>>>> difference between what happens with your fix and what happens
>>>>>>>> without.
>>>>>>>> I know it relates to the presence of the error code, but also
>>>>>>>> how we
>>>>>>>> get
>>>>>>>> AgentInitializationException instead of AgentLoadException. I need
>>>>>>>> to be
>>>>>>>> able to run the test outside of jtreg but that is proving to be
>>>>>>>> very
>>>>>>>> difficult to arrange. :(
>>>>>>>
>>>>>>> I finally managed to connect all the pieces on this.
>>>>>>>
>>>>>>> The basic problem is that with the proposed change
>>>>>>> VirtualMachineImpl.execute() will throw AgentLoadException, which
>>>>>>> then
>>>>>>> leads to the InternalError. Without the change we reach this
>>>>>>> block in
>>>>>>> HotspotVirtualMachine.loadAgentLibrary:
>>>>>>>
>>>>>>> int result = readInt(in);
>>>>>>> if (result != 0) {
>>>>>>> throw new AgentInitializationException("Agent_OnAttach failed",
>>>>>>> result);
>>>>>>> }
>>>>>>>
>>>>>>> and so get the expected AgentInitializationException.
>>>>>>>
>>>>>>> Looking at the proposed change in jvmtiExport.cpp if we have the
>>>>>>> original:
>>>>>>>
>>>>>>> st->print_cr("%d", result);
>>>>>>> result = JNI_OK;
>>>>>>>
>>>>>>> then execute() manages to read a zero completion status from the
>>>>>>> stream,
>>>>>>> and then loadAgentLibrary is able to read the actual "result" -
>>>>>>> whether
>>>>>>> zero or not - from the stream. This is because the AttachListener
>>>>>>> code
>>>>>>> calls op->complete(result, &st) where st is the stream where we
>>>>>>> wrote
>>>>>>> the result value above in print_cr. Then if we look at, for example,
>>>>>>> LinuxAttachOperation::complete, we will write "result" to the socket
>>>>>>> first, followed by the contents of st. Hence on a successful
>>>>>>> operation
>>>>>>> we can read 0 for execute, and then 0 for loadAgent. On error we
>>>>>>> read 0
>>>>>>> for execute and the actual error code for loadAgent. With the
>>>>>>> proposed
>>>>>>> change execute() sees the real result (instead of JNI_OK) and so
>>>>>>> throws
>>>>>>> AgentLoadException.
>>>>>>>
>>>>>>> So in summary there are two different error indicators written into
>>>>>>> the
>>>>>>> stream, and we need the first to be zero unless some major error
>>>>>>> with
>>>>>>> the actual attach functionality occurred - otherwise even if the
>>>>>>> "load"
>>>>>>> failed in some other way, it is treated as a secondary error.
>>>>>>>
>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>> is to
>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>> Though I'm
>>>>>>> not sure how exactly.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> If it is accepted, I will upload webrev for this redo task.
>>>>>>>>> However, I cannot access (and fix) Oracle internal test. Can
>>>>>>>>> someone
>>>>>>>>> help me?
>>>>>>>>> (I cannot access JPRT. So I need a sponsor.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2016/09/13 9:00, David Holmes wrote:
>>>>>>>>>> I think I see the problem. The attach framework tries to load the
>>>>>>>>>> "instrument" agent library. Prior to 8164913 if this fails it
>>>>>>>>>> actually
>>>>>>>>>> appears to succeed. Now it fails, and that error propagates
>>>>>>>>>> through.
>>>>>>>>>> I'm guessing, at the moment, that the failure is due to a missing
>>>>>>>>>> module related option.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>
>>>>>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, it has been backouted...
>>>>>>>>>>>>
>>>>>>>>>>>> I agree to David. This error seems to be a test bug.
>>>>>>>>>>>> Can you share the test? I want to evaluate it.
>>>>>>>>>>>
>>>>>>>>>>> Sorry we can't share the tests. I took a quick look and it
>>>>>>>>>>> seems it
>>>>>>>>>>> may
>>>>>>>>>>> be related to the result code parsing (that I thought we
>>>>>>>>>>> ended up
>>>>>>>>>>> not
>>>>>>>>>>> touching!).
>>>>>>>>>>>
>>>>>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null) and
>>>>>>>>>>> see
>>>>>>>>>>> what
>>>>>>>>>>> happens? We see AgentLoadException from the code that internally
>>>>>>>>>>> tries
>>>>>>>>>>> to load the "instrument" agent:
>>>>>>>>>>>
>>>>>>>>>>> Exception in thread "main" Failure: Unexpected exception during
>>>>>>>>>>> test
>>>>>>>>>>> execution: java.lang.InternalError: instrument library is
>>>>>>>>>>> missing in
>>>>>>>>>>> target VM
>>>>>>>>>>> ...
>>>>>>>>>>> Caused by: java.lang.InternalError: instrument library is
>>>>>>>>>>> missing in
>>>>>>>>>>> target VM
>>>>>>>>>>> ...
>>>>>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException: Failed to
>>>>>>>>>>> load
>>>>>>>>>>> agent
>>>>>>>>>>> library
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I do not agree to fix this bug in JDK 10 because VM might lie
>>>>>>>>>>>> when
>>>>>>>>>>>> the
>>>>>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be
>>>>>>>>>>>> fixed
>>>>>>>>>>>> in 9
>>>>>>>>>>>> GA, and we should fix testcase(s).
>>>>>>>>>>>
>>>>>>>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>>>>>>>> appropriate
>>>>>>>>>>> tests before pushing this, which would have discovered the
>>>>>>>>>>> problem -
>>>>>>>>>>> assuming it is actually a problem with the change and not an
>>>>>>>>>>> unrelated
>>>>>>>>>>> issue in our testing environment.
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately I have some high priority tasks to handle right
>>>>>>>>>>> now,
>>>>>>>>>>> so I
>>>>>>>>>>> can't go into this in any more depth at the moment.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>>>>>>>> For the record it looks like the tests involved are broken, in
>>>>>>>>>>>>> that
>>>>>>>>>>>>> because the VM used to lie about the success of attaching the
>>>>>>>>>>>>> agent, the
>>>>>>>>>>>>> tests expected different exceptions to occur.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>>>>>>>> Looks good.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Harold
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>>>>>>>> JDK-8164913, it
>>>>>>>>>>>>>>> failed
>>>>>>>>>>>>>>> several tests in the nightly testing. The failures are
>>>>>>>>>>>>>>> tracked in:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bug:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>
>>>>>>
--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.
More information about the hotspot-dev
mailing list