RFR(XS): 8165881 - Backout JDK-8164913

David Holmes david.holmes at oracle.com
Tue Oct 25 10:30:26 UTC 2016


On 25/10/2016 7:28 PM, Yasumasa Suenaga wrote:
> Hi David,
>
>> The completion status will be 0 because we did manage to communicate
>> with the target VM and process the DCmd request. It passes sis back up
>> to the caller. So looking at JCmd.java I see:
>>
>>            try (InputStream in = hvm.executeJCmd(line);) {
>>                 // read to EOF and just print output
>>                 byte b[] = new byte[256];
>>                 int n;
>>                 boolean messagePrinted = false;
>>                 do {
>>                     n = in.read(b);
>>                     if (n > 0) {
>>                         String s = new String(b, 0, n, "UTF-8");
>>                         System.out.print(s);
>>                         messagePrinted = true;
>>                     }
>>                 } while (n > 0);
>>                 if (!messagePrinted) {
>>                     System.out.println("Command executed successfully");
>>                 }
>>             }
>>
>> and that logic, AFAICS, is not parsing out a return code followed by
>> an error message!
>
> IMHO, JCmd#executeCommandForPid() should not parse return code from DCmd
> because most of DCmds will not return the code.
> For example, HelpDCmd and SystemGCDcmd will not return it.

I can't look at this at the moment, but there is a wire protocol here 
that needs to be followed, and if it isn't followed then that needs to 
be fixed. If all of Dcmd needs some updates then so be it.

David

PS. I am about to travel and will be offline till Wednesday afternoon US 
PST.

> If we should parse return code in JCmd.java, I guess we have to add
> return code to all
> DCmds. IMHO, they should be fixed on another issue.
>
> Thanks,
>
> Yasumasa
>
>
>
> On 2016/10/25 12:41, David Holmes wrote:
>> On 24/10/2016 8:24 PM, Yasumasa Suenaga wrote:
>>> Hi Dmitry, David,
>>>
>>> I want to resume to work for this issue because this issue has been
>>> marked P3.
>>> Dmitry, do you have any idea for it?
>>>
>>> IMHO, we can fix as below:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8165869/webrev.00/
>>>
>>> According to David, JvmtiExport::load_agent_library() should return zero
>>> when the operation which is kicked by AttachListener is succeeded.
>>> AttachListener will write response code from
>>> JvmtiExport::load_agent_library() at first, and the client (e.g.
>>> HotSpotVirtualMachine.java) checks it.
>>>
>>> Thus I keep current implementation about return code, and I added the
>>> code to send error message to the client.
>>> It works fine with jdk/test/com/sun/tools/attach/BasicTests.java .
>>>
>>> What do you think about it?
>>
>> Sorry but I still see the problem as being in the JCmd code, as per my
>> previous email on the subject. [1]
>>
>> David
>> -----
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-September/024546.html
>>
>>
>>
>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2016/09/15 20:08, Dmitry Samersoff wrote:
>>>> Yasumasa, David,
>>>>
>>>> I'm looking into this issue and come back as soon as I have a clear
>>>> picture.
>>>>
>>>> It takes some time. Sorry!
>>>>
>>>> -Dmitry
>>>>
>>>> On 2016-09-15 14:01, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> I agree the call sequence which you write in the case of "jcmd"
>>>>> operation.
>>>>> However, we should think about "load" operation in AttachListener.
>>>>>
>>>>> We can access JvmtiExport::load_agent_library() through two routes:
>>>>>
>>>>>  Route "load": AttachListener -> JvmtiExport::load_agent_library()
>>>>>  Route "jcmd": AttachListener -> jcmd() in attachListener.cpp ->
>>>>> DCmd::parse_and_execute()
>>>>>
>>>>> "load" sets error code (it means first data in the stream) when
>>>>> JvmtiExport::load_agent_library() returns JNI_ERR.
>>>>> OTOH "jcmd" sets error code if exception is occurred ONLY.
>>>>>
>>>>> If we try to load invalid JVMTI agent, "load" writes JNI_ERR to
>>>>> stream,
>>>>> however "jcmd" writes JNI_OK because pending exception is not
>>>>> available
>>>>> at this point.
>>>>> Currently, JCmd.java shows "Command executed successfully" when it
>>>>> cannot read the message from the socket. In case of this issue,
>>>>> JVMTIAgentLoadDCmd does not write any message because it cannot attach
>>>>> the agent.
>>>>> If return code which is in the top of stream should mean whether
>>>>> communication with AttachListener is succeeded, I think HotSpot should
>>>>> write error message or error code to the stream for JCmd.java .
>>>>>
>>>>> I'm not sure this email is the answer for you.
>>>>> Sorry for my English.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2016/09/15 18:43, David Holmes wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> On 15/09/2016 1:56 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> If this is done through jcmd then jcmd needs to know how to
>>>>>>>> "unwrap"
>>>>>>>> the information it gets back from the Dcmd.
>>>>>>>
>>>>>>> In case of JVMTI.agent_load dcmd, I think we should be able to
>>>>>>> get the
>>>>>>> response as below:
>>>>>>>
>>>>>>>   1. Invalid JVMTI agent
>>>>>>>   2. Agent_OnAttach() did not return JNI_OK
>>>>>>>   3. Succeeded
>>>>>>>
>>>>>>> Currently, JvmtiExport::load_agent_library() returns JNI_OK to
>>>>>>> caller,
>>>>>>> and jcmd() in attachListener.cpp returns JNI_ERR if exception is
>>>>>>> occurred (in Agent_OnAttach()).
>>>>>>
>>>>>> Can you respond to my comment in the bug report about the chain of
>>>>>> calls and explain exactly where you think things are going wrong in
>>>>>> this scenario. I'm having a lot of trouble trying to see how the
>>>>>> information flows back through the layers.
>>>>>>
>>>>>>> IMHO, HotSpot should return error code (in top of the stream:
>>>>>>> written by
>>>>>>> AttachListener at first). So we should be able to get some error
>>>>>>> code in
>>>>>>> case 1. and 2. Then the client (jcmd or other methods) can decide to
>>>>>>> parse text information in the stream.
>>>>>>
>>>>>> It returns the high-level response code first "did communication with
>>>>>> the target VM succeed", then the actual action response code. That
>>>>>> seems the right thing to do to me. It is up to the callers reading
>>>>>> the
>>>>>> stream to understand how to read it. To me the issue lies
>>>>>> somewhere on
>>>>>> the jcmd/dcmd side.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> Sorry for my English.
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2016/09/14 7:03, David Holmes wrote:
>>>>>>>> On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote:
>>>>>>>>> Thanks David!
>>>>>>>>> If we should not change the meaning of return code from
>>>>>>>>> JvmtiExport::load_agent_library(), we should print return code as
>>>>>>>>> below:
>>>>>>>>> -------------------
>>>>>>>>> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp
>>>>>>>>> --- a/src/share/vm/prims/jvmtiExport.cpp    Mon Sep 12 18:59:13
>>>>>>>>> 2016
>>>>>>>>> +0000
>>>>>>>>> +++ b/src/share/vm/prims/jvmtiExport.cpp    Tue Sep 13 21:12:14
>>>>>>>>> 2016
>>>>>>>>> +0900
>>>>>>>>> @@ -2412,6 +2412,10 @@
>>>>>>>>>        result = JNI_OK;
>>>>>>>>>      }
>>>>>>>>>    }
>>>>>>>>> + // Print error code if Agent_OnAttach cannot be executed
>>>>>>>>> + if (result != JNI_OK) {
>>>>>>>>> + st->print_cr("%d", result);
>>>>>>>>> + }
>>>>>>>>>    return result;
>>>>>>>>>  }
>>>>>>>>> -------------------
>>>>>>>>
>>>>>>>> Not sure I see the point. "return result" will put the error code
>>>>>>>> into
>>>>>>>> the socket stream and that error will be seen and responded to. I
>>>>>>>> don't expect anything to then read further into the stream to
>>>>>>>> see the
>>>>>>>> "result" repeated a second time. In other words if execute()
>>>>>>>> doesn't
>>>>>>>> see a zero result then you go no further; if execute sees a zero
>>>>>>>> result then the actual action may have had an issue so you
>>>>>>>> proceed to
>>>>>>>> read the "action's result".
>>>>>>>>
>>>>>>>>> It can pass com/sun/tools/attach/BasicTests.java, and we can get
>>>>>>>>> -1 if
>>>>>>>>> we try to attach invalid agent.
>>>>>>>>
>>>>>>>> Can you please outline your test case for this again. If this is
>>>>>>>> done
>>>>>>>> through jcmd then jcmd needs to know how to "unwrap" the
>>>>>>>> information
>>>>>>>> it gets back from the Dcmd.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2016/09/13 17:06, Dmitry Samersoff wrote:
>>>>>>>>>> David,
>>>>>>>>>>
>>>>>>>>>> Thank you for the evaluation.
>>>>>>>>>>
>>>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>>>> is to
>>>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>>>> Though I'm
>>>>>>>>>>> not sure how exactly.
>>>>>>>>>>
>>>>>>>>>> This logic seems completely broken for me. But I don't see
>>>>>>>>>> anything we
>>>>>>>>>> can do right now - for jdk 9.
>>>>>>>>>>
>>>>>>>>>> It requires careful changes of both - code and tests.
>>>>>>>>>>
>>>>>>>>>> I can help with this task but not today.
>>>>>>>>>>
>>>>>>>>>> -Dmitry
>>>>>>>>>>
>>>>>>>>>> On 2016-09-13 08:08, David Holmes wrote:
>>>>>>>>>>> On 13/09/2016 1:53 PM, David Holmes wrote:
>>>>>>>>>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I could reproduce and I added a comment to JBS:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In case of BasicTests.java, I think it is a test bug.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't agree as yet. I have not been able to isolate the exact
>>>>>>>>>>>> difference between what happens with your fix and what happens
>>>>>>>>>>>> without.
>>>>>>>>>>>> I know it relates to the presence of the error code, but also
>>>>>>>>>>>> how we
>>>>>>>>>>>> get
>>>>>>>>>>>> AgentInitializationException instead of AgentLoadException. I
>>>>>>>>>>>> need
>>>>>>>>>>>> to be
>>>>>>>>>>>> able to run the test outside of jtreg but that is proving to be
>>>>>>>>>>>> very
>>>>>>>>>>>> difficult to arrange. :(
>>>>>>>>>>>
>>>>>>>>>>> I finally managed to connect all the pieces on this.
>>>>>>>>>>>
>>>>>>>>>>> The basic problem is that with the proposed change
>>>>>>>>>>> VirtualMachineImpl.execute() will throw AgentLoadException,
>>>>>>>>>>> which
>>>>>>>>>>> then
>>>>>>>>>>> leads to the InternalError. Without the change we reach this
>>>>>>>>>>> block in
>>>>>>>>>>> HotspotVirtualMachine.loadAgentLibrary:
>>>>>>>>>>>
>>>>>>>>>>> int result = readInt(in);
>>>>>>>>>>> if (result != 0) {
>>>>>>>>>>>    throw new AgentInitializationException("Agent_OnAttach
>>>>>>>>>>> failed",
>>>>>>>>>>> result);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> and so get the expected AgentInitializationException.
>>>>>>>>>>>
>>>>>>>>>>> Looking at the proposed change in jvmtiExport.cpp if we have the
>>>>>>>>>>> original:
>>>>>>>>>>>
>>>>>>>>>>>       st->print_cr("%d", result);
>>>>>>>>>>>       result = JNI_OK;
>>>>>>>>>>>
>>>>>>>>>>> then execute() manages to read a zero completion status from the
>>>>>>>>>>> stream,
>>>>>>>>>>> and then loadAgentLibrary is able to read the actual "result" -
>>>>>>>>>>> whether
>>>>>>>>>>> zero or not - from the stream. This is because the
>>>>>>>>>>> AttachListener
>>>>>>>>>>> code
>>>>>>>>>>> calls op->complete(result, &st) where st is the stream where we
>>>>>>>>>>> wrote
>>>>>>>>>>> the result value above in print_cr. Then if we look at, for
>>>>>>>>>>> example,
>>>>>>>>>>> LinuxAttachOperation::complete, we will write "result" to the
>>>>>>>>>>> socket
>>>>>>>>>>> first, followed by the contents of st. Hence on a successful
>>>>>>>>>>> operation
>>>>>>>>>>> we can read 0 for execute, and then 0 for loadAgent. On error we
>>>>>>>>>>> read 0
>>>>>>>>>>> for execute and the actual error code for loadAgent. With the
>>>>>>>>>>> proposed
>>>>>>>>>>> change execute() sees the real result (instead of JNI_OK) and so
>>>>>>>>>>> throws
>>>>>>>>>>> AgentLoadException.
>>>>>>>>>>>
>>>>>>>>>>> So in summary there are two different error indicators written
>>>>>>>>>>> into
>>>>>>>>>>> the
>>>>>>>>>>> stream, and we need the first to be zero unless some major error
>>>>>>>>>>> with
>>>>>>>>>>> the actual attach functionality occurred - otherwise even if the
>>>>>>>>>>> "load"
>>>>>>>>>>> failed in some other way, it is treated as a secondary error.
>>>>>>>>>>>
>>>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>>>> is to
>>>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>>>> Though I'm
>>>>>>>>>>> not sure how exactly.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> If it is accepted, I will upload webrev for this redo task.
>>>>>>>>>>>>> However, I cannot access (and fix) Oracle internal test. Can
>>>>>>>>>>>>> someone
>>>>>>>>>>>>> help me?
>>>>>>>>>>>>> (I cannot access JPRT. So I need a sponsor.)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2016/09/13 9:00, David Holmes wrote:
>>>>>>>>>>>>>> I think I see the problem. The attach framework tries to
>>>>>>>>>>>>>> load the
>>>>>>>>>>>>>> "instrument" agent library. Prior to 8164913 if this fails it
>>>>>>>>>>>>>> actually
>>>>>>>>>>>>>> appears to succeed. Now it fails, and that error propagates
>>>>>>>>>>>>>> through.
>>>>>>>>>>>>>> I'm guessing, at the moment, that the failure is due to a
>>>>>>>>>>>>>> missing
>>>>>>>>>>>>>> module related option.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hmm, it has been backouted...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I agree to David. This error seems to be a test bug.
>>>>>>>>>>>>>>>> Can you share the test? I want to evaluate it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sorry we can't share the tests. I took a quick look and it
>>>>>>>>>>>>>>> seems it
>>>>>>>>>>>>>>> may
>>>>>>>>>>>>>>> be related to the result code parsing (that I thought we
>>>>>>>>>>>>>>> ended up
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>> touching!).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null)
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>> what
>>>>>>>>>>>>>>> happens? We see AgentLoadException from the code that
>>>>>>>>>>>>>>> internally
>>>>>>>>>>>>>>> tries
>>>>>>>>>>>>>>> to load the "instrument" agent:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Exception in thread "main" Failure: Unexpected exception
>>>>>>>>>>>>>>> during
>>>>>>>>>>>>>>> test
>>>>>>>>>>>>>>> execution: java.lang.InternalError: instrument library is
>>>>>>>>>>>>>>> missing in
>>>>>>>>>>>>>>> target VM
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>> Caused by: java.lang.InternalError: instrument library is
>>>>>>>>>>>>>>> missing in
>>>>>>>>>>>>>>> target VM
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException:
>>>>>>>>>>>>>>> Failed to
>>>>>>>>>>>>>>> load
>>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I do not agree to fix this bug in JDK 10 because VM
>>>>>>>>>>>>>>>> might lie
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be
>>>>>>>>>>>>>>>> fixed
>>>>>>>>>>>>>>>> in 9
>>>>>>>>>>>>>>>> GA, and we should fix testcase(s).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>>>>>>>>>>>> appropriate
>>>>>>>>>>>>>>> tests before pushing this, which would have discovered the
>>>>>>>>>>>>>>> problem -
>>>>>>>>>>>>>>> assuming it is actually a problem with the change and not an
>>>>>>>>>>>>>>> unrelated
>>>>>>>>>>>>>>> issue in our testing environment.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Unfortunately I have some high priority tasks to handle
>>>>>>>>>>>>>>> right
>>>>>>>>>>>>>>> now,
>>>>>>>>>>>>>>> so I
>>>>>>>>>>>>>>> can't go into this in any more depth at the moment.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>>>>>>>>>>>> For the record it looks like the tests involved are
>>>>>>>>>>>>>>>>> broken, in
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> because the VM used to lie about the success of attaching
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> agent, the
>>>>>>>>>>>>>>>>> tests expected different exceptions to occur.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>>>>>>>>>>>> Looks good.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Harold
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>>>>>>>>>>>> JDK-8164913, it
>>>>>>>>>>>>>>>>>>> failed
>>>>>>>>>>>>>>>>>>> several tests in the nightly testing. The failures are
>>>>>>>>>>>>>>>>>>> tracked in:
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Bug:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>
>>>>


More information about the hotspot-dev mailing list