RFR(XS): 8165881 - Backout JDK-8164913

Yasumasa Suenaga yasuenag at gmail.com
Tue Oct 25 09:29:45 UTC 2016


Hi Dmitry,

> It's hard to fix this issue without access to Oracle testing infrastructure.

Agree.
I'm not an Oracle employee. Thus I cannot access it.

Should I change the Assignee on JBS (JDK-8165869) to you?


Thanks,

Yasumasa


On 2016/10/25 15:09, Dmitry Samersoff wrote:
> Yasumasa,
>
> It's hard to fix this issue without access to Oracle testing infrastructure.
>
> I'm working on the issue but unfortunately I was caught by some internal
> problems so it progresses slowly than I expected.
>
> I'm sorry about it.
>
> -Dmitry
>
>
> On 2016-10-24 13:24, Yasumasa Suenaga wrote:
>> Hi Dmitry, David,
>>
>> I want to resume to work for this issue because this issue has been
>> marked P3.
>> Dmitry, do you have any idea for it?
>>
>> IMHO, we can fix as below:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8165869/webrev.00/
>>
>> According to David, JvmtiExport::load_agent_library() should return zero
>> when the operation which is kicked by AttachListener is succeeded.
>> AttachListener will write response code from
>> JvmtiExport::load_agent_library() at first, and the client (e.g.
>> HotSpotVirtualMachine.java) checks it.
>>
>> Thus I keep current implementation about return code, and I added the
>> code to send error message to the client.
>> It works fine with jdk/test/com/sun/tools/attach/BasicTests.java .
>>
>> What do you think about it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2016/09/15 20:08, Dmitry Samersoff wrote:
>>> Yasumasa, David,
>>>
>>> I'm looking into this issue and come back as soon as I have a clear
>>> picture.
>>>
>>> It takes some time. Sorry!
>>>
>>> -Dmitry
>>>
>>> On 2016-09-15 14:01, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> I agree the call sequence which you write in the case of "jcmd"
>>>> operation.
>>>> However, we should think about "load" operation in AttachListener.
>>>>
>>>> We can access JvmtiExport::load_agent_library() through two routes:
>>>>
>>>>  Route "load": AttachListener -> JvmtiExport::load_agent_library()
>>>>  Route "jcmd": AttachListener -> jcmd() in attachListener.cpp ->
>>>> DCmd::parse_and_execute()
>>>>
>>>> "load" sets error code (it means first data in the stream) when
>>>> JvmtiExport::load_agent_library() returns JNI_ERR.
>>>> OTOH "jcmd" sets error code if exception is occurred ONLY.
>>>>
>>>> If we try to load invalid JVMTI agent, "load" writes JNI_ERR to stream,
>>>> however "jcmd" writes JNI_OK because pending exception is not available
>>>> at this point.
>>>> Currently, JCmd.java shows "Command executed successfully" when it
>>>> cannot read the message from the socket. In case of this issue,
>>>> JVMTIAgentLoadDCmd does not write any message because it cannot attach
>>>> the agent.
>>>> If return code which is in the top of stream should mean whether
>>>> communication with AttachListener is succeeded, I think HotSpot should
>>>> write error message or error code to the stream for JCmd.java .
>>>>
>>>> I'm not sure this email is the answer for you.
>>>> Sorry for my English.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2016/09/15 18:43, David Holmes wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> On 15/09/2016 1:56 PM, Yasumasa Suenaga wrote:
>>>>>> Hi,
>>>>>>
>>>>>>> If this is done through jcmd then jcmd needs to know how to "unwrap"
>>>>>>> the information it gets back from the Dcmd.
>>>>>>
>>>>>> In case of JVMTI.agent_load dcmd, I think we should be able to get the
>>>>>> response as below:
>>>>>>
>>>>>>   1. Invalid JVMTI agent
>>>>>>   2. Agent_OnAttach() did not return JNI_OK
>>>>>>   3. Succeeded
>>>>>>
>>>>>> Currently, JvmtiExport::load_agent_library() returns JNI_OK to caller,
>>>>>> and jcmd() in attachListener.cpp returns JNI_ERR if exception is
>>>>>> occurred (in Agent_OnAttach()).
>>>>>
>>>>> Can you respond to my comment in the bug report about the chain of
>>>>> calls and explain exactly where you think things are going wrong in
>>>>> this scenario. I'm having a lot of trouble trying to see how the
>>>>> information flows back through the layers.
>>>>>
>>>>>> IMHO, HotSpot should return error code (in top of the stream:
>>>>>> written by
>>>>>> AttachListener at first). So we should be able to get some error
>>>>>> code in
>>>>>> case 1. and 2. Then the client (jcmd or other methods) can decide to
>>>>>> parse text information in the stream.
>>>>>
>>>>> It returns the high-level response code first "did communication with
>>>>> the target VM succeed", then the actual action response code. That
>>>>> seems the right thing to do to me. It is up to the callers reading the
>>>>> stream to understand how to read it. To me the issue lies somewhere on
>>>>> the jcmd/dcmd side.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>>
>>>>>> Sorry for my English.
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2016/09/14 7:03, David Holmes wrote:
>>>>>>> On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote:
>>>>>>>> Thanks David!
>>>>>>>> If we should not change the meaning of return code from
>>>>>>>> JvmtiExport::load_agent_library(), we should print return code as
>>>>>>>> below:
>>>>>>>> -------------------
>>>>>>>> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp
>>>>>>>> --- a/src/share/vm/prims/jvmtiExport.cpp    Mon Sep 12 18:59:13 2016
>>>>>>>> +0000
>>>>>>>> +++ b/src/share/vm/prims/jvmtiExport.cpp    Tue Sep 13 21:12:14 2016
>>>>>>>> +0900
>>>>>>>> @@ -2412,6 +2412,10 @@
>>>>>>>>        result = JNI_OK;
>>>>>>>>      }
>>>>>>>>    }
>>>>>>>> + // Print error code if Agent_OnAttach cannot be executed
>>>>>>>> + if (result != JNI_OK) {
>>>>>>>> + st->print_cr("%d", result);
>>>>>>>> + }
>>>>>>>>    return result;
>>>>>>>>  }
>>>>>>>> -------------------
>>>>>>>
>>>>>>> Not sure I see the point. "return result" will put the error code
>>>>>>> into
>>>>>>> the socket stream and that error will be seen and responded to. I
>>>>>>> don't expect anything to then read further into the stream to see the
>>>>>>> "result" repeated a second time. In other words if execute() doesn't
>>>>>>> see a zero result then you go no further; if execute sees a zero
>>>>>>> result then the actual action may have had an issue so you proceed to
>>>>>>> read the "action's result".
>>>>>>>
>>>>>>>> It can pass com/sun/tools/attach/BasicTests.java, and we can get
>>>>>>>> -1 if
>>>>>>>> we try to attach invalid agent.
>>>>>>>
>>>>>>> Can you please outline your test case for this again. If this is done
>>>>>>> through jcmd then jcmd needs to know how to "unwrap" the information
>>>>>>> it gets back from the Dcmd.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2016/09/13 17:06, Dmitry Samersoff wrote:
>>>>>>>>> David,
>>>>>>>>>
>>>>>>>>> Thank you for the evaluation.
>>>>>>>>>
>>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>>> is to
>>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>>> Though I'm
>>>>>>>>>> not sure how exactly.
>>>>>>>>>
>>>>>>>>> This logic seems completely broken for me. But I don't see
>>>>>>>>> anything we
>>>>>>>>> can do right now - for jdk 9.
>>>>>>>>>
>>>>>>>>> It requires careful changes of both - code and tests.
>>>>>>>>>
>>>>>>>>> I can help with this task but not today.
>>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>>>>>>>>> On 2016-09-13 08:08, David Holmes wrote:
>>>>>>>>>> On 13/09/2016 1:53 PM, David Holmes wrote:
>>>>>>>>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I could reproduce and I added a comment to JBS:
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In case of BasicTests.java, I think it is a test bug.
>>>>>>>>>>>
>>>>>>>>>>> I don't agree as yet. I have not been able to isolate the exact
>>>>>>>>>>> difference between what happens with your fix and what happens
>>>>>>>>>>> without.
>>>>>>>>>>> I know it relates to the presence of the error code, but also
>>>>>>>>>>> how we
>>>>>>>>>>> get
>>>>>>>>>>> AgentInitializationException instead of AgentLoadException. I
>>>>>>>>>>> need
>>>>>>>>>>> to be
>>>>>>>>>>> able to run the test outside of jtreg but that is proving to be
>>>>>>>>>>> very
>>>>>>>>>>> difficult to arrange. :(
>>>>>>>>>>
>>>>>>>>>> I finally managed to connect all the pieces on this.
>>>>>>>>>>
>>>>>>>>>> The basic problem is that with the proposed change
>>>>>>>>>> VirtualMachineImpl.execute() will throw AgentLoadException, which
>>>>>>>>>> then
>>>>>>>>>> leads to the InternalError. Without the change we reach this
>>>>>>>>>> block in
>>>>>>>>>> HotspotVirtualMachine.loadAgentLibrary:
>>>>>>>>>>
>>>>>>>>>> int result = readInt(in);
>>>>>>>>>> if (result != 0) {
>>>>>>>>>>    throw new AgentInitializationException("Agent_OnAttach failed",
>>>>>>>>>> result);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> and so get the expected AgentInitializationException.
>>>>>>>>>>
>>>>>>>>>> Looking at the proposed change in jvmtiExport.cpp if we have the
>>>>>>>>>> original:
>>>>>>>>>>
>>>>>>>>>>       st->print_cr("%d", result);
>>>>>>>>>>       result = JNI_OK;
>>>>>>>>>>
>>>>>>>>>> then execute() manages to read a zero completion status from the
>>>>>>>>>> stream,
>>>>>>>>>> and then loadAgentLibrary is able to read the actual "result" -
>>>>>>>>>> whether
>>>>>>>>>> zero or not - from the stream. This is because the AttachListener
>>>>>>>>>> code
>>>>>>>>>> calls op->complete(result, &st) where st is the stream where we
>>>>>>>>>> wrote
>>>>>>>>>> the result value above in print_cr. Then if we look at, for
>>>>>>>>>> example,
>>>>>>>>>> LinuxAttachOperation::complete, we will write "result" to the
>>>>>>>>>> socket
>>>>>>>>>> first, followed by the contents of st. Hence on a successful
>>>>>>>>>> operation
>>>>>>>>>> we can read 0 for execute, and then 0 for loadAgent. On error we
>>>>>>>>>> read 0
>>>>>>>>>> for execute and the actual error code for loadAgent. With the
>>>>>>>>>> proposed
>>>>>>>>>> change execute() sees the real result (instead of JNI_OK) and so
>>>>>>>>>> throws
>>>>>>>>>> AgentLoadException.
>>>>>>>>>>
>>>>>>>>>> So in summary there are two different error indicators written
>>>>>>>>>> into
>>>>>>>>>> the
>>>>>>>>>> stream, and we need the first to be zero unless some major error
>>>>>>>>>> with
>>>>>>>>>> the actual attach functionality occurred - otherwise even if the
>>>>>>>>>> "load"
>>>>>>>>>> failed in some other way, it is treated as a secondary error.
>>>>>>>>>>
>>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>>> is to
>>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>>> Though I'm
>>>>>>>>>> not sure how exactly.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> If it is accepted, I will upload webrev for this redo task.
>>>>>>>>>>>> However, I cannot access (and fix) Oracle internal test. Can
>>>>>>>>>>>> someone
>>>>>>>>>>>> help me?
>>>>>>>>>>>> (I cannot access JPRT. So I need a sponsor.)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2016/09/13 9:00, David Holmes wrote:
>>>>>>>>>>>>> I think I see the problem. The attach framework tries to
>>>>>>>>>>>>> load the
>>>>>>>>>>>>> "instrument" agent library. Prior to 8164913 if this fails it
>>>>>>>>>>>>> actually
>>>>>>>>>>>>> appears to succeed. Now it fails, and that error propagates
>>>>>>>>>>>>> through.
>>>>>>>>>>>>> I'm guessing, at the moment, that the failure is due to a
>>>>>>>>>>>>> missing
>>>>>>>>>>>>> module related option.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hmm, it has been backouted...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree to David. This error seems to be a test bug.
>>>>>>>>>>>>>>> Can you share the test? I want to evaluate it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry we can't share the tests. I took a quick look and it
>>>>>>>>>>>>>> seems it
>>>>>>>>>>>>>> may
>>>>>>>>>>>>>> be related to the result code parsing (that I thought we
>>>>>>>>>>>>>> ended up
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> touching!).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null)
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> see
>>>>>>>>>>>>>> what
>>>>>>>>>>>>>> happens? We see AgentLoadException from the code that
>>>>>>>>>>>>>> internally
>>>>>>>>>>>>>> tries
>>>>>>>>>>>>>> to load the "instrument" agent:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Exception in thread "main" Failure: Unexpected exception
>>>>>>>>>>>>>> during
>>>>>>>>>>>>>> test
>>>>>>>>>>>>>> execution: java.lang.InternalError: instrument library is
>>>>>>>>>>>>>> missing in
>>>>>>>>>>>>>> target VM
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> Caused by: java.lang.InternalError: instrument library is
>>>>>>>>>>>>>> missing in
>>>>>>>>>>>>>> target VM
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException: Failed to
>>>>>>>>>>>>>> load
>>>>>>>>>>>>>> agent
>>>>>>>>>>>>>> library
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I do not agree to fix this bug in JDK 10 because VM might lie
>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be
>>>>>>>>>>>>>>> fixed
>>>>>>>>>>>>>>> in 9
>>>>>>>>>>>>>>> GA, and we should fix testcase(s).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>>>>>>>>>>> appropriate
>>>>>>>>>>>>>> tests before pushing this, which would have discovered the
>>>>>>>>>>>>>> problem -
>>>>>>>>>>>>>> assuming it is actually a problem with the change and not an
>>>>>>>>>>>>>> unrelated
>>>>>>>>>>>>>> issue in our testing environment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately I have some high priority tasks to handle right
>>>>>>>>>>>>>> now,
>>>>>>>>>>>>>> so I
>>>>>>>>>>>>>> can't go into this in any more depth at the moment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>>>>>>>>>>> For the record it looks like the tests involved are
>>>>>>>>>>>>>>>> broken, in
>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>> because the VM used to lie about the success of attaching
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> agent, the
>>>>>>>>>>>>>>>> tests expected different exceptions to occur.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>>>>>>>>>>> Looks good.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Harold
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>>>>>>>>>>> JDK-8164913, it
>>>>>>>>>>>>>>>>>> failed
>>>>>>>>>>>>>>>>>> several tests in the nightly testing. The failures are
>>>>>>>>>>>>>>>>>> tracked in:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Bug:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>
>>>
>
>


More information about the hotspot-dev mailing list