RFR(XS): 8165881 - Backout JDK-8164913

Dmitry Samersoff dmitry.samersoff at oracle.com
Tue Oct 25 06:09:02 UTC 2016


Yasumasa,

It's hard to fix this issue without access to Oracle testing infrastructure.

I'm working on the issue but unfortunately I was caught by some internal
problems so it progresses slowly than I expected.

I'm sorry about it.

-Dmitry


On 2016-10-24 13:24, Yasumasa Suenaga wrote:
> Hi Dmitry, David,
> 
> I want to resume to work for this issue because this issue has been
> marked P3.
> Dmitry, do you have any idea for it?
> 
> IMHO, we can fix as below:
> 
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8165869/webrev.00/
> 
> According to David, JvmtiExport::load_agent_library() should return zero
> when the operation which is kicked by AttachListener is succeeded.
> AttachListener will write response code from
> JvmtiExport::load_agent_library() at first, and the client (e.g.
> HotSpotVirtualMachine.java) checks it.
> 
> Thus I keep current implementation about return code, and I added the
> code to send error message to the client.
> It works fine with jdk/test/com/sun/tools/attach/BasicTests.java .
> 
> What do you think about it?
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2016/09/15 20:08, Dmitry Samersoff wrote:
>> Yasumasa, David,
>>
>> I'm looking into this issue and come back as soon as I have a clear
>> picture.
>>
>> It takes some time. Sorry!
>>
>> -Dmitry
>>
>> On 2016-09-15 14:01, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> I agree the call sequence which you write in the case of "jcmd"
>>> operation.
>>> However, we should think about "load" operation in AttachListener.
>>>
>>> We can access JvmtiExport::load_agent_library() through two routes:
>>>
>>>  Route "load": AttachListener -> JvmtiExport::load_agent_library()
>>>  Route "jcmd": AttachListener -> jcmd() in attachListener.cpp ->
>>> DCmd::parse_and_execute()
>>>
>>> "load" sets error code (it means first data in the stream) when
>>> JvmtiExport::load_agent_library() returns JNI_ERR.
>>> OTOH "jcmd" sets error code if exception is occurred ONLY.
>>>
>>> If we try to load invalid JVMTI agent, "load" writes JNI_ERR to stream,
>>> however "jcmd" writes JNI_OK because pending exception is not available
>>> at this point.
>>> Currently, JCmd.java shows "Command executed successfully" when it
>>> cannot read the message from the socket. In case of this issue,
>>> JVMTIAgentLoadDCmd does not write any message because it cannot attach
>>> the agent.
>>> If return code which is in the top of stream should mean whether
>>> communication with AttachListener is succeeded, I think HotSpot should
>>> write error message or error code to the stream for JCmd.java .
>>>
>>> I'm not sure this email is the answer for you.
>>> Sorry for my English.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2016/09/15 18:43, David Holmes wrote:
>>>> Hi Yasumasa,
>>>>
>>>> On 15/09/2016 1:56 PM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>>> If this is done through jcmd then jcmd needs to know how to "unwrap"
>>>>>> the information it gets back from the Dcmd.
>>>>>
>>>>> In case of JVMTI.agent_load dcmd, I think we should be able to get the
>>>>> response as below:
>>>>>
>>>>>   1. Invalid JVMTI agent
>>>>>   2. Agent_OnAttach() did not return JNI_OK
>>>>>   3. Succeeded
>>>>>
>>>>> Currently, JvmtiExport::load_agent_library() returns JNI_OK to caller,
>>>>> and jcmd() in attachListener.cpp returns JNI_ERR if exception is
>>>>> occurred (in Agent_OnAttach()).
>>>>
>>>> Can you respond to my comment in the bug report about the chain of
>>>> calls and explain exactly where you think things are going wrong in
>>>> this scenario. I'm having a lot of trouble trying to see how the
>>>> information flows back through the layers.
>>>>
>>>>> IMHO, HotSpot should return error code (in top of the stream:
>>>>> written by
>>>>> AttachListener at first). So we should be able to get some error
>>>>> code in
>>>>> case 1. and 2. Then the client (jcmd or other methods) can decide to
>>>>> parse text information in the stream.
>>>>
>>>> It returns the high-level response code first "did communication with
>>>> the target VM succeed", then the actual action response code. That
>>>> seems the right thing to do to me. It is up to the callers reading the
>>>> stream to understand how to read it. To me the issue lies somewhere on
>>>> the jcmd/dcmd side.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>>
>>>>> Sorry for my English.
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2016/09/14 7:03, David Holmes wrote:
>>>>>> On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote:
>>>>>>> Thanks David!
>>>>>>> If we should not change the meaning of return code from
>>>>>>> JvmtiExport::load_agent_library(), we should print return code as
>>>>>>> below:
>>>>>>> -------------------
>>>>>>> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp
>>>>>>> --- a/src/share/vm/prims/jvmtiExport.cpp    Mon Sep 12 18:59:13 2016
>>>>>>> +0000
>>>>>>> +++ b/src/share/vm/prims/jvmtiExport.cpp    Tue Sep 13 21:12:14 2016
>>>>>>> +0900
>>>>>>> @@ -2412,6 +2412,10 @@
>>>>>>>        result = JNI_OK;
>>>>>>>      }
>>>>>>>    }
>>>>>>> + // Print error code if Agent_OnAttach cannot be executed
>>>>>>> + if (result != JNI_OK) {
>>>>>>> + st->print_cr("%d", result);
>>>>>>> + }
>>>>>>>    return result;
>>>>>>>  }
>>>>>>> -------------------
>>>>>>
>>>>>> Not sure I see the point. "return result" will put the error code
>>>>>> into
>>>>>> the socket stream and that error will be seen and responded to. I
>>>>>> don't expect anything to then read further into the stream to see the
>>>>>> "result" repeated a second time. In other words if execute() doesn't
>>>>>> see a zero result then you go no further; if execute sees a zero
>>>>>> result then the actual action may have had an issue so you proceed to
>>>>>> read the "action's result".
>>>>>>
>>>>>>> It can pass com/sun/tools/attach/BasicTests.java, and we can get
>>>>>>> -1 if
>>>>>>> we try to attach invalid agent.
>>>>>>
>>>>>> Can you please outline your test case for this again. If this is done
>>>>>> through jcmd then jcmd needs to know how to "unwrap" the information
>>>>>> it gets back from the Dcmd.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2016/09/13 17:06, Dmitry Samersoff wrote:
>>>>>>>> David,
>>>>>>>>
>>>>>>>> Thank you for the evaluation.
>>>>>>>>
>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>> is to
>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>> Though I'm
>>>>>>>>> not sure how exactly.
>>>>>>>>
>>>>>>>> This logic seems completely broken for me. But I don't see
>>>>>>>> anything we
>>>>>>>> can do right now - for jdk 9.
>>>>>>>>
>>>>>>>> It requires careful changes of both - code and tests.
>>>>>>>>
>>>>>>>> I can help with this task but not today.
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>> On 2016-09-13 08:08, David Holmes wrote:
>>>>>>>>> On 13/09/2016 1:53 PM, David Holmes wrote:
>>>>>>>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I could reproduce and I added a comment to JBS:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In case of BasicTests.java, I think it is a test bug.
>>>>>>>>>>
>>>>>>>>>> I don't agree as yet. I have not been able to isolate the exact
>>>>>>>>>> difference between what happens with your fix and what happens
>>>>>>>>>> without.
>>>>>>>>>> I know it relates to the presence of the error code, but also
>>>>>>>>>> how we
>>>>>>>>>> get
>>>>>>>>>> AgentInitializationException instead of AgentLoadException. I
>>>>>>>>>> need
>>>>>>>>>> to be
>>>>>>>>>> able to run the test outside of jtreg but that is proving to be
>>>>>>>>>> very
>>>>>>>>>> difficult to arrange. :(
>>>>>>>>>
>>>>>>>>> I finally managed to connect all the pieces on this.
>>>>>>>>>
>>>>>>>>> The basic problem is that with the proposed change
>>>>>>>>> VirtualMachineImpl.execute() will throw AgentLoadException, which
>>>>>>>>> then
>>>>>>>>> leads to the InternalError. Without the change we reach this
>>>>>>>>> block in
>>>>>>>>> HotspotVirtualMachine.loadAgentLibrary:
>>>>>>>>>
>>>>>>>>> int result = readInt(in);
>>>>>>>>> if (result != 0) {
>>>>>>>>>    throw new AgentInitializationException("Agent_OnAttach failed",
>>>>>>>>> result);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> and so get the expected AgentInitializationException.
>>>>>>>>>
>>>>>>>>> Looking at the proposed change in jvmtiExport.cpp if we have the
>>>>>>>>> original:
>>>>>>>>>
>>>>>>>>>       st->print_cr("%d", result);
>>>>>>>>>       result = JNI_OK;
>>>>>>>>>
>>>>>>>>> then execute() manages to read a zero completion status from the
>>>>>>>>> stream,
>>>>>>>>> and then loadAgentLibrary is able to read the actual "result" -
>>>>>>>>> whether
>>>>>>>>> zero or not - from the stream. This is because the AttachListener
>>>>>>>>> code
>>>>>>>>> calls op->complete(result, &st) where st is the stream where we
>>>>>>>>> wrote
>>>>>>>>> the result value above in print_cr. Then if we look at, for
>>>>>>>>> example,
>>>>>>>>> LinuxAttachOperation::complete, we will write "result" to the
>>>>>>>>> socket
>>>>>>>>> first, followed by the contents of st. Hence on a successful
>>>>>>>>> operation
>>>>>>>>> we can read 0 for execute, and then 0 for loadAgent. On error we
>>>>>>>>> read 0
>>>>>>>>> for execute and the actual error code for loadAgent. With the
>>>>>>>>> proposed
>>>>>>>>> change execute() sees the real result (instead of JNI_OK) and so
>>>>>>>>> throws
>>>>>>>>> AgentLoadException.
>>>>>>>>>
>>>>>>>>> So in summary there are two different error indicators written
>>>>>>>>> into
>>>>>>>>> the
>>>>>>>>> stream, and we need the first to be zero unless some major error
>>>>>>>>> with
>>>>>>>>> the actual attach functionality occurred - otherwise even if the
>>>>>>>>> "load"
>>>>>>>>> failed in some other way, it is treated as a secondary error.
>>>>>>>>>
>>>>>>>>> With that in mind I suspect the real fix for the original issue
>>>>>>>>> is to
>>>>>>>>> have Dcmd recognize that it also has to read two "results".
>>>>>>>>> Though I'm
>>>>>>>>> not sure how exactly.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> If it is accepted, I will upload webrev for this redo task.
>>>>>>>>>>> However, I cannot access (and fix) Oracle internal test. Can
>>>>>>>>>>> someone
>>>>>>>>>>> help me?
>>>>>>>>>>> (I cannot access JPRT. So I need a sponsor.)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2016/09/13 9:00, David Holmes wrote:
>>>>>>>>>>>> I think I see the problem. The attach framework tries to
>>>>>>>>>>>> load the
>>>>>>>>>>>> "instrument" agent library. Prior to 8164913 if this fails it
>>>>>>>>>>>> actually
>>>>>>>>>>>> appears to succeed. Now it fails, and that error propagates
>>>>>>>>>>>> through.
>>>>>>>>>>>> I'm guessing, at the moment, that the failure is due to a
>>>>>>>>>>>> missing
>>>>>>>>>>>> module related option.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hmm, it has been backouted...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree to David. This error seems to be a test bug.
>>>>>>>>>>>>>> Can you share the test? I want to evaluate it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry we can't share the tests. I took a quick look and it
>>>>>>>>>>>>> seems it
>>>>>>>>>>>>> may
>>>>>>>>>>>>> be related to the result code parsing (that I thought we
>>>>>>>>>>>>> ended up
>>>>>>>>>>>>> not
>>>>>>>>>>>>> touching!).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null)
>>>>>>>>>>>>> and
>>>>>>>>>>>>> see
>>>>>>>>>>>>> what
>>>>>>>>>>>>> happens? We see AgentLoadException from the code that
>>>>>>>>>>>>> internally
>>>>>>>>>>>>> tries
>>>>>>>>>>>>> to load the "instrument" agent:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Exception in thread "main" Failure: Unexpected exception
>>>>>>>>>>>>> during
>>>>>>>>>>>>> test
>>>>>>>>>>>>> execution: java.lang.InternalError: instrument library is
>>>>>>>>>>>>> missing in
>>>>>>>>>>>>> target VM
>>>>>>>>>>>>> ...
>>>>>>>>>>>>> Caused by: java.lang.InternalError: instrument library is
>>>>>>>>>>>>> missing in
>>>>>>>>>>>>> target VM
>>>>>>>>>>>>> ...
>>>>>>>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException: Failed to
>>>>>>>>>>>>> load
>>>>>>>>>>>>> agent
>>>>>>>>>>>>> library
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do not agree to fix this bug in JDK 10 because VM might lie
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be
>>>>>>>>>>>>>> fixed
>>>>>>>>>>>>>> in 9
>>>>>>>>>>>>>> GA, and we should fix testcase(s).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>>>>>>>>>> appropriate
>>>>>>>>>>>>> tests before pushing this, which would have discovered the
>>>>>>>>>>>>> problem -
>>>>>>>>>>>>> assuming it is actually a problem with the change and not an
>>>>>>>>>>>>> unrelated
>>>>>>>>>>>>> issue in our testing environment.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately I have some high priority tasks to handle right
>>>>>>>>>>>>> now,
>>>>>>>>>>>>> so I
>>>>>>>>>>>>> can't go into this in any more depth at the moment.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>>>>>>>>>> For the record it looks like the tests involved are
>>>>>>>>>>>>>>> broken, in
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> because the VM used to lie about the success of attaching
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> agent, the
>>>>>>>>>>>>>>> tests expected different exceptions to occur.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>>>>>>>>>> Looks good.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Harold
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>>>>>>>>>> JDK-8164913, it
>>>>>>>>>>>>>>>>> failed
>>>>>>>>>>>>>>>>> several tests in the nightly testing. The failures are
>>>>>>>>>>>>>>>>> tracked in:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Bug:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>
>>


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


More information about the hotspot-dev mailing list