RFR(XS): 8165881 - Backout JDK-8164913

David Holmes david.holmes at oracle.com
Tue Sep 13 22:03:27 UTC 2016


On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote:
> Thanks David!
> If we should not change the meaning of return code from
> JvmtiExport::load_agent_library(), we should print return code as below:
> -------------------
> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp
> --- a/src/share/vm/prims/jvmtiExport.cpp    Mon Sep 12 18:59:13 2016 +0000
> +++ b/src/share/vm/prims/jvmtiExport.cpp    Tue Sep 13 21:12:14 2016 +0900
> @@ -2412,6 +2412,10 @@
>        result = JNI_OK;
>      }
>    }
> + // Print error code if Agent_OnAttach cannot be executed
> + if (result != JNI_OK) {
> + st->print_cr("%d", result);
> + }
>    return result;
>  }
> -------------------

Not sure I see the point. "return result" will put the error code into 
the socket stream and that error will be seen and responded to. I don't 
expect anything to then read further into the stream to see the "result" 
repeated a second time. In other words if execute() doesn't see a zero 
result then you go no further; if execute sees a zero result then the 
actual action may have had an issue so you proceed to read the "action's 
result".

> It can pass com/sun/tools/attach/BasicTests.java, and we can get -1 if
> we try to attach invalid agent.

Can you please outline your test case for this again. If this is done 
through jcmd then jcmd needs to know how to "unwrap" the information it 
gets back from the Dcmd.

Thanks,
David

>
> Yasumasa
>
>
> On 2016/09/13 17:06, Dmitry Samersoff wrote:
>> David,
>>
>> Thank you for the evaluation.
>>
>>> With that in mind I suspect the real fix for the original issue is to
>>> have Dcmd recognize that it also has to read two "results". Though I'm
>>> not sure how exactly.
>>
>> This logic seems completely broken for me. But I don't see anything we
>> can do right now - for jdk 9.
>>
>> It requires careful changes of both - code and tests.
>>
>> I can help with this task but not today.
>>
>> -Dmitry
>>
>> On 2016-09-13 08:08, David Holmes wrote:
>>> On 13/09/2016 1:53 PM, David Holmes wrote:
>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>> I could reproduce and I added a comment to JBS:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> In case of BasicTests.java, I think it is a test bug.
>>>>
>>>> I don't agree as yet. I have not been able to isolate the exact
>>>> difference between what happens with your fix and what happens without.
>>>> I know it relates to the presence of the error code, but also how we
>>>> get
>>>> AgentInitializationException instead of AgentLoadException. I need
>>>> to be
>>>> able to run the test outside of jtreg but that is proving to be very
>>>> difficult to arrange. :(
>>>
>>> I finally managed to connect all the pieces on this.
>>>
>>> The basic problem is that with the proposed change
>>> VirtualMachineImpl.execute() will throw AgentLoadException, which then
>>> leads to the InternalError. Without the change we reach this block in
>>> HotspotVirtualMachine.loadAgentLibrary:
>>>
>>> int result = readInt(in);
>>> if (result != 0) {
>>>    throw new AgentInitializationException("Agent_OnAttach failed",
>>> result);
>>> }
>>>
>>> and so get the expected AgentInitializationException.
>>>
>>> Looking at the proposed change in jvmtiExport.cpp if we have the
>>> original:
>>>
>>>       st->print_cr("%d", result);
>>>       result = JNI_OK;
>>>
>>> then execute() manages to read a zero completion status from the stream,
>>> and then loadAgentLibrary is able to read the actual "result" - whether
>>> zero or not - from the stream. This is because the AttachListener code
>>> calls op->complete(result, &st) where st is the stream where we wrote
>>> the result value above in print_cr. Then if we look at, for example,
>>> LinuxAttachOperation::complete, we will write "result" to the socket
>>> first, followed by the contents of st. Hence on a successful operation
>>> we can read 0 for execute, and then 0 for loadAgent. On error we read 0
>>> for execute and the actual error code for loadAgent. With the proposed
>>> change execute() sees the real result (instead of JNI_OK) and so throws
>>> AgentLoadException.
>>>
>>> So in summary there are two different error indicators written into the
>>> stream, and we need the first to be zero unless some major error with
>>> the actual attach functionality occurred - otherwise even if the "load"
>>> failed in some other way, it is treated as a secondary error.
>>>
>>> With that in mind I suspect the real fix for the original issue is to
>>> have Dcmd recognize that it also has to read two "results". Though I'm
>>> not sure how exactly.
>>>
>>> David
>>> -----
>>>
>>>> David
>>>>
>>>>
>>>>
>>>>> If it is accepted, I will upload webrev for this redo task.
>>>>> However, I cannot access (and fix) Oracle internal test. Can someone
>>>>> help me?
>>>>> (I cannot access JPRT. So I need a sponsor.)
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2016/09/13 9:00, David Holmes wrote:
>>>>>> I think I see the problem. The attach framework tries to load the
>>>>>> "instrument" agent library. Prior to 8164913 if this fails it
>>>>>> actually
>>>>>> appears to succeed. Now it fails, and that error propagates through.
>>>>>> I'm guessing, at the moment, that the failure is due to a missing
>>>>>> module related option.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Hmm, it has been backouted...
>>>>>>>>
>>>>>>>> I agree to David. This error seems to be a test bug.
>>>>>>>> Can you share the test? I want to evaluate it.
>>>>>>>
>>>>>>> Sorry we can't share the tests. I took a quick look and it seems it
>>>>>>> may
>>>>>>> be related to the result code parsing (that I thought we ended up
>>>>>>> not
>>>>>>> touching!).
>>>>>>>
>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null) and see
>>>>>>> what
>>>>>>> happens? We see AgentLoadException from the code that internally
>>>>>>> tries
>>>>>>> to load the "instrument" agent:
>>>>>>>
>>>>>>> Exception in thread "main" Failure: Unexpected exception during test
>>>>>>> execution: java.lang.InternalError: instrument library is missing in
>>>>>>> target VM
>>>>>>> ...
>>>>>>> Caused by: java.lang.InternalError: instrument library is missing in
>>>>>>> target VM
>>>>>>> ...
>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException: Failed to load
>>>>>>> agent
>>>>>>> library
>>>>>>>
>>>>>>>
>>>>>>>> I do not agree to fix this bug in JDK 10 because VM might lie when
>>>>>>>> the
>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be fixed
>>>>>>>> in 9
>>>>>>>> GA, and we should fix testcase(s).
>>>>>>>
>>>>>>> I agree. It has to be noted that "we" failed to run all the
>>>>>>> appropriate
>>>>>>> tests before pushing this, which would have discovered the problem -
>>>>>>> assuming it is actually a problem with the change and not an
>>>>>>> unrelated
>>>>>>> issue in our testing environment.
>>>>>>>
>>>>>>> Unfortunately I have some high priority tasks to handle right now,
>>>>>>> so I
>>>>>>> can't go into this in any more depth at the moment.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2016/09/13 6:15, David Holmes wrote:
>>>>>>>>> For the record it looks like the tests involved are broken, in
>>>>>>>>> that
>>>>>>>>> because the VM used to lie about the success of attaching the
>>>>>>>>> agent, the
>>>>>>>>> tests expected different exceptions to occur.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote:
>>>>>>>>>> Looks good.
>>>>>>>>>>
>>>>>>>>>> Harold
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote:
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Please review this (clean) backout of the change for
>>>>>>>>>>> JDK-8164913, it
>>>>>>>>>>> failed
>>>>>>>>>>> several tests in the nightly testing. The failures are
>>>>>>>>>>> tracked in:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Bug:
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Christian
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>
>>


More information about the hotspot-dev mailing list