RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris

Chris Plummer chris.plummer at oracle.com
Mon Feb 26 23:06:26 UTC 2018


On 2/26/18 3:00 PM, daniil.x.titov at oracle.com wrote:
>
>
> On 2/26/18 12:16 PM, Chris Plummer wrote:
>> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>>> Hi David and Sergei,
>>>
>>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi David,
>>>>
>>>>
>>>> On 2/20/18 20:02, David Holmes wrote:
>>>>> Hi Daniil,
>>>>>
>>>>> Good find on this!
>>>>>
>>>>> What does the actual spec say about the length of things and how 
>>>>> they may be split across multiple packets? Are we guaranteed that 
>>>>> at most two packets will be involved?
>>>
>>> The JDWP spec 
>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html) 
>>> says nothing about splitting JDWP reply packets at all but the 
>>> implementation limits the max number of the sent packets to two 
>>> packets max. The implementation is dated back to the initial load 
>>> that happened in 2007 and the information about the related Jira 
>>> issue is missing.
>>>
>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>>
>>> 836    data = packet->type.cmd.data;
>>> 837    /* Do one send for short packets, two for longer ones */
>>> 838    if (data_len <= MAX_DATA_SIZE) {
>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>>> 840        if (send_fully(socketFD, (char *)&header, 
>>> JDWP_HEADER_SIZE + data_len) !=
>>> 841            JDWP_HEADER_SIZE + data_len) {
>>> 842            RETURN_IO_ERROR("send failed");
>>> 843        }
>>> 844    } else {
>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>>> 846        if (send_fully(socketFD, (char *)&header, 
>>> JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>>> 848            RETURN_IO_ERROR("send failed");
>>> 849        }
>>> 850        /* Send the remaining data bytes right out of the data 
>>> area. */
>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>>> 852                       data_len - MAX_DATA_SIZE) != data_len - 
>>> MAX_DATA_SIZE) {
>>> 853            RETURN_IO_ERROR("send failed");
>>> 854        }
>>> 855    }
>>>
>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has no 
>> size limit. What's the point then of splitting it then? Is there a 
>> desire to get the header transmitted in a smaller packet.
>>
>> Chris
>
> It looks as the goal was to somehow improve the responsiveness in case 
> of the large data but I am not sure about this. I could not locate any 
> traces in Jira related to this implementation.
I was thinking it might be something like that too. Get the header 
across the wire quickly. Maybe the user just wants the header (with size 
info) initially, and will allocate a large buffer for the rest if necessary.

Chris
> Probably Serguei has some info what is the history behind this design.
>>>>>
>>>>>  68     protected byte[] readJdwpString(DataInputStream ds) throws 
>>>>> IOException {
>>>>>   69         byte[] str = null;
>>>>>   70         int len = ds.readInt();
>>>>>   71         if (len > 0) {
>>>>>   72             str = new byte[len];
>>>>>   73             ds.read(str, 0, len);
>>>>>   74         }
>>>>>
>>>>> might we get a short-read of the string if it is split across 
>>>>> multiple packets?
>>>>
>>> This and all other reads happen not directly from the socket input 
>>> stream but rather from the  DataInputStream object that is 
>>> constructed in  JdwpReply.initFromStream(InputStream) method. With 
>>> the proposed fix we do ensure that the created DataInputStream 
>>> object contains data from both packets in cases when the reply was 
>>> split in two packets.
>>>
>>>> Nice catch!
>>>> Even though this fix is enough to resolve this problem now, there 
>>>> is a chance,
>>>> it can fail in the future when more modules are added to the platform.
>>>>
>>>>
>>>>> I'm wondering if all these reads should be loops, ensuring we read 
>>>>> the expected amount of data.
>>>>>
>>> Since the implementation of the socket transport limits the max 
>>> number of packets the reply might be split in to two packets I don't 
>>> think we really need it here.
>>>>> One further comment - not sure why we need the print out for when 
>>>>> we do read multiple packets?
>>>>> That would seem to be a debugging aid.
>>>>
>>>> Yes, it helps to understand what happens.
>>>> Many tests have a lack of tracing which makes it harder to debug 
>>>> and understand failures.
>>> That is correct.  This additional tracing was added to help to 
>>> understand the possible failures in the future.
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>> Thanks,
>>> Daniil
>>>
>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> A new version of the webrev that has these strings reformatted is 
>>>>>> at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Daniil
>>>>>>
>>>>>> *From: *"serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>>> *To: *Daniil Titov <daniil.x.titov at oracle.com>, 
>>>>>> "serviceability-dev at openjdk.java.net" 
>>>>>> <serviceability-dev at openjdk.java.net>
>>>>>> *Subject: *Re: RFR 8170541: 
>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails 
>>>>>> intermittently on Windows and Solaris
>>>>>>
>>>>>> Hi Daniil,
>>>>>>
>>>>>> Interesting issue...
>>>>>> Thank you for finding to the root cause so quickly!
>>>>>>
>>>>>> The fix looks good.
>>>>>> Could I ask you to reformat these lines to make the L54 shorter ?:
>>>>>>
>>>>>>    54                 System.out.println("[" + 
>>>>>> getClass().getName() + "] Only " + bytesRead + " bytes of " + 
>>>>>> dataLength +
>>>>>>
>>>>>>    55                         " were read in the first packet. 
>>>>>> Reading the rest...");
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>>
>>>>>>     Please review the changes that fix intermittent failure of
>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>>
>>>>>>     The problem here is that for a large data the JDWP agent
>>>>>>     (socketTransport_writePacket() method in
>>>>>> src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>>     sends 2 packets and in some cases only the first packet is 
>>>>>> received
>>>>>>     at the time when the test reads the reply from the JDWP 
>>>>>> agent. Since
>>>>>>     the test does not check that all data is received in the first
>>>>>>     packet the correlation between commands and replies became 
>>>>>> broken
>>>>>>     (the unread second packet is read by the next command and the 
>>>>>> reply
>>>>>>     for the next command is read by the next after next command 
>>>>>> and so on).
>>>>>>
>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>>
>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>>
>>>>>>     The tests ran successfully with Mach5.
>>>>>>
>>>>>>     Best regards,
>>>>>>
>>>>>>     Daniil
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>




More information about the serviceability-dev mailing list