RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and Solaris

daniil.x.titov at oracle.com daniil.x.titov at oracle.com
Mon Feb 26 23:00:29 UTC 2018



On 2/26/18 12:16 PM, Chris Plummer wrote:
> On 2/26/18 11:51 AM, daniil.x.titov at oracle.com wrote:
>> Hi David and Sergei,
>>
>> On 2/20/18 10:16 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>>
>>> On 2/20/18 20:02, David Holmes wrote:
>>>> Hi Daniil,
>>>>
>>>> Good find on this!
>>>>
>>>> What does the actual spec say about the length of things and how 
>>>> they may be split across multiple packets? Are we guaranteed that 
>>>> at most two packets will be involved?
>>
>> The JDWP spec 
>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html) 
>> says nothing about splitting JDWP reply packets at all but the 
>> implementation limits the max number of the sent packets to two 
>> packets max. The implementation is dated back to the initial load 
>> that happened in 2007 and the information about the related Jira 
>> issue is missing.
>>
>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
>>
>> 836    data = packet->type.cmd.data;
>> 837    /* Do one send for short packets, two for longer ones */
>> 838    if (data_len <= MAX_DATA_SIZE) {
>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
>> 840        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE 
>> + data_len) !=
>> 841            JDWP_HEADER_SIZE + data_len) {
>> 842            RETURN_IO_ERROR("send failed");
>> 843        }
>> 844    } else {
>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
>> 846        if (send_fully(socketFD, (char *)&header, JDWP_HEADER_SIZE 
>> + MAX_DATA_SIZE) !=
>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
>> 848            RETURN_IO_ERROR("send failed");
>> 849        }
>> 850        /* Send the remaining data bytes right out of the data 
>> area. */
>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
>> 852                       data_len - MAX_DATA_SIZE) != data_len - 
>> MAX_DATA_SIZE) {
>> 853            RETURN_IO_ERROR("send failed");
>> 854        }
>> 855    }
>>
> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has no 
> size limit. What's the point then of splitting it then? Is there a 
> desire to get the header transmitted in a smaller packet.
>
> Chris

It looks as the goal was to somehow improve the responsiveness in case 
of the large data but I am not sure about this. I could not locate any 
traces in Jira related to this implementation.
Probably Serguei has some info what is the history behind this design.
>>>>
>>>>  68     protected byte[] readJdwpString(DataInputStream ds) throws 
>>>> IOException {
>>>>   69         byte[] str = null;
>>>>   70         int len = ds.readInt();
>>>>   71         if (len > 0) {
>>>>   72             str = new byte[len];
>>>>   73             ds.read(str, 0, len);
>>>>   74         }
>>>>
>>>> might we get a short-read of the string if it is split across 
>>>> multiple packets?
>>>
>> This and all other reads happen not directly from the socket input 
>> stream but rather from the  DataInputStream object that is 
>> constructed in  JdwpReply.initFromStream(InputStream) method. With 
>> the proposed fix we do ensure that the created DataInputStream object 
>> contains data from both packets in cases when the reply was split in 
>> two packets.
>>
>>> Nice catch!
>>> Even though this fix is enough to resolve this problem now, there is 
>>> a chance,
>>> it can fail in the future when more modules are added to the platform.
>>>
>>>
>>>> I'm wondering if all these reads should be loops, ensuring we read 
>>>> the expected amount of data.
>>>>
>> Since the implementation of the socket transport limits the max 
>> number of packets the reply might be split in to two packets I don't 
>> think we really need it here.
>>>> One further comment - not sure why we need the print out for when 
>>>> we do read multiple packets?
>>>> That would seem to be a debugging aid.
>>>
>>> Yes, it helps to understand what happens.
>>> Many tests have a lack of tracing which makes it harder to debug and 
>>> understand failures.
>> That is correct.  This additional tracing was added to help to 
>> understand the possible failures in the future.
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>> Thanks,
>> Daniil
>>
>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> A new version of the webrev that has these strings reformatted is 
>>>>> at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Daniil
>>>>>
>>>>> *From: *"serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
>>>>> *To: *Daniil Titov <daniil.x.titov at oracle.com>, 
>>>>> "serviceability-dev at openjdk.java.net" 
>>>>> <serviceability-dev at openjdk.java.net>
>>>>> *Subject: *Re: RFR 8170541: 
>>>>> serviceability/jdwp/AllModulesCommandTest.java fails 
>>>>> intermittently on Windows and Solaris
>>>>>
>>>>> Hi Daniil,
>>>>>
>>>>> Interesting issue...
>>>>> Thank you for finding to the root cause so quickly!
>>>>>
>>>>> The fix looks good.
>>>>> Could I ask you to reformat these lines to make the L54 shorter ?:
>>>>>
>>>>>    54                 System.out.println("[" + 
>>>>> getClass().getName() + "] Only " + bytesRead + " bytes of " + 
>>>>> dataLength +
>>>>>
>>>>>    55                         " were read in the first packet. 
>>>>> Reading the rest...");
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 2/20/18 09:24, Daniil Titov wrote:
>>>>>
>>>>>     Please review the changes that fix intermittent failure of
>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
>>>>>
>>>>>     The problem here is that for a large data the JDWP agent
>>>>>     (socketTransport_writePacket() method in
>>>>> src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
>>>>>     sends 2 packets and in some cases only the first packet is 
>>>>> received
>>>>>     at the time when the test reads the reply from the JDWP agent. 
>>>>> Since
>>>>>     the test does not check that all data is received in the first
>>>>>     packet the correlation between commands and replies became broken
>>>>>     (the unread second packet is read by the next command and the 
>>>>> reply
>>>>>     for the next command is read by the next after next command 
>>>>> and so on).
>>>>>
>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
>>>>>
>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
>>>>>
>>>>>     The tests ran successfully with Mach5.
>>>>>
>>>>>     Best regards,
>>>>>
>>>>>     Daniil
>>>>>
>>>>>
>>>>>
>>>
>>
>
>



More information about the serviceability-dev mailing list