RFR 8075484:SocketInputStream.socketRead0 can hang even with soTimeout set

Mark Sheppard mark.sheppard at oracle.com
Mon Sep 5 14:37:54 UTC 2016


if the desire is to avoid making double calls on gettimeofday in the 
NET_ReadWithTimeout's while loop for its main call flow,
then consider a modification to NET_Timeout and create a version
int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval * 
current_time)

int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval * 
current_time) {
     int result;
     struct timeval t;
     long prevtime, newtime;
     struct pollfd pfd;
     pfd.fd = s;
     pfd.events = POLLIN;

     if (timeout > 0) {
         prevtime = (current_time->tv_sec * 1000)  + 
current_time->tv_usec / 1000;
     }

     for(;;) {
         result = poll(&pfd, 1, timeout);
         if (result < 0 && errno == EINTR) {
             if (timeout > 0) {
                 gettimeofday(&t, NULL);
                 newtime = (t.tv_sec * 1000)  +  t.tv_usec /1000;
                 timeout -= newtime - prevtime;
                 if (timeout <= 0)
                     return 0;
                 prevtime = newtime;
             }
         } else {
             return result;
         }
     }
}

the NET_ReadWithTimeout is modified with.

    while (timeout > 0) {
         result = NET_TimeoutWithCurrentTime(fd, timeout, &t);

in any case there is the potential for multiple invocation of 
gettimeofday due to a need to make
poll restartable, and adjust the originally specified timeout 
accordingly, and for the edge case
after the non blocking read.

regards
Mark



On 05/09/2016 12:02, Chris Hegarty wrote:
> Vyom,
>
> >>> 
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>
> There is some concern about the potential performance effect, etc,
> of gettimeofday, maybe there is a way out of this. The reuse of
> NET_Timeout is good, but it also calls gettimeofday. It seems that
> a specific NET_ReadWithTimeout could be written to NOT reuse
> NET_Timeout, but effectively inline its interruptible operation.
> Or write a variant of NET_Timeout that takes a function to
> execute. Rather than effectively two loops conditioned on the
> timeout.  Disclaimer: I have not actually tried to do this, but
> it seems worth considering / evaluating.
>
> -Chris.
>
> On 02/09/16 04:39, Vyom Tewari wrote:
>> hi Dimitry,
>>
>> thanks for review, I did consider to use  a monotonically increasing
>> clock like "clock_gettime" but  existing nearby code("NET_Timeout") uses
>> "gettimeofday"  so i choose to be consistent with the existing code.
>>
>> Thanks,
>>
>> Vyom
>>
>>
>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote:
>>> Vyom,
>>>
>>> Did you consider to use select() to calculate timeout instead of
>>> gettimeofday ?
>>>
>>> gettimeofday is affected by system time changes, so running ntpd can
>>> cause unpredictable behavior of this code. Also it's rather expensive
>>> syscall.
>>>
>>> -Dmitry
>>>
>>> On 2016-09-01 19:03, Vyom Tewari wrote:
>>>> please find the updated
>>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html 
>>>>
>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>). 
>>>> I
>>>> incorporated the review comments.
>>>>
>>>> Thanks,
>>>>
>>>> Vyom
>>>>
>>>>
>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote:
>>>>> Hi
>>>>>    perhaps there is an opportunity to do  some refactoring here (...
>>>>> for me a "goto " carries a code smell! )
>>>>>
>>>>> along the lines
>>>>>
>>>>> if (timeout) {
>>>>>      nread =  NET_ReadWithTimeout(...);
>>>>> } else {
>>>>>       nread = NET_Read(...);
>>>>> }
>>>>>
>>>>>
>>>>> the NET_ReadWithTimeout (...) function will contain a 
>>>>> restructuring of
>>>>> your goto loop
>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout);
>>>>>            if (nread <= 0) {
>>>>>                if (nread == 0) {
>>>>>                    JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>> "SocketTimeoutException",
>>>>>                                "Read timed out");
>>>>>                } else if (nread == -1) {
>>>>>                    if (errno == EBADF) {
>>>>>                         JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>> "SocketException", "Socket closed");
>>>>>                    } else if (errno == ENOMEM) {
>>>>>                        JNU_ThrowOutOfMemoryError(env, "NET_Timeout
>>>>> native heap allocation failed");
>>>>>                    } else {
>>>>> JNU_ThrowByNameWithMessageAndLastError
>>>>>                            (env, JNU_JAVANETPKG "SocketException",
>>>>> "select/poll failed");
>>>>>                    }
>>>>>                }
>>>>>                  // release buffer in main call flow
>>>>> //              if (bufP != BUF) {
>>>>> //                  free(bufP);
>>>>> //             }
>>>>>               nread = -1;
>>>>>               break;
>>>>>            } else {
>>>>> nread = NET_NonBlockingRead(fd, bufP, len);
>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno == EWOULDBLOCK))) {
>>>>>                gettimeofday(&t, NULL);
>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000;
>>>>> _timeout -= newtime - prevtime;
>>>>> if(_timeout > 0){
>>>>> prevtime = newtime;
>>>>>                    }
>>>>> } else { break; } } } return nread;
>>>>>
>>>>> e&oe
>>>>>
>>>>> regards
>>>>> Mark
>>>>>
>>>>>
>>>>> On 29/08/2016 10:58, Vyom Tewari wrote:
>>>>>> gentle reminder, please review the below code change.
>>>>>>
>>>>>> Vyom
>>>>>>
>>>>>>
>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Please review the code changes for below issue.
>>>>>>>
>>>>>>> Bug         : https://bugs.openjdk.java.net/browse/JDK-8075484
>>>>>>>
>>>>>>> webrev    :
>>>>>>> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html
>>>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html> 
>>>>>>>
>>>>>>>
>>>>>>> This issue is SocketInputStream.socketread0() hangs even with
>>>>>>> "soTimeout" set.the implementation of
>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes that read()
>>>>>>> won't block after poll() reports that a read is possible.
>>>>>>>
>>>>>>> This assumption does not hold, as noted on the man page for select
>>>>>>> (referenced by the man page for poll): Under Linux, select() may
>>>>>>> report a socket file descriptor as "ready for reading", while
>>>>>>> nevertheless a subsequent read blocks. This could for example 
>>>>>>> happen
>>>>>>> when data has arrived but upon examination has wrong checksum 
>>>>>>> and is
>>>>>>> discarded.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Vyom
>>>>>>>
>>>>>>>
>>>
>>



More information about the net-dev mailing list