RFR 8075484:SocketInputStream.socketRead0 can hang even with soTimeout set

Chris Hegarty chris.hegarty at oracle.com
Mon Sep 5 15:07:14 UTC 2016


On 05/09/16 15:37, Mark Sheppard wrote:
>
> if the desire is to avoid making double calls on gettimeofday in the
> NET_ReadWithTimeout's while loop for its main call flow,

Yes, the desire is to make no more calls to gettimeofday, than are
already done.

> then consider a modification to NET_Timeout and create a version
> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> current_time)
>
> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> current_time) {
>     int result;
>     struct timeval t;
>     long prevtime, newtime;
>     struct pollfd pfd;
>     pfd.fd = s;
>     pfd.events = POLLIN;
>
>     if (timeout > 0) {
>         prevtime = (current_time->tv_sec * 1000)  +
> current_time->tv_usec / 1000;
>     }
>
>     for(;;) {
>         result = poll(&pfd, 1, timeout);
>         if (result < 0 && errno == EINTR) {
>             if (timeout > 0) {
>                 gettimeofday(&t, NULL);
>                 newtime = (t.tv_sec * 1000)  +  t.tv_usec /1000;
>                 timeout -= newtime - prevtime;
>                 if (timeout <= 0)
>                     return 0;
>                 prevtime = newtime;
>             }
>         } else {
>             return result;
>         }
>     }
> }
>
> the NET_ReadWithTimeout is modified with.
>
>    while (timeout > 0) {
>         result = NET_TimeoutWithCurrentTime(fd, timeout, &t);
>
> in any case there is the potential for multiple invocation of
> gettimeofday due to a need to make
> poll restartable,

Yes, and that is fine. Just no more than is already there.

  and adjust the originally specified timeout
> accordingly, and for the edge case
> after the non blocking read.

After an initial skim, your suggestion seems reasonable.

-Chris.

> regards
> Mark
>
>
>
> On 05/09/2016 12:02, Chris Hegarty wrote:
>> Vyom,
>>
>> >>>
>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>
>> There is some concern about the potential performance effect, etc,
>> of gettimeofday, maybe there is a way out of this. The reuse of
>> NET_Timeout is good, but it also calls gettimeofday. It seems that
>> a specific NET_ReadWithTimeout could be written to NOT reuse
>> NET_Timeout, but effectively inline its interruptible operation.
>> Or write a variant of NET_Timeout that takes a function to
>> execute. Rather than effectively two loops conditioned on the
>> timeout.  Disclaimer: I have not actually tried to do this, but
>> it seems worth considering / evaluating.
>>
>> -Chris.
>>
>> On 02/09/16 04:39, Vyom Tewari wrote:
>>> hi Dimitry,
>>>
>>> thanks for review, I did consider to use  a monotonically increasing
>>> clock like "clock_gettime" but  existing nearby code("NET_Timeout") uses
>>> "gettimeofday"  so i choose to be consistent with the existing code.
>>>
>>> Thanks,
>>>
>>> Vyom
>>>
>>>
>>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote:
>>>> Vyom,
>>>>
>>>> Did you consider to use select() to calculate timeout instead of
>>>> gettimeofday ?
>>>>
>>>> gettimeofday is affected by system time changes, so running ntpd can
>>>> cause unpredictable behavior of this code. Also it's rather expensive
>>>> syscall.
>>>>
>>>> -Dmitry
>>>>
>>>> On 2016-09-01 19:03, Vyom Tewari wrote:
>>>>> please find the updated
>>>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>>>>
>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>).
>>>>> I
>>>>> incorporated the review comments.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Vyom
>>>>>
>>>>>
>>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote:
>>>>>> Hi
>>>>>>    perhaps there is an opportunity to do  some refactoring here (...
>>>>>> for me a "goto " carries a code smell! )
>>>>>>
>>>>>> along the lines
>>>>>>
>>>>>> if (timeout) {
>>>>>>      nread =  NET_ReadWithTimeout(...);
>>>>>> } else {
>>>>>>       nread = NET_Read(...);
>>>>>> }
>>>>>>
>>>>>>
>>>>>> the NET_ReadWithTimeout (...) function will contain a
>>>>>> restructuring of
>>>>>> your goto loop
>>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout);
>>>>>>            if (nread <= 0) {
>>>>>>                if (nread == 0) {
>>>>>>                    JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>> "SocketTimeoutException",
>>>>>>                                "Read timed out");
>>>>>>                } else if (nread == -1) {
>>>>>>                    if (errno == EBADF) {
>>>>>>                         JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>> "SocketException", "Socket closed");
>>>>>>                    } else if (errno == ENOMEM) {
>>>>>>                        JNU_ThrowOutOfMemoryError(env, "NET_Timeout
>>>>>> native heap allocation failed");
>>>>>>                    } else {
>>>>>> JNU_ThrowByNameWithMessageAndLastError
>>>>>>                            (env, JNU_JAVANETPKG "SocketException",
>>>>>> "select/poll failed");
>>>>>>                    }
>>>>>>                }
>>>>>>                  // release buffer in main call flow
>>>>>> //              if (bufP != BUF) {
>>>>>> //                  free(bufP);
>>>>>> //             }
>>>>>>               nread = -1;
>>>>>>               break;
>>>>>>            } else {
>>>>>> nread = NET_NonBlockingRead(fd, bufP, len);
>>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno == EWOULDBLOCK))) {
>>>>>>                gettimeofday(&t, NULL);
>>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000;
>>>>>> _timeout -= newtime - prevtime;
>>>>>> if(_timeout > 0){
>>>>>> prevtime = newtime;
>>>>>>                    }
>>>>>> } else { break; } } } return nread;
>>>>>>
>>>>>> e&oe
>>>>>>
>>>>>> regards
>>>>>> Mark
>>>>>>
>>>>>>
>>>>>> On 29/08/2016 10:58, Vyom Tewari wrote:
>>>>>>> gentle reminder, please review the below code change.
>>>>>>>
>>>>>>> Vyom
>>>>>>>
>>>>>>>
>>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Please review the code changes for below issue.
>>>>>>>>
>>>>>>>> Bug         : https://bugs.openjdk.java.net/browse/JDK-8075484
>>>>>>>>
>>>>>>>> webrev    :
>>>>>>>> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html
>>>>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html>
>>>>>>>>
>>>>>>>>
>>>>>>>> This issue is SocketInputStream.socketread0() hangs even with
>>>>>>>> "soTimeout" set.the implementation of
>>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes that read()
>>>>>>>> won't block after poll() reports that a read is possible.
>>>>>>>>
>>>>>>>> This assumption does not hold, as noted on the man page for select
>>>>>>>> (referenced by the man page for poll): Under Linux, select() may
>>>>>>>> report a socket file descriptor as "ready for reading", while
>>>>>>>> nevertheless a subsequent read blocks. This could for example
>>>>>>>> happen
>>>>>>>> when data has arrived but upon examination has wrong checksum
>>>>>>>> and is
>>>>>>>> discarded.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Vyom
>>>>>>>>
>>>>>>>>
>>>>
>>>
>


More information about the net-dev mailing list