RFR 8075484:SocketInputStream.socketRead0 can hang even with soTimeout set

Chris Hegarty chris.hegarty at oracle.com
Wed Sep 7 09:48:25 UTC 2016



On 07/09/16 07:45, Vyom Tewari wrote:
> Hi Chris,
>
> Please find the latest
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.3/index.html
> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.3/index.html>). I
> did some code refactoring, JPRT is in progress.

In terms of NET_Timeout**, I'm happy with this version, but I
have an additional comment.

If NET_ReadWithTimeout returns -1 because NET_TimeoutWithCurrentTime
returns <= 0, then a JNI pending exception will be set. So the code
calling NET_ReadWithTimeout should, a) free bufP, and b) return -1,
immediately rather than continuing and possibly trying to set another
JNI pending exception.

One option is to check the return value from NET_ReadWithTimeout and
also (*env)->ExceptionCheck(env). Another is to possibly consolidate the 
setting of JNI pending exceptions in one place, towards the end
of Java_java_net_SocketInputStream_socketRead0, for example EBADF is
handled in two places.

-Chris.

> I need help from some one to build and test latest changes on AIX, may
> be Christoph can help.
>
> Thanks,
>
> Vyom
>
>
> On Tuesday 06 September 2016 06:25 PM, Chris Hegarty wrote:
>> Vyom,
>>
>>> On 6 Sep 2016, at 10:20, Vyom Tewari <vyom.tewari at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> Please find the latest
>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.2/index.html
>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.2/index.html>). I
>>> have incorporated the review comments.
>> Your changes completely avoid NET_Timeout, which is quite complex on
>> Linux, Mac, and AIX, for various reasons. ( NET_Timeout will use the
>> async close mechanism to signal/interrupt a thread blocked in a poll /
>> select ). Also, select is used on Mac, which will use poll after your
>> changes ( I do remember that there was a bug in poll on Mac around
>> the time of the original Mac port ).
>>
>> Would is be better, rather than replicating the logic in  NET_Timeout,
>> to have a version that supports passing a function pointer, to run if
>> the poll/select returns before the timeout? Just an idea.
>>
>> -Chris.
>>
>>> Thanks,
>>>
>>> Vyom
>>>
>>>
>>> On Monday 05 September 2016 08:37 PM, Chris Hegarty wrote:
>>>> On 05/09/16 15:37, Mark Sheppard wrote:
>>>>> if the desire is to avoid making double calls on gettimeofday in the
>>>>> NET_ReadWithTimeout's while loop for its main call flow,
>>>> Yes, the desire is to make no more calls to gettimeofday, than are
>>>> already done.
>>>>
>>>>> then consider a modification to NET_Timeout and create a version
>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
>>>>> current_time)
>>>>>
>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
>>>>> current_time) {
>>>>>     int result;
>>>>>     struct timeval t;
>>>>>     long prevtime, newtime;
>>>>>     struct pollfd pfd;
>>>>>     pfd.fd = s;
>>>>>     pfd.events = POLLIN;
>>>>>
>>>>>     if (timeout > 0) {
>>>>>         prevtime = (current_time->tv_sec * 1000)  +
>>>>> current_time->tv_usec / 1000;
>>>>>     }
>>>>>
>>>>>     for(;;) {
>>>>>         result = poll(&pfd, 1, timeout);
>>>>>         if (result < 0 && errno == EINTR) {
>>>>>             if (timeout > 0) {
>>>>>                 gettimeofday(&t, NULL);
>>>>>                 newtime = (t.tv_sec * 1000)  +  t.tv_usec /1000;
>>>>>                 timeout -= newtime - prevtime;
>>>>>                 if (timeout <= 0)
>>>>>                     return 0;
>>>>>                 prevtime = newtime;
>>>>>             }
>>>>>         } else {
>>>>>             return result;
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>> the NET_ReadWithTimeout is modified with.
>>>>>
>>>>>    while (timeout > 0) {
>>>>>         result = NET_TimeoutWithCurrentTime(fd, timeout, &t);
>>>>>
>>>>> in any case there is the potential for multiple invocation of
>>>>> gettimeofday due to a need to make
>>>>> poll restartable,
>>>> Yes, and that is fine. Just no more than is already there.
>>>>
>>>> and adjust the originally specified timeout
>>>>> accordingly, and for the edge case
>>>>> after the non blocking read.
>>>> After an initial skim, your suggestion seems reasonable.
>>>>
>>>> -Chris.
>>>>
>>>>> regards
>>>>> Mark
>>>>>
>>>>>
>>>>>
>>>>> On 05/09/2016 12:02, Chris Hegarty wrote:
>>>>>> Vyom,
>>>>>>
>>>>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>>>>>
>>>>>>
>>>>>> There is some concern about the potential performance effect, etc,
>>>>>> of gettimeofday, maybe there is a way out of this. The reuse of
>>>>>> NET_Timeout is good, but it also calls gettimeofday. It seems that
>>>>>> a specific NET_ReadWithTimeout could be written to NOT reuse
>>>>>> NET_Timeout, but effectively inline its interruptible operation.
>>>>>> Or write a variant of NET_Timeout that takes a function to
>>>>>> execute. Rather than effectively two loops conditioned on the
>>>>>> timeout.  Disclaimer: I have not actually tried to do this, but
>>>>>> it seems worth considering / evaluating.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>> On 02/09/16 04:39, Vyom Tewari wrote:
>>>>>>> hi Dimitry,
>>>>>>>
>>>>>>> thanks for review, I did consider to use  a monotonically increasing
>>>>>>> clock like "clock_gettime" but  existing nearby
>>>>>>> code("NET_Timeout") uses
>>>>>>> "gettimeofday"  so i choose to be consistent with the existing code.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Vyom
>>>>>>>
>>>>>>>
>>>>>>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote:
>>>>>>>> Vyom,
>>>>>>>>
>>>>>>>> Did you consider to use select() to calculate timeout instead of
>>>>>>>> gettimeofday ?
>>>>>>>>
>>>>>>>> gettimeofday is affected by system time changes, so running ntpd
>>>>>>>> can
>>>>>>>> cause unpredictable behavior of this code. Also it's rather
>>>>>>>> expensive
>>>>>>>> syscall.
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>> On 2016-09-01 19:03, Vyom Tewari wrote:
>>>>>>>>> please find the updated
>>>>>>>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>).
>>>>>>>>>
>>>>>>>>> I
>>>>>>>>> incorporated the review comments.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Vyom
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote:
>>>>>>>>>> Hi
>>>>>>>>>>    perhaps there is an opportunity to do  some refactoring
>>>>>>>>>> here (...
>>>>>>>>>> for me a "goto " carries a code smell! )
>>>>>>>>>>
>>>>>>>>>> along the lines
>>>>>>>>>>
>>>>>>>>>> if (timeout) {
>>>>>>>>>>      nread =  NET_ReadWithTimeout(...);
>>>>>>>>>> } else {
>>>>>>>>>>       nread = NET_Read(...);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> the NET_ReadWithTimeout (...) function will contain a
>>>>>>>>>> restructuring of
>>>>>>>>>> your goto loop
>>>>>>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout);
>>>>>>>>>>            if (nread <= 0) {
>>>>>>>>>>                if (nread == 0) {
>>>>>>>>>>                    JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>>>>>> "SocketTimeoutException",
>>>>>>>>>>                                "Read timed out");
>>>>>>>>>>                } else if (nread == -1) {
>>>>>>>>>>                    if (errno == EBADF) {
>>>>>>>>>>                         JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>>>>>> "SocketException", "Socket closed");
>>>>>>>>>>                    } else if (errno == ENOMEM) {
>>>>>>>>>>                        JNU_ThrowOutOfMemoryError(env,
>>>>>>>>>> "NET_Timeout
>>>>>>>>>> native heap allocation failed");
>>>>>>>>>>                    } else {
>>>>>>>>>> JNU_ThrowByNameWithMessageAndLastError
>>>>>>>>>>                            (env, JNU_JAVANETPKG
>>>>>>>>>> "SocketException",
>>>>>>>>>> "select/poll failed");
>>>>>>>>>>                    }
>>>>>>>>>>                }
>>>>>>>>>>                  // release buffer in main call flow
>>>>>>>>>> //              if (bufP != BUF) {
>>>>>>>>>> //                  free(bufP);
>>>>>>>>>> //             }
>>>>>>>>>>               nread = -1;
>>>>>>>>>>               break;
>>>>>>>>>>            } else {
>>>>>>>>>> nread = NET_NonBlockingRead(fd, bufP, len);
>>>>>>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno ==
>>>>>>>>>> EWOULDBLOCK))) {
>>>>>>>>>>                gettimeofday(&t, NULL);
>>>>>>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000;
>>>>>>>>>> _timeout -= newtime - prevtime;
>>>>>>>>>> if(_timeout > 0){
>>>>>>>>>> prevtime = newtime;
>>>>>>>>>>                    }
>>>>>>>>>> } else { break; } } } return nread;
>>>>>>>>>>
>>>>>>>>>> e&oe
>>>>>>>>>>
>>>>>>>>>> regards
>>>>>>>>>> Mark
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 29/08/2016 10:58, Vyom Tewari wrote:
>>>>>>>>>>> gentle reminder, please review the below code change.
>>>>>>>>>>>
>>>>>>>>>>> Vyom
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote:
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review the code changes for below issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Bug         : https://bugs.openjdk.java.net/browse/JDK-8075484
>>>>>>>>>>>>
>>>>>>>>>>>> webrev    :
>>>>>>>>>>>> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html
>>>>>>>>>>>>
>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This issue is SocketInputStream.socketread0() hangs even with
>>>>>>>>>>>> "soTimeout" set.the implementation of
>>>>>>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes that read()
>>>>>>>>>>>> won't block after poll() reports that a read is possible.
>>>>>>>>>>>>
>>>>>>>>>>>> This assumption does not hold, as noted on the man page for
>>>>>>>>>>>> select
>>>>>>>>>>>> (referenced by the man page for poll): Under Linux, select()
>>>>>>>>>>>> may
>>>>>>>>>>>> report a socket file descriptor as "ready for reading", while
>>>>>>>>>>>> nevertheless a subsequent read blocks. This could for example
>>>>>>>>>>>> happen
>>>>>>>>>>>> when data has arrived but upon examination has wrong checksum
>>>>>>>>>>>> and is
>>>>>>>>>>>> discarded.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Vyom
>>>>>>>>>>>>
>>>>>>>>>>>>
>


More information about the net-dev mailing list