RFR 8075484:SocketInputStream.socketRead0 can hang even with soTimeout set

Langer, Christoph christoph.langer at sap.com
Wed Sep 7 15:14:20 UTC 2016


Hi Vyom,

did you have a look at my suggestions regarding AIX and refactoring? I can't find none of it in the new webrev nor did you comment on it.

Best regards
Christoph

> -----Original Message-----
> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of Vyom
> Tewari
> Sent: Mittwoch, 7. September 2016 17:10
> To: Chris Hegarty <chris.hegarty at oracle.com>
> Cc: net-dev at openjdk.java.net
> Subject: Re: RFR 8075484:SocketInputStream.socketRead0 can hang even with
> soTimeout set
> 
> Hi All,
> 
> Please find the latest
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.4/index.html
> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.4/index.html>).
> there were some test failures in JPRT and Chris also pointed the same
> issue in modified code. Now with latest code JPRT is clean.
> 
> Thanks,
> 
> Vyom
> 
> 
> On Wednesday 07 September 2016 03:18 PM, Chris Hegarty wrote:
> >
> >
> > On 07/09/16 07:45, Vyom Tewari wrote:
> >> Hi Chris,
> >>
> >> Please find the latest
> >> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.3/index.html
> >> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.3/index.html>). I
> >> did some code refactoring, JPRT is in progress.
> >
> > In terms of NET_Timeout**, I'm happy with this version, but I
> > have an additional comment.
> >
> > If NET_ReadWithTimeout returns -1 because NET_TimeoutWithCurrentTime
> > returns <= 0, then a JNI pending exception will be set. So the code
> > calling NET_ReadWithTimeout should, a) free bufP, and b) return -1,
> > immediately rather than continuing and possibly trying to set another
> > JNI pending exception.
> >
> > One option is to check the return value from NET_ReadWithTimeout and
> > also (*env)->ExceptionCheck(env). Another is to possibly consolidate
> > the setting of JNI pending exceptions in one place, towards the end
> > of Java_java_net_SocketInputStream_socketRead0, for example EBADF is
> > handled in two places.
> >
> > -Chris.
> >
> >> I need help from some one to build and test latest changes on AIX, may
> >> be Christoph can help.
> >>
> >> Thanks,
> >>
> >> Vyom
> >>
> >>
> >> On Tuesday 06 September 2016 06:25 PM, Chris Hegarty wrote:
> >>> Vyom,
> >>>
> >>>> On 6 Sep 2016, at 10:20, Vyom Tewari <vyom.tewari at oracle.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Please find the latest
> >>>>
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.2/index.html
> >>>>
> >>>>
> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.2/index.html>).
> >>>> I
> >>>> have incorporated the review comments.
> >>> Your changes completely avoid NET_Timeout, which is quite complex on
> >>> Linux, Mac, and AIX, for various reasons. ( NET_Timeout will use the
> >>> async close mechanism to signal/interrupt a thread blocked in a poll /
> >>> select ). Also, select is used on Mac, which will use poll after your
> >>> changes ( I do remember that there was a bug in poll on Mac around
> >>> the time of the original Mac port ).
> >>>
> >>> Would is be better, rather than replicating the logic in NET_Timeout,
> >>> to have a version that supports passing a function pointer, to run if
> >>> the poll/select returns before the timeout? Just an idea.
> >>>
> >>> -Chris.
> >>>
> >>>> Thanks,
> >>>>
> >>>> Vyom
> >>>>
> >>>>
> >>>> On Monday 05 September 2016 08:37 PM, Chris Hegarty wrote:
> >>>>> On 05/09/16 15:37, Mark Sheppard wrote:
> >>>>>> if the desire is to avoid making double calls on gettimeofday in the
> >>>>>> NET_ReadWithTimeout's while loop for its main call flow,
> >>>>> Yes, the desire is to make no more calls to gettimeofday, than are
> >>>>> already done.
> >>>>>
> >>>>>> then consider a modification to NET_Timeout and create a version
> >>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> >>>>>> current_time)
> >>>>>>
> >>>>>> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> >>>>>> current_time) {
> >>>>>>     int result;
> >>>>>>     struct timeval t;
> >>>>>>     long prevtime, newtime;
> >>>>>>     struct pollfd pfd;
> >>>>>>     pfd.fd = s;
> >>>>>>     pfd.events = POLLIN;
> >>>>>>
> >>>>>>     if (timeout > 0) {
> >>>>>>         prevtime = (current_time->tv_sec * 1000)  +
> >>>>>> current_time->tv_usec / 1000;
> >>>>>>     }
> >>>>>>
> >>>>>>     for(;;) {
> >>>>>>         result = poll(&pfd, 1, timeout);
> >>>>>>         if (result < 0 && errno == EINTR) {
> >>>>>>             if (timeout > 0) {
> >>>>>>                 gettimeofday(&t, NULL);
> >>>>>>                 newtime = (t.tv_sec * 1000)  + t.tv_usec /1000;
> >>>>>>                 timeout -= newtime - prevtime;
> >>>>>>                 if (timeout <= 0)
> >>>>>>                     return 0;
> >>>>>>                 prevtime = newtime;
> >>>>>>             }
> >>>>>>         } else {
> >>>>>>             return result;
> >>>>>>         }
> >>>>>>     }
> >>>>>> }
> >>>>>>
> >>>>>> the NET_ReadWithTimeout is modified with.
> >>>>>>
> >>>>>>    while (timeout > 0) {
> >>>>>>         result = NET_TimeoutWithCurrentTime(fd, timeout, &t);
> >>>>>>
> >>>>>> in any case there is the potential for multiple invocation of
> >>>>>> gettimeofday due to a need to make
> >>>>>> poll restartable,
> >>>>> Yes, and that is fine. Just no more than is already there.
> >>>>>
> >>>>> and adjust the originally specified timeout
> >>>>>> accordingly, and for the edge case
> >>>>>> after the non blocking read.
> >>>>> After an initial skim, your suggestion seems reasonable.
> >>>>>
> >>>>> -Chris.
> >>>>>
> >>>>>> regards
> >>>>>> Mark
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 05/09/2016 12:02, Chris Hegarty wrote:
> >>>>>>> Vyom,
> >>>>>>>
> >>>>>>>
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> There is some concern about the potential performance effect, etc,
> >>>>>>> of gettimeofday, maybe there is a way out of this. The reuse of
> >>>>>>> NET_Timeout is good, but it also calls gettimeofday. It seems that
> >>>>>>> a specific NET_ReadWithTimeout could be written to NOT reuse
> >>>>>>> NET_Timeout, but effectively inline its interruptible operation.
> >>>>>>> Or write a variant of NET_Timeout that takes a function to
> >>>>>>> execute. Rather than effectively two loops conditioned on the
> >>>>>>> timeout.  Disclaimer: I have not actually tried to do this, but
> >>>>>>> it seems worth considering / evaluating.
> >>>>>>>
> >>>>>>> -Chris.
> >>>>>>>
> >>>>>>> On 02/09/16 04:39, Vyom Tewari wrote:
> >>>>>>>> hi Dimitry,
> >>>>>>>>
> >>>>>>>> thanks for review, I did consider to use  a monotonically
> >>>>>>>> increasing
> >>>>>>>> clock like "clock_gettime" but  existing nearby
> >>>>>>>> code("NET_Timeout") uses
> >>>>>>>> "gettimeofday"  so i choose to be consistent with the existing
> >>>>>>>> code.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Vyom
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote:
> >>>>>>>>> Vyom,
> >>>>>>>>>
> >>>>>>>>> Did you consider to use select() to calculate timeout instead of
> >>>>>>>>> gettimeofday ?
> >>>>>>>>>
> >>>>>>>>> gettimeofday is affected by system time changes, so running ntpd
> >>>>>>>>> can
> >>>>>>>>> cause unpredictable behavior of this code. Also it's rather
> >>>>>>>>> expensive
> >>>>>>>>> syscall.
> >>>>>>>>>
> >>>>>>>>> -Dmitry
> >>>>>>>>>
> >>>>>>>>> On 2016-09-01 19:03, Vyom Tewari wrote:
> >>>>>>>>>> please find the updated
> >>>>>>>>>>
> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>).
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> I
> >>>>>>>>>> incorporated the review comments.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Vyom
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote:
> >>>>>>>>>>> Hi
> >>>>>>>>>>>    perhaps there is an opportunity to do  some refactoring
> >>>>>>>>>>> here (...
> >>>>>>>>>>> for me a "goto " carries a code smell! )
> >>>>>>>>>>>
> >>>>>>>>>>> along the lines
> >>>>>>>>>>>
> >>>>>>>>>>> if (timeout) {
> >>>>>>>>>>>      nread =  NET_ReadWithTimeout(...);
> >>>>>>>>>>> } else {
> >>>>>>>>>>>       nread = NET_Read(...);
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> the NET_ReadWithTimeout (...) function will contain a
> >>>>>>>>>>> restructuring of
> >>>>>>>>>>> your goto loop
> >>>>>>>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout);
> >>>>>>>>>>>            if (nread <= 0) {
> >>>>>>>>>>>                if (nread == 0) {
> >>>>>>>>>>>                    JNU_ThrowByName(env, JNU_JAVANETPKG
> >>>>>>>>>>> "SocketTimeoutException",
> >>>>>>>>>>>                                "Read timed out");
> >>>>>>>>>>>                } else if (nread == -1) {
> >>>>>>>>>>>                    if (errno == EBADF) {
> >>>>>>>>>>>                         JNU_ThrowByName(env, JNU_JAVANETPKG
> >>>>>>>>>>> "SocketException", "Socket closed");
> >>>>>>>>>>>                    } else if (errno == ENOMEM) {
> >>>>>>>>>>> JNU_ThrowOutOfMemoryError(env,
> >>>>>>>>>>> "NET_Timeout
> >>>>>>>>>>> native heap allocation failed");
> >>>>>>>>>>>                    } else {
> >>>>>>>>>>> JNU_ThrowByNameWithMessageAndLastError
> >>>>>>>>>>>                            (env, JNU_JAVANETPKG
> >>>>>>>>>>> "SocketException",
> >>>>>>>>>>> "select/poll failed");
> >>>>>>>>>>>                    }
> >>>>>>>>>>>                }
> >>>>>>>>>>>                  // release buffer in main call flow
> >>>>>>>>>>> //              if (bufP != BUF) {
> >>>>>>>>>>> //                  free(bufP);
> >>>>>>>>>>> //             }
> >>>>>>>>>>>               nread = -1;
> >>>>>>>>>>>               break;
> >>>>>>>>>>>            } else {
> >>>>>>>>>>> nread = NET_NonBlockingRead(fd, bufP, len);
> >>>>>>>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno ==
> >>>>>>>>>>> EWOULDBLOCK))) {
> >>>>>>>>>>>                gettimeofday(&t, NULL);
> >>>>>>>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000;
> >>>>>>>>>>> _timeout -= newtime - prevtime;
> >>>>>>>>>>> if(_timeout > 0){
> >>>>>>>>>>> prevtime = newtime;
> >>>>>>>>>>>                    }
> >>>>>>>>>>> } else { break; } } } return nread;
> >>>>>>>>>>>
> >>>>>>>>>>> e&oe
> >>>>>>>>>>>
> >>>>>>>>>>> regards
> >>>>>>>>>>> Mark
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 29/08/2016 10:58, Vyom Tewari wrote:
> >>>>>>>>>>>> gentle reminder, please review the below code change.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Vyom
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote:
> >>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Please review the code changes for below issue.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Bug         :
> >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8075484
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> webrev    :
> >>>>>>>>>>>>>
> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This issue is SocketInputStream.socketread0() hangs even with
> >>>>>>>>>>>>> "soTimeout" set.the implementation of
> >>>>>>>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes that
> >>>>>>>>>>>>> read()
> >>>>>>>>>>>>> won't block after poll() reports that a read is possible.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This assumption does not hold, as noted on the man page for
> >>>>>>>>>>>>> select
> >>>>>>>>>>>>> (referenced by the man page for poll): Under Linux, select()
> >>>>>>>>>>>>> may
> >>>>>>>>>>>>> report a socket file descriptor as "ready for reading", while
> >>>>>>>>>>>>> nevertheless a subsequent read blocks. This could for example
> >>>>>>>>>>>>> happen
> >>>>>>>>>>>>> when data has arrived but upon examination has wrong
> checksum
> >>>>>>>>>>>>> and is
> >>>>>>>>>>>>> discarded.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Vyom
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>



More information about the net-dev mailing list