RFR 8075484:SocketInputStream.socketRead0 can hang even with soTimeout set
Chris Hegarty
chris.hegarty at oracle.com
Mon Sep 5 15:07:14 UTC 2016
On 05/09/16 15:37, Mark Sheppard wrote:
>
> if the desire is to avoid making double calls on gettimeofday in the
> NET_ReadWithTimeout's while loop for its main call flow,
Yes, the desire is to make no more calls to gettimeofday, than are
already done.
> then consider a modification to NET_Timeout and create a version
> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> current_time)
>
> int NET_TimeoutWithCurrentTime(int s, long timeout, stuct timeval *
> current_time) {
> int result;
> struct timeval t;
> long prevtime, newtime;
> struct pollfd pfd;
> pfd.fd = s;
> pfd.events = POLLIN;
>
> if (timeout > 0) {
> prevtime = (current_time->tv_sec * 1000) +
> current_time->tv_usec / 1000;
> }
>
> for(;;) {
> result = poll(&pfd, 1, timeout);
> if (result < 0 && errno == EINTR) {
> if (timeout > 0) {
> gettimeofday(&t, NULL);
> newtime = (t.tv_sec * 1000) + t.tv_usec /1000;
> timeout -= newtime - prevtime;
> if (timeout <= 0)
> return 0;
> prevtime = newtime;
> }
> } else {
> return result;
> }
> }
> }
>
> the NET_ReadWithTimeout is modified with.
>
> while (timeout > 0) {
> result = NET_TimeoutWithCurrentTime(fd, timeout, &t);
>
> in any case there is the potential for multiple invocation of
> gettimeofday due to a need to make
> poll restartable,
Yes, and that is fine. Just no more than is already there.
and adjust the originally specified timeout
> accordingly, and for the edge case
> after the non blocking read.
After an initial skim, your suggestion seems reasonable.
-Chris.
> regards
> Mark
>
>
>
> On 05/09/2016 12:02, Chris Hegarty wrote:
>> Vyom,
>>
>> >>>
>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>
>> There is some concern about the potential performance effect, etc,
>> of gettimeofday, maybe there is a way out of this. The reuse of
>> NET_Timeout is good, but it also calls gettimeofday. It seems that
>> a specific NET_ReadWithTimeout could be written to NOT reuse
>> NET_Timeout, but effectively inline its interruptible operation.
>> Or write a variant of NET_Timeout that takes a function to
>> execute. Rather than effectively two loops conditioned on the
>> timeout. Disclaimer: I have not actually tried to do this, but
>> it seems worth considering / evaluating.
>>
>> -Chris.
>>
>> On 02/09/16 04:39, Vyom Tewari wrote:
>>> hi Dimitry,
>>>
>>> thanks for review, I did consider to use a monotonically increasing
>>> clock like "clock_gettime" but existing nearby code("NET_Timeout") uses
>>> "gettimeofday" so i choose to be consistent with the existing code.
>>>
>>> Thanks,
>>>
>>> Vyom
>>>
>>>
>>> On Friday 02 September 2016 01:38 AM, Dmitry Samersoff wrote:
>>>> Vyom,
>>>>
>>>> Did you consider to use select() to calculate timeout instead of
>>>> gettimeofday ?
>>>>
>>>> gettimeofday is affected by system time changes, so running ntpd can
>>>> cause unpredictable behavior of this code. Also it's rather expensive
>>>> syscall.
>>>>
>>>> -Dmitry
>>>>
>>>> On 2016-09-01 19:03, Vyom Tewari wrote:
>>>>> please find the updated
>>>>> webrev(http://cr.openjdk.java.net/~vtewari/8075484/webrev0.1/index.html
>>>>>
>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.1/index.html>).
>>>>> I
>>>>> incorporated the review comments.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Vyom
>>>>>
>>>>>
>>>>> On Tuesday 30 August 2016 04:11 PM, Mark Sheppard wrote:
>>>>>> Hi
>>>>>> perhaps there is an opportunity to do some refactoring here (...
>>>>>> for me a "goto " carries a code smell! )
>>>>>>
>>>>>> along the lines
>>>>>>
>>>>>> if (timeout) {
>>>>>> nread = NET_ReadWithTimeout(...);
>>>>>> } else {
>>>>>> nread = NET_Read(...);
>>>>>> }
>>>>>>
>>>>>>
>>>>>> the NET_ReadWithTimeout (...) function will contain a
>>>>>> restructuring of
>>>>>> your goto loop
>>>>>> while (_timeout > 0) { nread = NET_Timeout(fd, _timeout);
>>>>>> if (nread <= 0) {
>>>>>> if (nread == 0) {
>>>>>> JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>> "SocketTimeoutException",
>>>>>> "Read timed out");
>>>>>> } else if (nread == -1) {
>>>>>> if (errno == EBADF) {
>>>>>> JNU_ThrowByName(env, JNU_JAVANETPKG
>>>>>> "SocketException", "Socket closed");
>>>>>> } else if (errno == ENOMEM) {
>>>>>> JNU_ThrowOutOfMemoryError(env, "NET_Timeout
>>>>>> native heap allocation failed");
>>>>>> } else {
>>>>>> JNU_ThrowByNameWithMessageAndLastError
>>>>>> (env, JNU_JAVANETPKG "SocketException",
>>>>>> "select/poll failed");
>>>>>> }
>>>>>> }
>>>>>> // release buffer in main call flow
>>>>>> // if (bufP != BUF) {
>>>>>> // free(bufP);
>>>>>> // }
>>>>>> nread = -1;
>>>>>> break;
>>>>>> } else {
>>>>>> nread = NET_NonBlockingRead(fd, bufP, len);
>>>>>> if (nread == -1 && ((errno == EAGAIN) || (errno == EWOULDBLOCK))) {
>>>>>> gettimeofday(&t, NULL);
>>>>>> newtime = t.tv_sec * 1000 + t.tv_usec / 1000;
>>>>>> _timeout -= newtime - prevtime;
>>>>>> if(_timeout > 0){
>>>>>> prevtime = newtime;
>>>>>> }
>>>>>> } else { break; } } } return nread;
>>>>>>
>>>>>> e&oe
>>>>>>
>>>>>> regards
>>>>>> Mark
>>>>>>
>>>>>>
>>>>>> On 29/08/2016 10:58, Vyom Tewari wrote:
>>>>>>> gentle reminder, please review the below code change.
>>>>>>>
>>>>>>> Vyom
>>>>>>>
>>>>>>>
>>>>>>> On Monday 22 August 2016 05:12 PM, Vyom Tewari wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Please review the code changes for below issue.
>>>>>>>>
>>>>>>>> Bug : https://bugs.openjdk.java.net/browse/JDK-8075484
>>>>>>>>
>>>>>>>> webrev :
>>>>>>>> http://cr.openjdk.java.net/~vtewari/8075484/webrev0.0/index.html
>>>>>>>> <http://cr.openjdk.java.net/%7Evtewari/8075484/webrev0.0/index.html>
>>>>>>>>
>>>>>>>>
>>>>>>>> This issue is SocketInputStream.socketread0() hangs even with
>>>>>>>> "soTimeout" set.the implementation of
>>>>>>>> Java_java_net_SocketInputStream_socketRead0 assumes that read()
>>>>>>>> won't block after poll() reports that a read is possible.
>>>>>>>>
>>>>>>>> This assumption does not hold, as noted on the man page for select
>>>>>>>> (referenced by the man page for poll): Under Linux, select() may
>>>>>>>> report a socket file descriptor as "ready for reading", while
>>>>>>>> nevertheless a subsequent read blocks. This could for example
>>>>>>>> happen
>>>>>>>> when data has arrived but upon examination has wrong checksum
>>>>>>>> and is
>>>>>>>> discarded.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Vyom
>>>>>>>>
>>>>>>>>
>>>>
>>>
>
More information about the net-dev
mailing list