From Alan.Bateman at oracle.com Tue May 1 00:54:19 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 01 May 2012 08:54:19 +0100
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
Message-ID: <4F9F96AB.9070209@oracle.com>

I need a reviewer for a small fix to the WatchService implementation
that we use on Solaris 11.

If a file is created in a watched directory and is immediately removed
or renamed out then one should expect to receive either zero or two
events (two events = ENTRY_CREATE + ENTRY_DELETE, assuming that the
directory is watched for both events). On Solaris 11 only, it's
possible to receive an ENTRY_CREATE event without a corresponding
ENTRY_DELETE event. This arises because the watch service implementation
on this platform has to scan a directory when it gets a notification
that the directory has changed. It's possible that the file is detected
during the scan but the attempt to register it for events fails because
it is deleted. The bug in the current code is that the ENTRY_CREATE
event is queued when the file is detected but before we attempt to
register it. There are two ways to fix this: we either don't post the
ENTRY_CREATE event for this case, or we special case the ENOENT error
and queue both an ENTRY_CREATE and ENTRY_DELETE event. I decided to go
for the latter.

The webrev with the change is here:

http://cr.openjdk.java.net/~alanb/7164570/webrev/

Thanks,
Alan.

From chris.hegarty at oracle.com Tue May 1 02:35:13 2012
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Tue, 01 May 2012 10:35:13 +0100
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
In-Reply-To: <4F9F96AB.9070209@oracle.com>
References: <4F9F96AB.9070209@oracle.com>
Message-ID: <4F9FAE51.3080009@oracle.com>

Looks fine Alan, nice test!
Trivially, typo in test description:
< "Test that are CREATE and DELETE evenets are paired for very short lived files"
---
> "Test that our CREATE and DELETE events are paired for very short lived files"
*OR* "Tests that CREATE..."

-Chris.

From Alan.Bateman at oracle.com Tue May 1 03:19:21 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 01 May 2012 11:19:21 +0100
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
In-Reply-To: <4F9FAE51.3080009@oracle.com>
References: <4F9F96AB.9070209@oracle.com> <4F9FAE51.3080009@oracle.com>
Message-ID: <4F9FB8A9.7090308@oracle.com>

On 01/05/2012 10:35, Chris Hegarty wrote:
> Looks fine Alan, nice test!
>
> Trivially, typo in test description:
> < "Test that are CREATE and DELETE evenets are paired for very short
> lived files"
> ---
> > "Test that our CREATE and DELETE events are paired for very short
> lived files" *OR* "Tests that CREATE..."

Thanks Chris, I'll change the comment to:

 * @summary Test that CREATE and DELETE events are paired for very short lived files

before pushing the change.

-Alan.

From vitalyd at gmail.com Tue May 1 06:18:42 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 1 May 2012 09:18:42 -0400
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
In-Reply-To: <4F9FB8A9.7090308@oracle.com>
References: <4F9F96AB.9070209@oracle.com> <4F9FAE51.3080009@oracle.com> <4F9FB8A9.7090308@oracle.com>
Message-ID:

Hi Alan,

registerChildren() swallows IOException and DirectoryIteratorException.
Does this mean the caller may think that events got registered when in
fact they didn't? Would it be better if these exceptions were
propagated to the caller?

Thanks

Sent from my phone

From Alan.Bateman at oracle.com Tue May 1 06:50:10 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 01 May 2012 14:50:10 +0100
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
In-Reply-To:
References: <4F9F96AB.9070209@oracle.com> <4F9FAE51.3080009@oracle.com> <4F9FB8A9.7090308@oracle.com>
Message-ID: <4F9FEA12.9080107@oracle.com>

On 01/05/2012 14:18, Vitaly Davidovich wrote:
>
> Hi Alan,
>
> registerChildren() swallows IOException and
> DirectoryIteratorException. Does this mean caller may think that
> events got registered but in fact it didn't? Would it be better if
> these exceptions were propagated to the caller?
>
If the directory can't be registered then the IOException is
propagated; you need to look at implRegister to see the details.

The code in this patch is the code for scanning of the directory that
is done after the directory is initially registered or after we are
notified that the directory has changed, usually because of new files
being created. This all happens in the background poller thread that is
servicing the event port so it's not always possible to propagate
errors (the swallowing of exceptions that you see has always been
there). However, as you bring this up, we could potentially queue an
OVERFLOW event. That would at least cause the user of the API to
re-scan and synchronize its view of the directory.

-Alan.
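
The OVERFLOW idea above relies on consumers of the API re-scanning the
directory when they see that event. What follows is only a minimal
consumer-side sketch of that pattern, not code from the webrev. It
assumes the consumer keeps its own set of known file names; the class
and method names (OverflowRescan, listNames, drain) are hypothetical.
Only standard java.nio.file calls are used.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of the JDK or the patch).
final class OverflowRescan {

    /** Returns the file names currently present in dir. */
    static Set<Path> listNames(Path dir) throws IOException {
        Set<Path> names = new HashSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName());
            }
        }
        return names;
    }

    /**
     * Applies the pending events of key to known, the caller's view of dir.
     * If any event is OVERFLOW, the incremental updates cannot be trusted,
     * so the view is rebuilt from a fresh directory listing.
     * Returns true if such a re-scan happened.
     */
    static boolean drain(WatchKey key, Path dir, Set<Path> known) throws IOException {
        boolean overflow = false;
        for (WatchEvent<?> event : key.pollEvents()) {
            WatchEvent.Kind<?> kind = event.kind();
            if (kind == StandardWatchEventKinds.OVERFLOW) {
                overflow = true;
            } else if (kind == StandardWatchEventKinds.ENTRY_CREATE) {
                known.add((Path) event.context());
            } else if (kind == StandardWatchEventKinds.ENTRY_DELETE) {
                known.remove((Path) event.context());
            }
        }
        if (overflow) {
            known.clear();
            known.addAll(listNames(dir));
        }
        key.reset(); // re-arm the key so further events can be queued
        return overflow;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("overflow-demo");
        Files.createFile(dir.resolve("a.txt"));
        System.out.println(listNames(dir));
    }
}
```

The re-scan is deferred until all pending events have been read so
that events queued before the overflow are not applied on top of the
fresh listing.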
From vitalyd at gmail.com Tue May 1 06:54:43 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 1 May 2012 09:54:43 -0400
Subject: 7164570: (fs) WatchService queues CREATE event but not DELETE event for very short lived files [sol11]
In-Reply-To: <4F9FEA12.9080107@oracle.com>
References: <4F9F96AB.9070209@oracle.com> <4F9FAE51.3080009@oracle.com> <4F9FB8A9.7090308@oracle.com> <4F9FEA12.9080107@oracle.com>
Message-ID:

Understood, I only glanced at the diff and saw the swallowing so clearly
didn't have a holistic view. Your proposal of OVERFLOW to notify the
client to re-sync sounds good.

Thanks

Sent from my phone

From zhouyx at linux.vnet.ibm.com Fri May 4 00:35:50 2012
From: zhouyx at linux.vnet.ibm.com (Sean Chou)
Date: Fri, 4 May 2012 15:35:50 +0800
Subject: Request for review: 7166048: remove the embedded epoll data structure
Message-ID:

Hi all,

I found that src/solaris/native/sun/nio/ch/EPollArrayWrapper.c embeds
its own copies of the epoll data structures epoll_data and epoll_event.
They duplicate the definitions in sys/epoll.h. This redundancy would
cause failures if the underlying system has a different epoll data
structure (e.g. different alignment).

I reported 7166048 for it and made a patch:
http://cr.openjdk.java.net/~zhouyx/7166048/webrev.00/ .

Please take a look.

--
Best Regards,
Sean Chou

From Alan.Bateman at oracle.com Fri May 4 00:59:09 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 04 May 2012 08:59:09 +0100
Subject: Request for review: 7166048: remove the embedded epoll data structure
In-Reply-To:
References:
Message-ID: <4FA38C4D.8060209@oracle.com>

On 04/05/2012 08:35, Sean Chou wrote:
> Hi all,
>
> I found the src/solaris/native/sun/nio/ch/EPollArrayWrapper.c
> embedded the epoll data structure epoll_data and epoll_event . It is
> duplicated with the definition in sys/epoll.h . This redundancy would
> cause failure if the underlying system has different epoll data
> structure (eg. different alignment) .
>
> I reported 7166048 for it and made a patch:
> http://cr.openjdk.java.net/~zhouyx/7166048/webrev.00/ .
>
As background, when the epoll Selector was added, the JDK had to be
built on distributions that were still 2.4 based and didn't have
epoll.h. We had meant to go back and clear this up in JDK7 but didn't
get around to it. So thanks for taking this on now.
I looked at your patch but it doesn't seem to be complete as it doesn't
change the usages to call the epoll functions directly. Attached is the
patch that I had for cleaning this up; it should still be current
because this code has not changed. The only thing I notice is that it
doesn't remove the include of dlfcn.h, which shouldn't be needed now.

On the RFE that you submitted, I've moved this to the right place as:

7166048: (se) EPollArrayWrapper.c no longer needs to define epoll data structures

-Alan.

diff --git a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c
--- a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c
+++ b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c
@@ -34,55 +34,13 @@
 #include
 #include
 #include
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/* epoll_wait(2) man page */
-
-typedef union epoll_data {
-    void *ptr;
-    int fd;
-    __uint32_t u32;
-    __uint64_t u64;
-} epoll_data_t;
-
-
-/* x86-64 has same alignment as 32-bit */
-#ifdef __x86_64__
-#define EPOLL_PACKED __attribute__((packed))
-#else
-#define EPOLL_PACKED
-#endif
-
-struct epoll_event {
-    __uint32_t events; /* Epoll events */
-    epoll_data_t data; /* User data variable */
-} EPOLL_PACKED;
-
-#ifdef __cplusplus
-}
-#endif
+#include

 #define RESTARTABLE(_cmd, _result) do { \
   do { \
     _result = _cmd; \
   } while((_result == -1) && (errno == EINTR)); \
 } while(0)
-
-/*
- * epoll event notification is new in 2.6 kernel. As the offical build
- * platform for the JDK is on a 2.4-based distribution then we must
- * obtain the addresses of the epoll functions dynamically.
- */
-typedef int (*epoll_create_t)(int size);
-typedef int (*epoll_ctl_t) (int epfd, int op, int fd, struct epoll_event *event);
-typedef int (*epoll_wait_t) (int epfd, struct epoll_event *events, int maxevents, int timeout);
-
-static epoll_create_t epoll_create_func;
-static epoll_ctl_t epoll_ctl_func;
-static epoll_wait_t epoll_wait_func;

 static int
 iepoll(int epfd, struct epoll_event *events, int numfds, jlong timeout)
@@ -96,7 +54,7 @@ iepoll(int epfd, struct epoll_event *eve
     start = t.tv_sec * 1000 + t.tv_usec / 1000;

     for (;;) {
-        int res = (*epoll_wait_func)(epfd, events, numfds, timeout);
+        int res = epoll_wait(epfd, events, numfds, timeout);
         if (res < 0 && errno == EINTR) {
             if (remaining >= 0) {
                 gettimeofday(&t, NULL);
@@ -117,14 +75,6 @@ JNIEXPORT void JNICALL
 JNIEXPORT void JNICALL
 Java_sun_nio_ch_EPollArrayWrapper_init(JNIEnv *env, jclass this)
 {
-    epoll_create_func = (epoll_create_t) dlsym(RTLD_DEFAULT, "epoll_create");
-    epoll_ctl_func = (epoll_ctl_t) dlsym(RTLD_DEFAULT, "epoll_ctl");
-    epoll_wait_func = (epoll_wait_t) dlsym(RTLD_DEFAULT, "epoll_wait");
-
-    if ((epoll_create_func == NULL) || (epoll_ctl_func == NULL) ||
-        (epoll_wait_func == NULL)) {
-        JNU_ThrowInternalError(env, "unable to get address of epoll functions, pre-2.6 kernel?");
-    }
 }

 JNIEXPORT jint JNICALL
@@ -134,7 +84,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC
      * epoll_create expects a size as a hint to the kernel about how to
      * dimension internal structures. We can't predict the size in advance.
      */
-    int epfd = (*epoll_create_func)(256);
+    int epfd = epoll_create(256);
     if (epfd < 0) {
        JNU_ThrowIOExceptionWithLastError(env, "epoll_create failed");
     }
@@ -173,7 +123,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC
     event.events = events;
     event.data.fd = fd;

-    RESTARTABLE((*epoll_ctl_func)(epfd, (int)opcode, (int)fd, &event), res);
+    RESTARTABLE(epoll_ctl(epfd, (int)opcode, (int)fd, &event), res);

     /*
      * A channel may be registered with several Selectors. When each Selector
@@ -199,7 +149,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollW
     int res;

     if (timeout <= 0) { /* Indefinite or no wait */
-        RESTARTABLE((*epoll_wait_func)(epfd, events, numfds, timeout), res);
+        RESTARTABLE(epoll_wait(epfd, events, numfds, timeout), res);
     } else { /* Bounded wait; bounded restarts */
         res = iepoll(epfd, events, numfds, timeout);
     }

From zhouyx at linux.vnet.ibm.com Fri May 4 01:27:22 2012
From: zhouyx at linux.vnet.ibm.com (Sean Chou)
Date: Fri, 4 May 2012 16:27:22 +0800
Subject: Request for review: 7166048: remove the embedded epoll data structure
In-Reply-To: <4FA38C4D.8060209@oracle.com>
References: <4FA38C4D.8060209@oracle.com>
Message-ID:

Hi Alan,

Thank you, I'll update the patch.

--
Best Regards,
Sean Chou

From kurchi.subhra.hazra at oracle.com Fri May 4 11:46:43 2012
From: kurchi.subhra.hazra at oracle.com (Kurchi Hazra)
Date: Fri, 04 May 2012 11:46:43 -0700
Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking
Message-ID: <4FA42413.1080109@oracle.com>

Hi,

We were setting localAddress when establishing the connection but for
the non-blocking case, it is possible that the socket is not yet bound.
We therefore change this behavior and retrieve the localAddress in
SocketChannelImpl.getLocalAddress() when the socket is bound and
localAddress is null or isAnyLocalAddress.

However, I also had to introduce an additional boolean field to keep
track of when the socket is bound. This is to circumvent the problem
that we don't get an error (similar to WSAEINVAL on windows) on
solaris/linux/mac when calling getsockopt on an unbound socket.
Consequently, localAddress and localPort are being set to 0.0.0.0 and 0
on these platforms, instead of the required null and -1.

Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7096436
Webrev: http://cr.openjdk.java.net/~khazra/7096436/webrev.00/

Thanks,
Kurchi

From Alan.Bateman at oracle.com Sat May 5 04:42:39 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sat, 05 May 2012 12:42:39 +0100
Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking
In-Reply-To: <4FA42413.1080109@oracle.com>
References: <4FA42413.1080109@oracle.com>
Message-ID: <4FA5122F.8010605@oracle.com>

On 04/05/2012 19:46, Kurchi Hazra wrote:
> Hi,
>
> We were setting localAddress when establishing the connection but for
> the non-blocking case, it is possible that the socket is not yet bound.
> We therefore change this behavior and retrieve the localAddress in
> SocketChannelImpl.getLocalAddress() when the socket is bound and
> localAddress is null or isAnyLocalAddress.
>
> However, I also had to introduce an additional boolean field to keep
> track of when the socket is bound. This is to circumvent the problem
> that we don't get an error (similar to WSAEINVAL on windows) on
> solaris/linux/mac when calling getsockopt on an unbound socket.
> Consequently, localAddress and localPort are being set to 0.0.0.0
> and 0 on these platforms, instead of the required null and -1.
>
> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7096436
> Webrev: http://cr.openjdk.java.net/~khazra/7096436/webrev.00/

Thanks for taking this one. The assumption has always been that the
connect would cause the socket to be bound immediately, if not already
bound. I agree with removing the code at L632-640 as that's where the
assumption is that the socket is implicitly bound. However if we assign
localAddress in the two places where we set the state to ST_CONNECTED
(connect and finishConnect) then it should solve the issue.
So unless I'm missing something (could very well be) then the "bound"
field shouldn't be needed. I also have a concern about adding this
field because "state" and "localAddress" should already cover all
states.

An alternative to not setting the localAddress after the connection is
established is to change getLocalAddress to:

    if (localAddress == null && (state == ST_CONNECTED))
        localAddress = Net.localAddress(fd);

The downside to that is that on some platforms this can fail for the
case that the peer has closed the connection.

-Alan.

From zhouyx at linux.vnet.ibm.com Mon May 7 00:36:39 2012
From: zhouyx at linux.vnet.ibm.com (Sean Chou)
Date: Mon, 7 May 2012 15:36:39 +0800
Subject: Request for review: 7166048: remove the embedded epoll data structure
In-Reply-To: <4FA38C4D.8060209@oracle.com>
References: <4FA38C4D.8060209@oracle.com>
Message-ID:

Hi Alan,

The new patch is at:
http://cr.openjdk.java.net/~zhouyx/7166048/webrev.03/ . There are some
whitespace differences between the old and new files. Please take a
look again.

--
Best Regards,
Sean Chou

From Alan.Bateman at oracle.com Mon May 7 01:18:12 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 07 May 2012 09:18:12 +0100
Subject: Request for review: 7166048: remove the embedded epoll data structure
In-Reply-To:
References: <4FA38C4D.8060209@oracle.com>
Message-ID: <4FA78544.6040501@oracle.com>

On 07/05/2012 08:36, Sean Chou wrote:
> Hi Alan,
>
> The new patch is at:
> http://cr.openjdk.java.net/~zhouyx/7166048/webrev.03/ . There are
> some spaces difference between the old and new file. Please take a
> look again.
>
This looks good to me, and thanks for removing the include of dlfcn.h as
I missed that in the patch that I had for this. So will Charles push
this for you or do you need me to do it?

-Alan

From littlee at linux.vnet.ibm.com Mon May 7 01:48:31 2012
From: littlee at linux.vnet.ibm.com (Charles Lee)
Date: Mon, 07 May 2012 16:48:31 +0800
Subject: Request for review: 7166048: remove the embedded epoll data structure
In-Reply-To:
References: <4FA38C4D.8060209@oracle.com>
Message-ID: <4FA78C5F.40907@linux.vnet.ibm.com>

Hi Sean,

The patch is committed @

Changeset: 62557a1336c0
Author: zhouyx
Date: 2012-05-07 16:43 +0800
URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/62557a1336c0

7166048: Remove the embeded epoll data structure.
Reviewed-by: alanb

Please verify it. And thanks to Alan for reviewing it.

On 05/07/2012 03:36 PM, Sean Chou wrote:
> Hi Alan,
>
> The new patch is at:
> http://cr.openjdk.java.net/~zhouyx/7166048/webrev.03/ . There are
> some spaces difference between the old and new file. Please take a
> look again.
>
> On Fri, May 4, 2012 at 3:59 PM, Alan Bateman wrote:
>
> On 04/05/2012 08:35, Sean Chou wrote:
>
> Hi all,
>
> I found the
> src/solaris/native/sun/nio/ch/EPollArrayWrapper.c embedded the
> epoll data structure epoll_data and epoll_event . It is
> duplicated with the definition in sys/epoll.h . This
> redundancy would cause failure if the underlying system has
> different epoll data structure (eg. different alignment) .
>
> I reported 7166048 for it and made a patch:
> http://cr.openjdk.java.net/~zhouyx/7166048/webrev.00/ .
>
>
> As background, when the epoll Selector was added then the JDK had
> to be built on distributions that were still 2.4 based and didn't
> have epoll.h. We had meant to go back and clear this up in JDK7
> but didn't get around to it. So thanks for taking this on now. I
> looked at your patch but it doesn't seem to be complete as it
> doesn't change the usages to call the epoll functions directly.
> Attached is the patch that I had for cleaning this up, it should
> still be current because this code has not changed.
The only thing > I notice is that it doesn't remove the include of dlfcn.h, that > shouldn't be needed now. > > On the RFE that you submitted, I've moved this to the right place as: > > 7166048: (se) EPollArrayWrapper.c no longer needs to define epoll > data structures > > -Alan. > > > > diff --git a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c > b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c > --- a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c > +++ b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c > @@ -34,55 +34,13 @@ > #include > #include > #include > - > -#ifdef __cplusplus > -extern "C" { > -#endif > - > -/* epoll_wait(2) man page */ > - > -typedef union epoll_data { > - void *ptr; > - int fd; > - __uint32_t u32; > - __uint64_t u64; > -} epoll_data_t; > - > - > -/* x86-64 has same alignment as 32-bit */ > -#ifdef __x86_64__ > -#define EPOLL_PACKED __attribute__((packed)) > -#else > -#define EPOLL_PACKED > -#endif > - > -struct epoll_event { > - __uint32_t events; /* Epoll events */ > - epoll_data_t data; /* User data variable */ > -} EPOLL_PACKED; > - > -#ifdef __cplusplus > -} > -#endif > +#include > > #define RESTARTABLE(_cmd, _result) do { \ > do { \ > _result = _cmd; \ > } while((_result == -1) && (errno == EINTR)); \ > } while(0) > - > -/* > - * epoll event notification is new in 2.6 kernel. As the offical > build > - * platform for the JDK is on a 2.4-based distribution then we must > - * obtain the addresses of the epoll functions dynamically. 
> - */ > -typedef int (*epoll_create_t)(int size); > -typedef int (*epoll_ctl_t) (int epfd, int op, int fd, struct > epoll_event *event); > -typedef int (*epoll_wait_t) (int epfd, struct epoll_event > *events, int maxevents, int timeout); > - > -static epoll_create_t epoll_create_func; > -static epoll_ctl_t epoll_ctl_func; > -static epoll_wait_t epoll_wait_func; > > static int > iepoll(int epfd, struct epoll_event *events, int numfds, jlong > timeout) > @@ -96,7 +54,7 @@ iepoll(int epfd, struct epoll_event *eve > start = t.tv_sec * 1000 + t.tv_usec / 1000; > > for (;;) { > - int res = (*epoll_wait_func)(epfd, events, numfds, timeout); > + int res = epoll_wait(epfd, events, numfds, timeout); > if (res < 0 && errno == EINTR) { > if (remaining >= 0) { > gettimeofday(&t, NULL); > @@ -117,14 +75,6 @@ JNIEXPORT void JNICALL > JNIEXPORT void JNICALL > Java_sun_nio_ch_EPollArrayWrapper_init(JNIEnv *env, jclass this) > { > - epoll_create_func = (epoll_create_t) dlsym(RTLD_DEFAULT, > "epoll_create"); > - epoll_ctl_func = (epoll_ctl_t) dlsym(RTLD_DEFAULT, > "epoll_ctl"); > - epoll_wait_func = (epoll_wait_t) dlsym(RTLD_DEFAULT, > "epoll_wait"); > - > - if ((epoll_create_func == NULL) || (epoll_ctl_func == NULL) || > - (epoll_wait_func == NULL)) { > - JNU_ThrowInternalError(env, "unable to get address of > epoll functions, pre-2.6 kernel?"); > - } > } > > JNIEXPORT jint JNICALL > @@ -134,7 +84,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC > * epoll_create expects a size as a hint to the kernel about > how to > * dimension internal structures. We can't predict the size in > advance. 
> */ > - int epfd = (*epoll_create_func)(256); > + int epfd = epoll_create(256); > if (epfd < 0) { > JNU_ThrowIOExceptionWithLastError(env, "epoll_create failed"); > } > @@ -173,7 +123,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC > event.events = events; > event.data.fd = fd; > > - RESTARTABLE((*epoll_ctl_func)(epfd, (int)opcode, (int)fd, > &event), res); > + RESTARTABLE(epoll_ctl(epfd, (int)opcode, (int)fd, &event), res); > > /* > * A channel may be registered with several Selectors. When > each Selector > @@ -199,7 +149,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollW > int res; > > if (timeout <= 0) { /* Indefinite or no wait */ > - RESTARTABLE((*epoll_wait_func)(epfd, events, numfds, > timeout), res); > + RESTARTABLE(epoll_wait(epfd, events, numfds, timeout), res); > } else { /* Bounded wait; bounded restarts */ > res = iepoll(epfd, events, numfds, timeout); > } > > > > > -- > Best Regards, > Sean Chou > -- Yours Charles -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/c9af9092/attachment-0001.html From zhouyx at linux.vnet.ibm.com Mon May 7 02:07:21 2012 From: zhouyx at linux.vnet.ibm.com (Sean Chou) Date: Mon, 7 May 2012 17:07:21 +0800 Subject: Request for review: 7166048: remove the embedded epoll data structure In-Reply-To: <4FA78C5F.40907@linux.vnet.ibm.com> References: <4FA38C4D.8060209@oracle.com> <4FA78C5F.40907@linux.vnet.ibm.com> Message-ID: Hi Alan and Charles, Many thanks. Patch confirmed. On Mon, May 7, 2012 at 4:48 PM, Charles Lee wrote: > Hi Sean, > > The patch is committed @ > > Changeset: 62557a1336c0 > Author: zhouyx > Date: 2012-05-07 16:43 +0800 > URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/62557a1336c0 > > 7166048: Remove the embeded epoll data structure. > Reviewed-by: alanb > > Please verify it. And thanks Alan to review it. 
> > > On 05/07/2012 03:36 PM, Sean Chou wrote: > > Hi Alan, > > The new patch is at: > http://cr.openjdk.java.net/~zhouyx/7166048/webrev.03/ . There are some > spaces difference between the old and new file. Please take a look again. > > On Fri, May 4, 2012 at 3:59 PM, Alan Bateman wrote: > >> On 04/05/2012 08:35, Sean Chou wrote: >> >>> Hi all, >>> >>> I found the src/solaris/native/sun/nio/ch/EPollArrayWrapper.c >>> embedded the epoll data structure epoll_data and epoll_event . It is >>> duplicated with the definition in sys/epoll.h . This redundancy would cause >>> failure if the underlying system has different epoll data structure (eg. >>> different alignment) . >>> >>> I reported 7166048 for it and made a patch: >>> http://cr.openjdk.java.net/~zhouyx/7166048/webrev.00/ < >>> http://cr.openjdk.java.net/%7Ezhouyx/7166048/webrev.00/> . >>> >>> >>> As background, when the epoll Selector was added then the JDK had to be >> build on distributions that were still 2.4 based and didn't have epoll.h. >> We had meant to go back and clear this up in JDK7 but didn't get around to >> it. So thanks for taking this on now. I looked at your patch but it doesn't >> seem to be complete as it doesn't change the usages to call the epoll >> functions directly. Attached is the patch that I had for cleaning this up, >> it should still be current because this code has not changed. The only >> thing I notice is that it doesn't remove the include of dlfcn.h, that >> shouldn't be needed now. >> >> On the RFE that you submitted, I've moved this to the right place as: >> >> 7166048: (se) EPollArrayWrapper.c no longer needs to define epoll data >> structures >> >> -Alan. 
>> >> >> >> diff --git a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c >> b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c >> --- a/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c >> +++ b/src/solaris/native/sun/nio/ch/EPollArrayWrapper.c >> @@ -34,55 +34,13 @@ >> #include >> #include >> #include >> - >> -#ifdef __cplusplus >> -extern "C" { >> -#endif >> - >> -/* epoll_wait(2) man page */ >> - >> -typedef union epoll_data { >> - void *ptr; >> - int fd; >> - __uint32_t u32; >> - __uint64_t u64; >> -} epoll_data_t; >> - >> - >> -/* x86-64 has same alignment as 32-bit */ >> -#ifdef __x86_64__ >> -#define EPOLL_PACKED __attribute__((packed)) >> -#else >> -#define EPOLL_PACKED >> -#endif >> - >> -struct epoll_event { >> - __uint32_t events; /* Epoll events */ >> - epoll_data_t data; /* User data variable */ >> -} EPOLL_PACKED; >> - >> -#ifdef __cplusplus >> -} >> -#endif >> +#include >> >> #define RESTARTABLE(_cmd, _result) do { \ >> do { \ >> _result = _cmd; \ >> } while((_result == -1) && (errno == EINTR)); \ >> } while(0) >> - >> -/* >> - * epoll event notification is new in 2.6 kernel. As the offical build >> - * platform for the JDK is on a 2.4-based distribution then we must >> - * obtain the addresses of the epoll functions dynamically. 
>> - */ >> -typedef int (*epoll_create_t)(int size); >> -typedef int (*epoll_ctl_t) (int epfd, int op, int fd, struct >> epoll_event *event); >> -typedef int (*epoll_wait_t) (int epfd, struct epoll_event *events, int >> maxevents, int timeout); >> - >> -static epoll_create_t epoll_create_func; >> -static epoll_ctl_t epoll_ctl_func; >> -static epoll_wait_t epoll_wait_func; >> >> static int >> iepoll(int epfd, struct epoll_event *events, int numfds, jlong timeout) >> @@ -96,7 +54,7 @@ iepoll(int epfd, struct epoll_event *eve >> start = t.tv_sec * 1000 + t.tv_usec / 1000; >> >> for (;;) { >> - int res = (*epoll_wait_func)(epfd, events, numfds, timeout); >> + int res = epoll_wait(epfd, events, numfds, timeout); >> if (res < 0 && errno == EINTR) { >> if (remaining >= 0) { >> gettimeofday(&t, NULL); >> @@ -117,14 +75,6 @@ JNIEXPORT void JNICALL >> JNIEXPORT void JNICALL >> Java_sun_nio_ch_EPollArrayWrapper_init(JNIEnv *env, jclass this) >> { >> - epoll_create_func = (epoll_create_t) dlsym(RTLD_DEFAULT, >> "epoll_create"); >> - epoll_ctl_func = (epoll_ctl_t) dlsym(RTLD_DEFAULT, >> "epoll_ctl"); >> - epoll_wait_func = (epoll_wait_t) dlsym(RTLD_DEFAULT, >> "epoll_wait"); >> - >> - if ((epoll_create_func == NULL) || (epoll_ctl_func == NULL) || >> - (epoll_wait_func == NULL)) { >> - JNU_ThrowInternalError(env, "unable to get address of epoll >> functions, pre-2.6 kernel?"); >> - } >> } >> >> JNIEXPORT jint JNICALL >> @@ -134,7 +84,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC >> * epoll_create expects a size as a hint to the kernel about how to >> * dimension internal structures. We can't predict the size in >> advance. 
>> */ >> - int epfd = (*epoll_create_func)(256); >> + int epfd = epoll_create(256); >> if (epfd < 0) { >> JNU_ThrowIOExceptionWithLastError(env, "epoll_create failed"); >> } >> @@ -173,7 +123,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollC >> event.events = events; >> event.data.fd = fd; >> >> - RESTARTABLE((*epoll_ctl_func)(epfd, (int)opcode, (int)fd, &event), >> res); >> + RESTARTABLE(epoll_ctl(epfd, (int)opcode, (int)fd, &event), res); >> >> /* >> * A channel may be registered with several Selectors. When each >> Selector >> @@ -199,7 +149,7 @@ Java_sun_nio_ch_EPollArrayWrapper_epollW >> int res; >> >> if (timeout <= 0) { /* Indefinite or no wait */ >> - RESTARTABLE((*epoll_wait_func)(epfd, events, numfds, timeout), >> res); >> + RESTARTABLE(epoll_wait(epfd, events, numfds, timeout), res); >> } else { /* Bounded wait; bounded restarts */ >> res = iepoll(epfd, events, numfds, timeout); >> } >> >> > > > -- > Best Regards, > Sean Chou > > > > -- > Yours Charles > > -- Best Regards, Sean Chou -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/55517459/attachment.html From kurchi.subhra.hazra at oracle.com Mon May 7 09:02:35 2012 From: kurchi.subhra.hazra at oracle.com (Kurchi Subhra Hazra) Date: Mon, 07 May 2012 09:02:35 -0700 Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking In-Reply-To: <4FA5122F.8010605@oracle.com> References: <4FA42413.1080109@oracle.com> <4FA5122F.8010605@oracle.com> Message-ID: <4FA7F21B.9010505@oracle.com> Hi Alan, >> >> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7096436 >> Webrev: http://cr.openjdk.java.net/~khazra/7096436/webrev.00/ > Thanks for taking this one, the assumption has always been that the > connect would cause the socket to be bound immediately, if not already > bound. 
> > I agree with removing the code at L632-640 as that's where the > assumption is that the socket is implicitly bound. However if we > assign localAddress in the two places where we set the state to > ST_CONNECTED (connect and finishConnect) then it should solve the > issue. So unless I'm missing something (could very well b) then the > "bound" field shouldn't be needed. I also have concern about adding > this field because "state" and "localAddress" should already cover all > states. An alternative to not setting the localAddress after the > connection is establish is to change getLocalAddress to: In this case what happens if someone calls: sc.socket.bind(address); sc.getLocalAddress(); I am guessing that state will still be ST_UNCONNECTED, and getLocalAddress will return null. Or am I missing something? - Kurchi > > if (localAddress == null && (state == ST_CONNECTED)) > localAddress = Net.localAddress(fd); > > The downside to that is that on some platforms this can fail for the > case that the peer has closed the connection. > > -Alan. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/83398fb1/attachment.html From Alan.Bateman at oracle.com Mon May 7 10:11:56 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 07 May 2012 18:11:56 +0100 Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking In-Reply-To: <4FA7F21B.9010505@oracle.com> References: <4FA42413.1080109@oracle.com> <4FA5122F.8010605@oracle.com> <4FA7F21B.9010505@oracle.com> Message-ID: <4FA8025C.2070701@oracle.com> On 07/05/2012 17:02, Kurchi Subhra Hazra wrote: > > In this case what happens if someone calls: > sc.socket.bind(address); > sc.getLocalAddress(); > > I am guessing that state will still be ST_UNCONNECTED, and > getLocalAddress will return null. Or am I missing something? 
The bind method will set localAddress after binding the socket and shouldn't need to change. It's possible that, after the connection is established, the local address will be more specific than the address that we explicitly bound to, in which case the logic to set localAddress when changing the state to ST_CONNECTED will handle this. -Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/39628fcd/attachment-0001.html From kurchi.subhra.hazra at oracle.com Mon May 7 12:25:52 2012 From: kurchi.subhra.hazra at oracle.com (Kurchi Hazra) Date: Mon, 07 May 2012 12:25:52 -0700 Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking In-Reply-To: <4FA8025C.2070701@oracle.com> References: <4FA42413.1080109@oracle.com> <4FA5122F.8010605@oracle.com> <4FA7F21B.9010505@oracle.com> <4FA8025C.2070701@oracle.com> Message-ID: <4FA821C0.4050809@oracle.com> I see. Updated webrev: http://cr.openjdk.java.net/~khazra/7096436/webrev.01/ Thanks, Kurchi On 5/7/2012 10:11 AM, Alan Bateman wrote: > On 07/05/2012 17:02, Kurchi Subhra Hazra wrote: >> >> In this case what happens if someone calls: >> sc.socket.bind(address); >> sc.getLocalAddress(); >> >> I am guessing that state will still be ST_UNCONNECTED, and >> getLocalAddress will return null. Or am I missing something? > The bind method will set localAddress after binding the socket and > shouldn't need to change. It's possible that, after the connection is > established, the local address will be more specific than the > address that we explicitly bound to, in which case the logic to set > localAddress when changing the state to ST_CONNECTED will handle this. > > -Alan -- -Kurchi -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/840dcceb/attachment.html From kurchi.subhra.hazra at oracle.com Mon May 7 22:39:19 2012 From: kurchi.subhra.hazra at oracle.com (Kurchi Subhra Hazra) Date: Mon, 07 May 2012 22:39:19 -0700 Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking In-Reply-To: <4FA821C0.4050809@oracle.com> References: <4FA42413.1080109@oracle.com> <4FA5122F.8010605@oracle.com> <4FA7F21B.9010505@oracle.com> <4FA8025C.2070701@oracle.com> <4FA821C0.4050809@oracle.com> Message-ID: <4FA8B187.80901@oracle.com> Looking back now and as rightly pointed out by you, all that we need to change to fix this issue is: http://cr.openjdk.java.net/~khazra/7096436/webrev.02/ Thanks, Kurchi On 5/7/12 12:25 PM, Kurchi Hazra wrote: > I see. Updated webrev: > > http://cr.openjdk.java.net/~khazra/7096436/webrev.01/ > > > > Thanks, > Kurchi > > On 5/7/2012 10:11 AM, Alan Bateman wrote: >> On 07/05/2012 17:02, Kurchi Subhra Hazra wrote: >>> >>> In this case what happens if someone calls: >>> sc.socket.bind(address); >>> sc.getLocalAddress(); >>> >>> I am guessing that state will still be ST_UNCONNECTED, and >>> getLocalAddress will return null. Or am I missing something? >> The bind method will set localAddress after binding the socket and >> shouldn't need to change. It's possible that after the connection is >> established that the local address will be some specific that the >> address that we explicitly bound to, in which case the logic to set >> localAddress when changing the state to ST_CONNECTED will handle this. >> >> -Alan > > -- > -Kurchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120507/75c0e826/attachment.html From Alan.Bateman at oracle.com Tue May 8 01:05:53 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 08 May 2012 09:05:53 +0100 Subject: Code Review Request: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking In-Reply-To: <4FA8B187.80901@oracle.com> References: <4FA42413.1080109@oracle.com> <4FA5122F.8010605@oracle.com> <4FA7F21B.9010505@oracle.com> <4FA8025C.2070701@oracle.com> <4FA821C0.4050809@oracle.com> <4FA8B187.80901@oracle.com> Message-ID: <4FA8D3E1.5@oracle.com> On 08/05/2012 06:39, Kurchi Subhra Hazra wrote: > Looking back now and as rightly pointed out by you, all that we need > to change to fix this issue is: > http://cr.openjdk.java.net/~khazra/7096436/webrev.02/ This looks right to me. -Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120508/633080f0/attachment.html From youdwei at linux.vnet.ibm.com Wed May 9 00:33:29 2012 From: youdwei at linux.vnet.ibm.com (Deven You) Date: Wed, 09 May 2012 15:33:29 +0800 Subject: Using OP_CONNECT with Selector.select causes selector to fire repeatedly In-Reply-To: <4FAA1C40.6040709@linux.vnet.ibm.com> References: <4FAA1C40.6040709@linux.vnet.ibm.com> Message-ID: <4FAA1DC9.4030605@linux.vnet.ibm.com> I suddenly realized this topic should be in nio-dev mailing list. Please ignore previous mail. Thanks a lot! On 05/09/2012 03:26 PM, Deven You wrote: > Hi All, > > When start a simple server, listening on a port like 8765, which just > accepts connections. We then register a non-blocking SocketChannel > (the client) with a selector with interest in OP_CONNECT, so that we > can use the selector to notify us when the channel is ready to finish > connecting. > > We call client.connect and then selector.select in a loop. 
The > selector fires and with the client channel in the selected-keys set > and we call finishConnect() on the client's channel. > > Then the problem occurs: > The selector repeatedly fires with no entries in its selected-keys > set, whereas it should block in the next select operation until there > is at least one key in the selected-keys set. > > There is already a sun bug for this issue[1], when I looked into this > sun bug I realize the second scenario of this sun bug is described > very detailed by a duplicate sun bug[2]. > > One way to solve this problem is let selector reset the OP_CONNECT > bit as 0 after the channel is connected for the corresponding key > using key.interestOps(). I just make a patch[3] for this approach. > > Could anyone take a look at this patch to see if we could solve this > problem in this way! > > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4919127 > > [2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4960791 > > [3] http://cr.openjdk.java.net/~littlee/ojdk-317/webrev.00/ > > > Thanks a lot! > -- > Best Regards, > > Deven -- Best Regards, Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120509/c128de9f/attachment.html From Alan.Bateman at oracle.com Wed May 9 01:19:28 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 09 May 2012 09:19:28 +0100 Subject: Using OP_CONNECT with Selector.select causes selector to fire repeatedly In-Reply-To: <4FAA1DC9.4030605@linux.vnet.ibm.com> References: <4FAA1C40.6040709@linux.vnet.ibm.com> <4FAA1DC9.4030605@linux.vnet.ibm.com> Message-ID: <4FAA2890.9030501@oracle.com> Deven, I'm busy at the moment and don't have time to write a detailed reply. However I think this one is just incorrect usage of the API that should be fixed by adding a warning or some clarification to the javadoc, not an implementation change. 
For example, in your test the return value from connect is ignored and therefore the test does not handle the case where the connection is established immediately. Even if the connection is not established immediately, the test code that invokes finishConnect doesn't change interestOps, say to OP_READ for the case that it expects to read from the connection. I will get back to you when I can but I really think this one just requires adding something to the javadoc. -Alan On 09/05/2012 08:33, Deven You wrote: > I suddenly realized this topic should be in nio-dev mailing list. > Please ignore previous mail. > > Thanks a lot! > > On 05/09/2012 03:26 PM, Deven You wrote: >> Hi All, >> >> When start a simple server, listening on a port like 8765, which just >> accepts connections. We then register a non-blocking SocketChannel >> (the client) with a selector with interest in OP_CONNECT, so that we >> can use the selector to notify us when the channel is ready to finish >> connecting. >> >> We call client.connect and then selector.select in a loop. The >> selector fires and with the client channel in the selected-keys set >> and we call finishConnect() on the client's channel. >> >> Then the problem occurs: >> The selector repeatedly fires with no entries in its selected-keys >> set, whereas it should block in the next select operation until there >> is at least one key in the selected-keys set. >> >> There is already a sun bug for this issue[1], when I looked into this >> sun bug I realize the second scenario of this sun bug is described >> very detailed by a duplicate sun bug[2]. >> >> One way to solve this problem is let selector reset the OP_CONNECT >> bit as 0 after the channel is connected for the corresponding key >> using key.interestOps(). I just make a patch[3] for this approach. >> >> Could anyone take a look at this patch to see if we could solve this >> problem in this way!
>> >> [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4919127 >> >> [2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4960791 >> >> [3] http://cr.openjdk.java.net/~littlee/ojdk-317/webrev.00/ >> >> >> Thanks a lot! >> -- >> Best Regards, >> >> Deven > > > -- > Best Regards, > > Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20120509/5c04c1bf/attachment.html From zhong.j.yu at gmail.com Thu May 10 20:09:08 2012 From: zhong.j.yu at gmail.com (Zhong Yu) Date: Thu, 10 May 2012 22:09:08 -0500 Subject: java.nio.file.Files.isReadable() slow on Windows 7 Message-ID: java.nio.file.Files.isReadable() seems to be really slow on Windows 7; on my machine it takes 3ms per call (tested in a tight loop on the same file). That means only 300 calls per second. It's much faster to test readability by opening a file channel for read then close it. Any other workarounds? Zhong Yu From Alan.Bateman at oracle.com Fri May 11 00:32:04 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 11 May 2012 08:32:04 +0100 Subject: java.nio.file.Files.isReadable() slow on Windows 7 In-Reply-To: References: Message-ID: <4FACC074.2040800@oracle.com> On 11/05/2012 04:09, Zhong Yu wrote: > java.nio.file.Files.isReadable() seems to be really slow on Windows 7; > on my machine it takes 3ms per call (tested in a tight loop on the > same file). That means only 300 calls per second. > > It's much faster to test readability by opening a file channel for > read then close it. > > Any other workarounds? > > Zhong Yu Oops, I thought we had put in a fast path for the check read case. The background to this is that checking access to a file on Windows is very expensive because it requires reading the DACL and determining the user's effective access. 
I don't have time to check a Windows machine just at the minute but can you change: Files.isReadable(path) to path.getFileSystem().provider().checkAccess(path) in your test, re-run and tell us if this fixes the issue. I suspect it will because checkAccess has a fast path for the check read case, and that fast path doesn't correctly take account of the isReadable usage. -Alan. From zhong.j.yu at gmail.com Fri May 11 07:29:13 2012 From: zhong.j.yu at gmail.com (Zhong Yu) Date: Fri, 11 May 2012 09:29:13 -0500 Subject: java.nio.file.Files.isReadable() slow on Windows 7 In-Reply-To: <4FACC074.2040800@oracle.com> References: <4FACC074.2040800@oracle.com> Message-ID: On Fri, May 11, 2012 at 2:32 AM, Alan Bateman wrote: > On 11/05/2012 04:09, Zhong Yu wrote: >> >> java.nio.file.Files.isReadable() seems to be really slow on Windows 7; >> on my machine it takes 3ms per call (tested in a tight loop on the >> same file). That means only 300 calls per second. >> >> It's much faster to test readability by opening a file channel for >> read then close it. >> >> Any other workarounds? >> >> Zhong Yu > > Oops, I thought we had put in a fast path for the check read case. The > background to this is that checking access to a file on Windows is very > expensive because it requires reading the DACL and determining the user's > effective access. > > I don't have time to check a Windows machine just at the minute but can you > change: > > Files.isReadable(path) > > to > > path.getFileSystem().provider().checkAccess(path) yes it works, this one takes only 0.02ms! > in your test, re-run and tell us if this fixes the issue. I suspect it will > because checkAccess has a fast path for the check read case, and that fast > path doesn't correctly take account of the isReadable usage. > > -Alan.
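For anyone who wants to repeat the comparison, here is a minimal sketch of the kind of tight-loop test described above. It is not the test that was actually run: the class name, the temp file and the iteration count are all made up for illustration. Also note that, going by the FileSystemProvider javadoc, checkAccess called with no access modes only checks that the file exists, so the second call below is not an equivalent readability test.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadableTiming {
    // Files.isReadable: the call reported as slow in this thread.
    static boolean viaFiles(Path path) {
        return Files.isReadable(path);
    }

    // Provider call with no access modes. Per the FileSystemProvider javadoc
    // this checks existence only, so it is NOT a full readability test.
    static boolean viaProvider(Path path) {
        try {
            path.getFileSystem().provider().checkAccess(path);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("readable", ".txt"); // hypothetical test file
        int iterations = 1000;                                // arbitrary loop count
        try {
            long t0 = System.nanoTime();
            for (int i = 0; i < iterations; i++) viaFiles(file);
            long t1 = System.nanoTime();
            for (int i = 0; i < iterations; i++) viaProvider(file);
            long t2 = System.nanoTime();
            System.out.printf("Files.isReadable:  %.3f ms/call%n", (t1 - t0) / 1e6 / iterations);
            System.out.printf("checkAccess(path): %.3f ms/call%n", (t2 - t1) / 1e6 / iterations);
        } finally {
            Files.delete(file);
        }
    }
}
```

The numbers will depend heavily on the platform and file system, so treat any output as indicative only.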
From Alan.Bateman at oracle.com Fri May 11 07:42:37 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 11 May 2012 15:42:37 +0100 Subject: java.nio.file.Files.isReadable() slow on Windows 7 In-Reply-To: References: <4FACC074.2040800@oracle.com> Message-ID: <4FAD255D.6020307@oracle.com> On 11/05/2012 15:29, Zhong Yu wrote: > : > > I don't have time to check a Windows machine just at the minute but can you > change: > > Files.isReadable(path) > > to > > path.getFileSystem().provider().checkAccess(path) > yes it works, this one takes only 0.02ms! > Thanks for checking, I've created bug 7168172 to make sure this isn't forgotten. -Alan From zhong.j.yu at gmail.com Fri May 11 07:47:03 2012 From: zhong.j.yu at gmail.com (Zhong Yu) Date: Fri, 11 May 2012 09:47:03 -0500 Subject: java.nio.file.Files.isReadable() slow on Windows 7 In-Reply-To: References: <4FACC074.2040800@oracle.com> Message-ID: However... test it on an unreadable file, checkAccess(path) does not throw an exception. Please take note of it The file has Read permission Deny for all users. For this file Files.isReadable() returns false. checkAccess(path, AccessMode.READ) throws java.nio.file.AccessDeniedException: \tmp\unreadable.txt: Effective permissions does not allow requested access AsynchronousFileChannel.open(path, READ) throws java.nio.file.AccessDeniedException: C:\tmp\unreadable.txt On Fri, May 11, 2012 at 9:29 AM, Zhong Yu wrote: > On Fri, May 11, 2012 at 2:32 AM, Alan Bateman wrote: >> On 11/05/2012 04:09, Zhong Yu wrote: >>> >>> java.nio.file.Files.isReadable() seems to be really slow on Windows 7; >>> on my machine it takes 3ms per call (tested in a tight loop on the >>> same file). That means only 300 calls per second. >>> >>> It's much faster to test readability by opening a file channel for >>> read then close it. >>> >>> Any other workarounds? >>> >>> Zhong Yu >> >> Oops, I thought we had put in a fast path for the check read case. 
The >> background to this is that checking access to a file on Windows is very >> expensive because it requires reading the DACL and determining the user's >> effective access. >> >> I don't have time to check a Windows machine just at the minute but can you >> change: >> >> Files.isReadable(path) >> >> to >> >> path.getFileSystem().provider().checkAccess(path) > > yes it works, this one takes only 0.02ms! > >> in your test, re-run and tell us if this fixes the issue. I suspect it will >> because checkAccess has a fast path for the check read case, and that fast >> path doesn't correctly take account of the isReadable usage. >> >> -Alan. From Alan.Bateman at oracle.com Fri May 11 07:51:54 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 11 May 2012 15:51:54 +0100 Subject: java.nio.file.Files.isReadable() slow on Windows 7 In-Reply-To: References: <4FACC074.2040800@oracle.com> Message-ID: <4FAD278A.2040907@oracle.com> On 11/05/2012 15:47, Zhong Yu wrote: > However... test it on an unreadable file, checkAccess(path) does not > throw an exception. Please take note of it > > The file has Read permission Deny for all users. For this file > > Files.isReadable() returns false. > > checkAccess(path, AccessMode.READ) throws > java.nio.file.AccessDeniedException: \tmp\unreadable.txt: > Effective permissions does not allow requested access > > AsynchronousFileChannel.open(path, READ) throws > java.nio.file.AccessDeniedException: C:\tmp\unreadable.txt > Understood and I'll add it to the bug. The issue is that the fast path is assuming that the file is readable when it has READ_ATTRIBUTES access. You can verify that by denying READ_ATTRIBUTES + READ. -Alan From Alan.Bateman at oracle.com Mon May 14 07:31:07 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 14 May 2012 15:31:07 +0100 Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory Message-ID: <4FB1172B.6030000@oracle.com> This one is somewhat amusing.
MappedByteBuffer.load touches each page in the mapping in order to load the buffer into memory. Unfortunately it gets compiled into essentially a no-op by the server compiler (aside from the madvise). Someone ran into this on the Mac porting list [1]. The reason it isn't observed with JDK6 and older is because the load method is implemented in native code (we moved it into Java as part of changes to sort out some alignment issues and also to address the issue of the file being truncated). The webrev with the proposed changes is here: http://cr.openjdk.java.net/~alanb/7168505/webrev/ There are other ways of course. -Alan [1] http://mail.openjdk.java.net/pipermail/macosx-port-dev/2012-May/004143.html From mike.duigou at oracle.com Mon May 14 08:32:47 2012 From: mike.duigou at oracle.com (Mike Duigou) Date: Mon, 14 May 2012 08:32:47 -0700 Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory In-Reply-To: <4FB1172B.6030000@oracle.com> References: <4FB1172B.6030000@oracle.com> Message-ID: <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com> Looks good. It's kind of amazing that the server compiler can figure out that unsafe.getByte() is side effect free. The comment on unused could be javadoc. You could initialize x to unused and eliminate the conditional before assignment of unused from x Mike On May 14 2012, at 07:31 , Alan Bateman wrote: > > This one is somewhat amusing. MappedByteBuffer.load touches each page in the mapping in order to load the buffer into memory. Unfortunately it gets compiled into essentially a no-op by the server compiler (aside from the madvise). Someone ran into this on the Mac porting list [1]. The reason it isn't observed with JDK6 and older is because the load method is implemented in native code (we moved it into Java as part of changes to sort out some alignment issues and also to address the issue of the file being truncated).
> The webrev with the proposed changes is here:
>
> http://cr.openjdk.java.net/~alanb/7168505/webrev/
>
> There are other ways of course.
>
> -Alan
>
> [1] http://mail.openjdk.java.net/pipermail/macosx-port-dev/2012-May/004143.html

From Alan.Bateman at oracle.com Mon May 14 08:55:46 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 14 May 2012 16:55:46 +0100
Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory
In-Reply-To: <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com>
References: <4FB1172B.6030000@oracle.com> <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com>
Message-ID: <4FB12B02.9040601@oracle.com>

On 14/05/2012 16:32, Mike Duigou wrote:
> Looks good. It's kind of amazing that the server compiler can figure out that unsafe.getByte() is side effect free.
>
> The comment on unused could be javadoc.
>
> You could initialize x to unused and eliminate the conditional before assignment of unused from x
>
> Mike
>
Thanks Mike. The comments on the private fields in this code use // so I
just kept it consistent.

I didn't get the comment about x = unused. The test for unused != 0 was
just to avoid storing to that field as it's a static. Performance wise of
course it doesn't matter what we do as touching each page will typically
result in lots of faulting and I/O.

-Alan

From forax at univ-mlv.fr Tue May 15 00:07:48 2012
From: forax at univ-mlv.fr (Rémi Forax)
Date: Tue, 15 May 2012 09:07:48 +0200
Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory
In-Reply-To: <4FB12B02.9040601@oracle.com>
References: <4FB1172B.6030000@oracle.com> <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com> <4FB12B02.9040601@oracle.com>
Message-ID: <4FB200C4.5000509@univ-mlv.fr>

On 05/14/2012 05:55 PM, Alan Bateman wrote:
> On 14/05/2012 16:32, Mike Duigou wrote:
>> Looks good. It's kind of amazing that the server compiler can figure
>> out that unsafe.getByte() is side effect free.
>>
>> The comment on unused could be javadoc.
>>
>> You could initialize x to unused and eliminate the conditional before
>> assignment of unused from x
>>
>> Mike
>>
> Thanks Mike. The comments on the private fields in this code use // so
> I just kept it consistent.
>
> I didn't get the comment about x = unused. The test for unused != 0 was
> just to avoid storing to that field as it's a static. Performance wise
> of course it doesn't matter what we do as touching each page will
> typically result in lots of faulting and I/O.
>
> -Alan

I wonder if it's not better to have a fake method recognized by the VM for
that, because if the VM is smart enough, 'unused' is only used in load() so
it doesn't really escape.
A VM/JIT that will generate assembly code for the whole class, for example
when a module is installed in a module repository, will see that easily.

Rémi

From Alan.Bateman at oracle.com Tue May 15 02:33:14 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 15 May 2012 10:33:14 +0100
Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory
In-Reply-To: <4FB200C4.5000509@univ-mlv.fr>
References: <4FB1172B.6030000@oracle.com> <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com> <4FB12B02.9040601@oracle.com> <4FB200C4.5000509@univ-mlv.fr>
Message-ID: <4FB222DA.5090700@oracle.com>

On 15/05/2012 08:07, Rémi Forax wrote:
>
> I wonder if it's not better to have a fake method recognized by the VM
> for that,
> because if the VM is smart enough, 'unused' is only used in load() so
> it doesn't really escape.
> A VM/JIT that will generate assembly code for the whole class, for
> example when a module is installed
> in a module repository, will see that easily.
I suspect the potential for reflection usage will make it harder to prove.
I think the solution in the webrev is okay for now, but I agree that we
might need something special, like a get method that is known to the VM as
having potential side effects.
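
A minimal sketch of the page-touching pattern discussed in this thread, for readers who want something concrete. It is not the code in the webrev: it uses the public ByteBuffer.get(int) rather than Unsafe, and the class name TouchPages, the PAGE_SIZE value of 4096 and the demo() helper are assumptions made purely for illustration. What it tries to show is the idea mentioned above: the bytes read are folded into a local x and, when non-zero, stored to a static field so that the reads have an observable effect.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TouchPages {
    // Assumed page size, for illustration only; the real value is platform specific.
    static final int PAGE_SIZE = 4096;

    // Sink for the bytes read while touching pages. Publishing the result to a
    // static field is one way to make the reads look "used".
    static byte unused;

    /** Reads one byte from every page of the buffer and returns the page count. */
    static int touch(MappedByteBuffer mb) {
        int pages = 0;
        byte x = 0;
        for (int pos = 0; pos < mb.limit(); pos += PAGE_SIZE) {
            x ^= mb.get(pos);   // absolute get, does not move the buffer position
            pages++;
        }
        // Only store when non-zero, to avoid needless writes to the static field.
        if (x != 0) {
            unused = x;
        }
        return pages;
    }

    /** Maps a temporary file of three pages plus one byte and touches it. */
    static int demo() throws IOException {
        Path tmp = Files.createTempFile("touch", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[3 * PAGE_SIZE + 1]);
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return touch(mb);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("pages touched: " + demo());
    }
}
```

Whether a given JIT would actually eliminate the loop without the static store is not something this sketch demonstrates; it only illustrates the shape of the workaround.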

-Alan

From forax at univ-mlv.fr Tue May 15 03:28:07 2012
From: forax at univ-mlv.fr (Rémi Forax)
Date: Tue, 15 May 2012 12:28:07 +0200
Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory
In-Reply-To: <4FB222DA.5090700@oracle.com>
References: <4FB1172B.6030000@oracle.com> <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com> <4FB12B02.9040601@oracle.com> <4FB200C4.5000509@univ-mlv.fr> <4FB222DA.5090700@oracle.com>
Message-ID: <4FB22FB7.7050305@univ-mlv.fr>

On 05/15/2012 11:33 AM, Alan Bateman wrote:
> On 15/05/2012 08:07, Rémi Forax wrote:
>>
>> I wonder if it's not better to have a fake method recognized by the
>> VM for that,
>> because if the VM is smart enough, 'unused' is only used in load() so
>> it doesn't really escape.
>> A VM/JIT that will generate assembly code for the whole class, for
>> example when a module is installed
>> in a module repository, will see that easily.
> I suspect the potential for reflection usage will make it harder to
> prove.

The VM can deoptimize when you ask for a Field on the static field.
I think Cliff Click has played with something like that.

> I think the solution in the webrev is okay for now, but I agree that
> we might need something special, like a get method that is known to
> the VM as having potential side effects.

It will also be easier to write micro-benchmarks that are not reduced to a constant.
>
> -Alan

Rémi

From Alan.Bateman at oracle.com Tue May 15 05:57:42 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 15 May 2012 13:57:42 +0100
Subject: 7168505: (bf) MappedByteBuffer.load does not load buffer's content into memory
In-Reply-To: <4FB22FB7.7050305@univ-mlv.fr>
References: <4FB1172B.6030000@oracle.com> <7F798140-C178-408C-BAD5-7FBB5B1166ED@oracle.com> <4FB12B02.9040601@oracle.com> <4FB200C4.5000509@univ-mlv.fr> <4FB222DA.5090700@oracle.com> <4FB22FB7.7050305@univ-mlv.fr>
Message-ID: <4FB252C6.7070505@oracle.com>

On 15/05/2012 11:28, Rémi Forax wrote:
>
> The VM can deoptimize when you ask for a Field on the static field.
> I think Cliff Click has played with something like that.
Off-hand, I don't know without checking into it.

Do you have any objection to the change proposed? I'd like to get it into
7u6 so that load works as it should. This means getting it fixed in jdk8
first. We can of course explore other approaches after that.

-Alan

From Alan.Bateman at oracle.com Tue May 15 13:40:16 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 15 May 2012 21:40:16 +0100
Subject: Using OP_CONNECT with Selector.select causes selector to fire repeatedly
In-Reply-To: <4FAA1C40.6040709@linux.vnet.ibm.com>
References: <4FAA1C40.6040709@linux.vnet.ibm.com>
Message-ID: <4FB2BF30.9060902@oracle.com>

Deven,

Just to follow up from my previous mail. I looked at the changes in the
patch and also tried it out to see the side effects. One issue is that the
changes mean that OP_CONNECT will be "automatically" removed from the
interest ops; minimally that would require a spec change. When I ran the
Selector tests (in the jdk repository) I noticed a few failures, so there
are other side effects too. In any case, I think this one is really just a
mis-use of the API and that we should instead add wording to the javadoc
rather than change the implementation.

-Alan

From kurchi.subhra.hazra at oracle.com Tue May 15 14:41:06 2012
From: kurchi.subhra.hazra at oracle.com (Kurchi Hazra)
Date: Tue, 15 May 2012 14:41:06 -0700
Subject: [7u6] Request for approval: 7096436: (sc) SocketChannel.connect fails on Windows 8 when channel configured non-blocking
Message-ID: <4FB2CD72.8040300@oracle.com>

Requesting approval to commit fix for CR 7096436.

Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7096436
Webrev: http://cr.openjdk.java.net/~khazra/7096436/7u6/webrev.00/

This has been reviewed by Alan Bateman. [1]
This fix has been pushed into jdk8 [2]

Thanks,
Kurchi

[1] http://mail.openjdk.java.net/pipermail/nio-dev/2012-May/001653.html
[2] http://hg.openjdk.java.net/jdk8/tl/jdk/rev/5152c832745a

From Alan.Bateman at oracle.com Sat May 19 02:11:17 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sat, 19 May 2012 10:11:17 +0100
Subject: 7170203: TEST_BUG: test/java/nio/MappedByteBuffer/Truncate.java failing intermittently
Message-ID: <4FB763B5.9010107@oracle.com>

A few days ago we fixed MappedByteBuffer.load so that it works even after
being compiled at runtime:

http://hg.openjdk.java.net/jdk8/tl/jdk/rev/332bebb463d1

It turns out that this exposes an issue, of sorts, with one of the tests
that I did not see before pushing the change. The test truncates a file
that is mapped and then attempts to read from the mapped buffer, causing
the thread to terminate with an uncaught Error. The issue is that jtreg
runs the test in a thread group that has an uncaught exception handler and
so the uncaught error is causing the test to fail.

There are various ways to fix this; one would be to change the task to
catch Error or Throwable. Another way, which I prefer, is to just set the
uncaught exception handler on the thread so that we have the exception in
the test output. Attached is the proposed patch.

-Alan.

diff --git a/test/java/nio/MappedByteBuffer/Truncate.java b/test/java/nio/MappedByteBuffer/Truncate.java
--- a/test/java/nio/MappedByteBuffer/Truncate.java
+++ b/test/java/nio/MappedByteBuffer/Truncate.java
@@ -88,6 +88,11 @@
             }
         };
         Thread t = new Thread(r);
+        t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
+            public void uncaughtException(Thread t, Throwable e) {
+                e.printStackTrace();
+            }
+        });
         t.start();

From Alan.Bateman at oracle.com Sat May 19 03:26:29 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sat, 19 May 2012 11:26:29 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
Message-ID: <4FB77555.2050407@oracle.com>

Some recent changes in the Solaris kernel have exposed performance issues
that are caused by the way that the /dev/poll based Selector uses the
driver. These issues have always been there, but were completely invisible
until now (thanks to several people in the Solaris kernel and performance
teams for diagnosing the issues).

One issue is the batch updates to the driver. If the update contains more
than one pollfd entry for a file descriptor then the events are OR'ed. To
work around this, the original Selector implementation inserts a POLLREMOVE
event before each update. This just happens to work, but isn't guaranteed.
The impact is that the driver is spending a lot of time scanning for
entries that are not present.

A second issue is that the deregistration step in the Selector is writing
the POLLREMOVE after the file descriptor has been closed, again leading to
additional overhead in the driver.

A third issue relates to file descriptors that are registered with an event
mask of 0, again leading to more performance issues.

The webrev with a patch to fix these issues is here:

http://cr.openjdk.java.net/~alanb/7169050/webrev/

In summary:

1.
Pending updates are queued as before except that at most one update is
pending for a file descriptor (to eliminate the effects of OR'ing). A bit
set is used to indicate if a file descriptor is in the update list. It
should only very rarely be necessary to scan the update list for the file
descriptor.

2. The release (deregistration) no longer queues a POLLREMOVE but instead
writes the POLLREMOVE immediately (after dropping any pending update for
the file descriptor). This means the descriptor is removed from the driver
before the file descriptor is closed. This is similar to how we did this in
the epoll based Selector.

3. The batch update no longer inserts a POLLREMOVE before each update (not
needed now). Additionally it removes the file descriptor when the event
mask is changed to 0 (this is also something we do in the epoll Selector).

That's mostly it. I should explain that I went through a couple of
iterations on this (and thanks to Joy Xiong from the Solaris Performance
team for running benchmarks on several test builds). Previous iterations
included writing single updates rather than in batches, and changing the
update list into a Map that is keyed on the file descriptor. There may be
further tuning later but for now this addresses the major issues.

-Alan

From chris.hegarty at oracle.com Mon May 21 02:38:52 2012
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Mon, 21 May 2012 10:38:52 +0100
Subject: 7170203: TEST_BUG: test/java/nio/MappedByteBuffer/Truncate.java failing intermittently
In-Reply-To: <4FB763B5.9010107@oracle.com>
References: <4FB763B5.9010107@oracle.com>
Message-ID: <4FBA0D2C.9010308@oracle.com>

The test change looks fine to me.

I'm still learning new stuff about jtreg ;-)

-Chris.

On 19/05/2012 10:11, Alan Bateman wrote:
>
> A few days ago we fixed MappedByteBuffer.load so that it works even
> after being compiled at runtime:
>
> http://hg.openjdk.java.net/jdk8/tl/jdk/rev/332bebb463d1
>
> It turns out that this exposes an issue, of sorts, with one of the tests
> that I did not see before pushing the change. The test truncates a file
> that is mapped and then attempts to read from the mapped buffer, causing
> the thread to terminate with an uncaught Error. The issue is that jtreg
> runs the test in a thread group that has an uncaught exception handler
> and so the uncaught error is causing the test to fail.
>
> There are various ways to fix this; one would be to change the task to
> catch Error or Throwable. Another way, which I prefer, is to just set
> the uncaught exception handler on the thread so that we have the
> exception in the test output. Attached is the proposed patch.
>
> -Alan.
>
>
> diff --git a/test/java/nio/MappedByteBuffer/Truncate.java
> b/test/java/nio/MappedByteBuffer/Truncate.java
> --- a/test/java/nio/MappedByteBuffer/Truncate.java
> +++ b/test/java/nio/MappedByteBuffer/Truncate.java
> @@ -88,6 +88,11 @@
> }
> };
> Thread t = new Thread(r);
> + t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
> + public void uncaughtException(Thread t, Throwable e) {
> + e.printStackTrace();
> + }
> + });
> t.start();
>
>

From Alan.Bateman at oracle.com Wed May 23 02:48:23 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 23 May 2012 10:48:23 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
In-Reply-To: <4FB77555.2050407@oracle.com>
References: <4FB77555.2050407@oracle.com>
Message-ID: <4FBCB267.8020705@oracle.com>

On 19/05/2012 11:26, Alan Bateman wrote:
> :
>
> The webrev with a patch to fix these issues is here:
>
> http://cr.openjdk.java.net/~alanb/7169050/webrev/
I've refreshed the webrev with an update that is a bit more efficient.
The changes are relatively simple and, as per the original mail, have the
net effect of not queuing a POLLREMOVE per update and also removing the
file descriptor from /dev/poll when the interest ops are changed to 0.

-Alan.

From chris.hegarty at oracle.com Wed May 23 05:46:58 2012
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Wed, 23 May 2012 13:46:58 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
In-Reply-To: <4FBCB267.8020705@oracle.com>
References: <4FB77555.2050407@oracle.com> <4FBCB267.8020705@oracle.com>
Message-ID: <4FBCDC42.8090207@oracle.com>

These changes look great Alan.

One minor comment, the events are now stored as a byte. Is it ever
possible to have an event that has a valid value that uses the upper
order byte ( that is passed to the native short pollfd.events )? It
must not be, just thought I'd ask.

-Chris.

On 23/05/2012 10:48, Alan Bateman wrote:
> On 19/05/2012 11:26, Alan Bateman wrote:
>> :
>>
>> The webrev with a patch to fix these issues is here:
>>
>> http://cr.openjdk.java.net/~alanb/7169050/webrev/
> I've refreshed the webrev with an update that is a bit more efficient.
> The changes are relatively simple and, as per the original mail, have the
> net effect of not queuing a POLLREMOVE per update and also removing the
> file descriptor from /dev/poll when the interest ops are changed to 0.
>
> -Alan.

From Alan.Bateman at oracle.com Wed May 23 06:15:30 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 23 May 2012 14:15:30 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
In-Reply-To: <4FBCDC42.8090207@oracle.com>
References: <4FB77555.2050407@oracle.com> <4FBCB267.8020705@oracle.com> <4FBCDC42.8090207@oracle.com>
Message-ID: <4FBCE2F2.7030302@oracle.com>

On 23/05/2012 13:46, Chris Hegarty wrote:
> These changes look great Alan.
>
> One minor comment, the events are now stored as a byte.
> Is it ever
> possible to have an event that has a valid value that uses the upper
> order byte ( that is passed to the native short pollfd.events )? It
> must not be, just thought I'd ask.
In theory it should be a short[] but that takes up too much memory with an
fd limit of 64k or higher. A byte works for us because the values of POLLIN
and POLLOUT fit without any shifting or mapping. So thanks for reviewing,
this resolves a long standing performance issue.

-Alan.

From sean.coffey at oracle.com Wed May 23 07:55:59 2012
From: sean.coffey at oracle.com (Seán Coffey)
Date: Wed, 23 May 2012 15:55:59 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
In-Reply-To: <4FBCE2F2.7030302@oracle.com>
References: <4FB77555.2050407@oracle.com> <4FBCB267.8020705@oracle.com> <4FBCDC42.8090207@oracle.com> <4FBCE2F2.7030302@oracle.com>
Message-ID: <4FBCFA7F.4050206@oracle.com>

Had a look here also Alan. Looks good. I'm wondering if
INITIAL_PENDING_UPDATE_SIZE of 64 is the optimum start size there. In any
case, I guess a busy server won't be long arriving at the optimal array
size.

some typos in comments :

// cancel(s) ? any pending update
// skip update if key can has cancelled
// write any reminaing updates

regards,
Sean.

On 23/05/12 14:15, Alan Bateman wrote:
> On 23/05/2012 13:46, Chris Hegarty wrote:
>> These changes look great Alan.
>>
>> One minor comment, the events are now stored as a byte. Is it ever
>> possible to have an event that has a valid value that uses the upper
>> order byte ( that is passed to the native short pollfd.events )? It
>> must not be, just thought I'd ask.
> In theory it should be a short[] but that takes up too much memory
> with an fd limit of 64k or higher. A byte works for us because the
> values of POLLIN and POLLOUT fit without any shifting or mapping. So
> thanks for reviewing, this resolves a long standing performance issue.
>
> -Alan.
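
The update-queue scheme summarized earlier in this thread (at most one pending update per file descriptor, a bit set marking which descriptors are queued, events held in a byte per descriptor) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: the class PendingUpdates and its methods are hypothetical names, the queue is a plain ArrayList rather than whatever the real implementation uses, and nothing here talks to /dev/poll. The POLLIN and POLLOUT values are the usual <poll.h> ones, which is why they fit in a byte.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch: at most one pending update per file descriptor. */
class PendingUpdates {
    // Usual <poll.h> values; both fit in a byte without shifting or mapping.
    static final byte POLLIN = 0x01;
    static final byte POLLOUT = 0x04;

    private final byte[] events;                            // latest requested events, indexed by fd
    private final BitSet pending = new BitSet();            // fds that currently have a queued update
    private final List<Integer> queue = new ArrayList<>();  // queued fds, in arrival order

    PendingUpdates(int maxFds) {
        events = new byte[maxFds];
    }

    /** Records the new event mask; a later call for the same fd overwrites it (no OR'ing). */
    void setInterest(int fd, byte mask) {
        events[fd] = mask;
        if (!pending.get(fd)) {
            pending.set(fd);
            queue.add(fd);
        }
    }

    /** Drops any pending update for fd, for example when the fd is being released. */
    void cancel(int fd) {
        if (pending.get(fd)) {
            pending.clear(fd);
            queue.remove(Integer.valueOf(fd));  // linear scan, expected to be rare
        }
    }

    /** Returns the batch of {fd, events} pairs to write and empties the queue. */
    List<int[]> drain() {
        List<int[]> batch = new ArrayList<>();
        for (int fd : queue) {
            // A real implementation might turn a mask of 0 into a removal here.
            batch.add(new int[] { fd, events[fd] });
            pending.clear(fd);
        }
        queue.clear();
        return batch;
    }

    public static void main(String[] args) {
        PendingUpdates u = new PendingUpdates(16);
        u.setInterest(5, POLLIN);
        u.setInterest(5, POLLOUT);   // replaces the earlier mask for fd 5
        for (int[] e : u.drain()) {
            System.out.println("fd=" + e[0] + " events=" + e[1]);
        }
    }
}
```

Because the bit set answers "is this fd already queued?" in constant time, the only scan of the queue happens in cancel, which matches the remark that scanning the update list should be rare.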

From Alan.Bateman at oracle.com Wed May 23 08:38:24 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 23 May 2012 16:38:24 +0100
Subject: 7169050: (se) Selector.select slow on Solaris due to insertion of POLLREMOVE and 0 events
In-Reply-To: <4FBCFA7F.4050206@oracle.com>
References: <4FB77555.2050407@oracle.com> <4FBCB267.8020705@oracle.com> <4FBCDC42.8090207@oracle.com> <4FBCE2F2.7030302@oracle.com> <4FBCFA7F.4050206@oracle.com>
Message-ID: <4FBD0470.9060709@oracle.com>

On 23/05/2012 15:55, Seán Coffey wrote:
>
> Had a look here also Alan. Looks good. I'm wondering if
> INITIAL_PENDING_UPDATE_SIZE of 64 is the optimum start size there. In
> any case, I guess a busy server won't be long arriving at the optimal
> array size.
>
> some typos in comments :
>
> // cancel(s) ? any pending update
> // skip update if key can has cancelled
> // write any reminaing updates
Thanks Seán, I'll fix these typos before pushing the change.

-Alan

From youdwei at linux.vnet.ibm.com Thu May 31 00:47:13 2012
From: youdwei at linux.vnet.ibm.com (Deven You)
Date: Thu, 31 May 2012 15:47:13 +0800
Subject: Using OP_CONNECT with Selector.select causes selector to fire repeatedly
In-Reply-To: <4FB2BF30.9060902@oracle.com>
References: <4FAA1C40.6040709@linux.vnet.ibm.com> <4FB2BF30.9060902@oracle.com>
Message-ID: <4FC72201.7090307@linux.vnet.ibm.com>

Hi Alan,

Thanks for your comments. What do you think the javadoc change should look
like? From my understanding, for Selector.select() we could add a sentence
like "For OP_CONNECT, once a channel is already connected, an application
should not check OP_CONNECT again, in this situation, Selector will ignore
OP_CONNECT and return immediately."

On 05/16/2012 04:40 AM, Alan Bateman wrote:
> Deven,
>
> Just to follow up from my previous mail. I looked at the changes in
> the patch and also tried it out to see the side effects.
> One issue is
> that the changes mean that OP_CONNECT will be "automatically" removed
> from the interest ops; minimally that would require a spec change.
> When I ran the Selector tests (in the jdk repository) I noticed a
> few failures, so there are other side effects too. In any case, I think
> this one is really just a mis-use of the API and that we should instead
> add wording to the javadoc rather than change the implementation.
>
> -Alan
>

--
Best Regards,

Deven
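
For context on the usage being discussed, here is a minimal sketch of a common way to handle a non-blocking connect so that OP_CONNECT does not stay in the interest set once the connection is established. It is an illustration only, not proposed javadoc wording and not the patch under discussion; the class name ConnectOnce, the helper connectAndGetInterestOps(), the loopback listener and the bounded retry loop are all assumptions made so the example is self-contained.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class ConnectOnce {
    /**
     * Connects a non-blocking channel to a local listener and returns the
     * key's interest ops once the connection attempt has been handled.
     */
    static int connectAndGetInterestOps() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open();
             SocketChannel client = SocketChannel.open()) {

            // Listen on an ephemeral loopback port so the example needs no external host.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            client.configureBlocking(false);
            SelectionKey key;
            if (client.connect(server.getLocalAddress())) {
                // Connected immediately: OP_CONNECT is never registered.
                key = client.register(selector, SelectionKey.OP_READ);
            } else {
                key = client.register(selector, SelectionKey.OP_CONNECT);
                // Bounded wait so the sketch cannot spin forever.
                for (int attempt = 0; attempt < 10 && !client.isConnected(); attempt++) {
                    selector.select(1000);
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey k = it.next();
                        it.remove();
                        if (k.isConnectable() && client.finishConnect()) {
                            // Connection established: replace OP_CONNECT with OP_READ
                            // so later select calls are not woken for connect again.
                            k.interestOps(SelectionKey.OP_READ);
                        }
                    }
                }
            }
            return key.interestOps();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("interest ops after connect: " + connectAndGetInterestOps());
    }
}
```

The design point is simply that the application, not the Selector, changes the interest set after finishConnect() succeeds, which is consistent with treating the reported behaviour as an API usage issue rather than an implementation bug.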