From yiming.wang at oracle.com Tue Sep 3 05:15:07 2013
From: yiming.wang at oracle.com (Eric Wang)
Date: Tue, 03 Sep 2013 20:15:07 +0800
Subject: [PATCH] Review quest for bug 8023878 TEST_BUG java/nio/file/WatchService/SensitivityModifier.java fails intermittently
Message-ID: <5225D2CB.8040709@oracle.com>

Hi,

Please help to review the fix
http://cr.openjdk.java.net/~ewang/8023878/webrev.00/ for bug
https://bugs.openjdk.java.net/browse/JDK-8023878.

There is a defect in the loop of the old test, shown below. Once
eventReceived is true the while loop exits, so keys of the watcher are
not reset, and watcher.take() then hangs on the next iteration of the
for loop.

    for (int i=0; i<10; i++) {
        ......
        boolean eventReceived = false;
        WatchKey key = watcher.take();
        do {
            for (WatchEvent<?> event: key.pollEvents()) {
                if (event.kind() != ENTRY_MODIFY)
                    throw new RuntimeException("Unexpected event: " + event);
                Path name = ((WatchEvent<Path>)event).context();
                if (name.equals(file.getFileName())) {
                    eventReceived = true;
                    break; // exits the for loop; the do-while then stops, and a key returned by poll() below is never reset
                }
            }
            key.reset();
            key = watcher.poll(1, TimeUnit.SECONDS);
        } while (key != null && !eventReceived);
    }

Thanks,
Eric

From jeremymanson at google.com Tue Sep 3 15:00:17 2013
From: jeremymanson at google.com (Jeremy Manson)
Date: Tue, 3 Sep 2013 15:00:17 -0700
Subject: 8022594: Potential deadlock in of sun.nio.ch.Util/IOUtil
In-Reply-To: <521B17AE.6080607@oracle.com>
References: <521B17AE.6080607@oracle.com>
Message-ID:

I note that the bug for this suggests that it might have been a one-off on
our side. As it turns out, we can reproduce this relatively easily, but a)
flakily, and b) not in a way that we can share externally.

Jeremy

On Mon, Aug 26, 2013 at 1:54 AM, Alan Bateman wrote:

>
> Jeremy Manson recently reported a sighting of a deadlock at startup in the
> static initializers that are involved in loading libnio/equivalent [1].
>
> Digging through the JDK 1.4/1.5 era history then it seems there were other
> issues that lead to the current implementation. Looking at now and since
> sun.nio.ch.Util doesn't have any native methods then the simplest thing to
> do (but not the only solution) is to move the loading of the native
> libraries to IOUtil. That eliminates the need for the additional locking.
>
> The webrev with the changes is here:
>
> http://cr.openjdk.java.net/~alanb/8022594/webrev/
>
> The patch does not include a test as this is a deadlock at startup that
> just isn't easy to reproduce (at least not on the systems that I tried).
>
> -Alan
>
> [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-August/019630.html
>

From Alan.Bateman at oracle.com Thu Sep 5 12:44:22 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 05 Sep 2013 20:44:22 +0100
Subject: [PATCH] Review quest for bug 8023878 TEST_BUG java/nio/file/WatchService/SensitivityModifier.java fails intermittently
In-Reply-To: <5225D2CB.8040709@oracle.com>
References: <5225D2CB.8040709@oracle.com>
Message-ID: <5228DF16.8080509@oracle.com>

On 03/09/2013 13:15, Eric Wang wrote:
> Hi,
>
> Please help to review the fix
> http://cr.openjdk.java.net/~ewang/8023878/webrev.00/ for bug
> https://bugs.openjdk.java.net/browse/JDK-8023878.
>
> There's a defect in the loop of old test as below, if the eventRecived
> is true, the while loop is broken which causes keys of watcher are not
> reset, it causes the watcher.take() is hung in next execution in the
> for loop.

Sorry for the delay, I've been busy with other things.

I think your analysis is right: the while loop terminates when the event
is detected, but there may be more than one event queued. Your fix looks
okay; an alternative would be to just change L106 only.
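For readers following the analysis above, here is a minimal sketch of a loop that keeps draining and resetting keys after the expected event has been seen. It is not the content of either webrev. The class name DrainingWatchLoop, the method awaitModify and the temp-directory setup in main are made up for illustration, and dropping the eventReceived check from the loop condition is only one possible reading of the suggestion to change a single line.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;

// Hypothetical illustration; not the code of SensitivityModifier.java.
final class DrainingWatchLoop {

    /**
     * Blocks until a key is signalled, then keeps polling until the watcher
     * has been quiet for one second. Every key that is retrieved is reset,
     * including keys that arrive after the expected event was seen.
     */
    static boolean awaitModify(WatchService watcher, Path file)
            throws InterruptedException {
        boolean eventReceived = false;
        WatchKey key = watcher.take();
        do {
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() != ENTRY_MODIFY)
                    throw new RuntimeException("Unexpected event: " + event);
                Path name = (Path) event.context();
                if (name.equals(file.getFileName()))
                    eventReceived = true;   // remember it, but keep draining
            }
            key.reset();
            key = watcher.poll(1, TimeUnit.SECONDS);
        } while (key != null);              // no "&& !eventReceived" here
        return eventReceived;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watch-demo");
        Path file = Files.createFile(dir.resolve("foo"));
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, ENTRY_MODIFY);
            for (int i = 0; i < 3; i++) {
                Files.write(file, new byte[] { (byte) i });
                System.out.println("round " + i + ": " + awaitModify(watcher, file));
            }
        }
    }
}
```

Because the loop only ends when poll() returns null, no retrieved key is left in the signalled state, so the next take() can be woken by a later modification. The cost is at least one extra one-second poll per round.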
-Alan From yiming.wang at oracle.com Mon Sep 9 05:35:20 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Mon, 09 Sep 2013 20:35:20 +0800 Subject: [PATCH] Review quest for bug 8023878 TEST_BUG java/nio/file/WatchService/SensitivityModifier.java fails intermittently In-Reply-To: <5228DF16.8080509@oracle.com> References: <5225D2CB.8040709@oracle.com> <5228DF16.8080509@oracle.com> Message-ID: <522DC088.7010000@oracle.com> On 2013/9/6 3:44, Alan Bateman wrote: > On 03/09/2013 13:15, Eric Wang wrote: >> Hi, >> >> Please help to review the fix >> http://cr.openjdk.java.net/~ewang/8023878/webrev.00/ for bug >> https://bugs.openjdk.java.net/browse/JDK-8023878. >> >> There's a defect in the loop of old test as below, if the >> eventRecived is true, the while loop is broken which causes keys of >> watcher are not reset, it causes the watcher.take() is hung in next >> execution in the for loop. > Sorry the delay, I've been busy with other things. > > I think your analysis is right, the while loop terminates when the > event is detected but there may be more than one event queued. Your > fixes looks okay, an alternative would be to just change L106 only. > > -Alan Hi Alan, I added the if (!eventReceived) to avoid unnecessary loop for performance concern, but it is not a issue for a simple test. below is the alternative. if you are OK with it, please help to be my sponsor. 
http://cr.openjdk.java.net/~ewang/8023878/webrev.01/ Thanks, Eric From Alan.Bateman at oracle.com Mon Sep 9 13:34:46 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Mon, 09 Sep 2013 21:34:46 +0100 Subject: [PATCH] Review quest for bug 8023878 TEST_BUG java/nio/file/WatchService/SensitivityModifier.java fails intermittently In-Reply-To: <522DC088.7010000@oracle.com> References: <5225D2CB.8040709@oracle.com> <5228DF16.8080509@oracle.com> <522DC088.7010000@oracle.com> Message-ID: <522E30E6.2080109@oracle.com> On 09/09/2013 13:35, Eric Wang wrote: > Hi Alan, > > I added the if (!eventReceived) to avoid unnecessary loop for > performance concern, but it is not a issue for a simple test. below is > the alternative. if you are OK with it, please help to be my sponsor. > > http://cr.openjdk.java.net/~ewang/8023878/webrev.01/ > That looks good, much clearer. I'll push this for you. -Alan From yiming.wang at oracle.com Wed Sep 11 01:58:40 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Wed, 11 Sep 2013 16:58:40 +0800 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <520A9D30.8020107@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> Message-ID: <523030C0.8020102@oracle.com> Hi Alan, Sorry for late. I have re-fixed this failure, Can you please help to review? I executed the tests on the host jsn-vm49.us for thousands times and found the test failed as setting SO_TIMEOUT for 5 seconds is not enough to wait response sent by a new created thread of UdpEchoRequest. It may caused by thread schedule as there's maybe more than 3 threads executing at sametime or full GC as lots of UdpEchoRequest created in runtime. The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not create a new thread of UdpEchoRequest to send response. I have run the fix for 20000 times, it works fine. 
http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ Thanks, Eric On 2013/8/14 4:55, Alan Bateman wrote: > On 13/08/2013 12:29, Eric Wang wrote: >> Hi, >> >> Please help to review the fix below for bug >> https://jbs.oracle.com/bugs/browse/JDK-8015762. >> http://cr.openjdk.java.net/~ewang/8015762/webrev.00/ >> >> The test AdaptDatagramSocket fails intermittently as it runs depended >> on thread schedule. The fix to adjust the thread priority and sleep >> time to make the test more stable. > I think I need to understand this issue further to properly review > this. From the stack trace in the bug report then it looks like it's > the "test(address, 5000, false, false)" case that is timing out. Is it > definitely the case that is failing (intermittently) and have you > managed to duplicate it? > > -Alan From Alan.Bateman at oracle.com Wed Sep 11 02:23:50 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 11 Sep 2013 10:23:50 +0100 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523030C0.8020102@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> Message-ID: <523036A6.3040009@oracle.com> On 11/09/2013 09:58, Eric Wang wrote: > Hi Alan, > > Sorry for late. I have re-fixed this failure, Can you please help to > review? > I executed the tests on the host jsn-vm49.us for thousands times and > found the test failed as setting SO_TIMEOUT for 5 seconds is not > enough to wait response sent by a new created thread of > UdpEchoRequest. It may caused by thread schedule as there's maybe more > than 3 threads executing at sametime or full GC as lots of > UdpEchoRequest created in runtime. > > The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not > create a new thread of UdpEchoRequest to send response. I have run the > fix for 20000 times, it works fine. 
> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ > > Thanks for confirming that the 5 second timeout is insufficient, that part is clear now. The webrev also updates TestServers so that start runs the task directly. Is this meant to be part of this change? -Alan. From yiming.wang at oracle.com Wed Sep 11 02:47:21 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Wed, 11 Sep 2013 17:47:21 +0800 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523036A6.3040009@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> Message-ID: <52303C29.5040307@oracle.com> On 2013/9/11 17:23, Alan Bateman wrote: > On 11/09/2013 09:58, Eric Wang wrote: >> Hi Alan, >> >> Sorry for late. I have re-fixed this failure, Can you please help to >> review? >> I executed the tests on the host jsn-vm49.us for thousands times and >> found the test failed as setting SO_TIMEOUT for 5 seconds is not >> enough to wait response sent by a new created thread of >> UdpEchoRequest. It may caused by thread schedule as there's maybe >> more than 3 threads executing at sametime or full GC as lots of >> UdpEchoRequest created in runtime. >> >> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >> create a new thread of UdpEchoRequest to send response. I have run >> the fix for 20000 times, it works fine. >> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >> >> > Thanks for confirming that the 5 second timeout is insufficient, that > part is clear now. > > The webrev also updates TestServers so that start runs the task > directly. Is this meant to be part of this change? > > -Alan. > Yes, it is a part of the fix, i have tested if only update timeout to 10 sec, the test maybe still failed as the new created thread of UdpEchoRequest doesn't get chance to run. so I workaround the thread. 
Thanks, Eric From daniel.fuchs at oracle.com Wed Sep 11 03:06:45 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Wed, 11 Sep 2013 12:06:45 +0200 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <52303C29.5040307@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> Message-ID: <523040B5.9030804@oracle.com> On 9/11/13 11:47 AM, Eric Wang wrote: > > On 2013/9/11 17:23, Alan Bateman wrote: >> On 11/09/2013 09:58, Eric Wang wrote: >>> Hi Alan, >>> >>> Sorry for late. I have re-fixed this failure, Can you please help to >>> review? >>> I executed the tests on the host jsn-vm49.us for thousands times and >>> found the test failed as setting SO_TIMEOUT for 5 seconds is not >>> enough to wait response sent by a new created thread of >>> UdpEchoRequest. It may caused by thread schedule as there's maybe >>> more than 3 threads executing at sametime or full GC as lots of >>> UdpEchoRequest created in runtime. >>> >>> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >>> create a new thread of UdpEchoRequest to send response. I have run >>> the fix for 20000 times, it works fine. >>> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >>> >>> >> Thanks for confirming that the 5 second timeout is insufficient, that >> part is clear now. >> >> The webrev also updates TestServers so that start runs the task >> directly. Is this meant to be part of this change? >> >> -Alan. >> > Yes, it is a part of the fix, i have tested if only update timeout to 10 > sec, the test maybe still failed as the new created thread of > UdpEchoRequest doesn't get chance to run. so I workaround the thread. Hi Eric, This looks very strange - I had to look twice to convince me that the UDP packet would not be sent twice. 
Did you double check whether there were other tests using the UdpEchoServer - and whether such a change might affect them too? best regards, -- daniel > > Thanks, > Eric > From chris.hegarty at oracle.com Wed Sep 11 03:30:19 2013 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Wed, 11 Sep 2013 11:30:19 +0100 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <52303C29.5040307@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> Message-ID: <5230463B.8060402@oracle.com> On 11/09/2013 10:47, Eric Wang wrote: > > On 2013/9/11 17:23, Alan Bateman wrote: >> On 11/09/2013 09:58, Eric Wang wrote: >>> Hi Alan, >>> >>> Sorry for late. I have re-fixed this failure, Can you please help to >>> review? >>> I executed the tests on the host jsn-vm49.us for thousands times and >>> found the test failed as setting SO_TIMEOUT for 5 seconds is not >>> enough to wait response sent by a new created thread of >>> UdpEchoRequest. It may caused by thread schedule as there's maybe >>> more than 3 threads executing at sametime or full GC as lots of >>> UdpEchoRequest created in runtime. >>> >>> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >>> create a new thread of UdpEchoRequest to send response. I have run >>> the fix for 20000 times, it works fine. >>> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >>> >>> >> Thanks for confirming that the 5 second timeout is insufficient, that >> part is clear now. >> >> The webrev also updates TestServers so that start runs the task >> directly. Is this meant to be part of this change? >> >> -Alan. >> > Yes, it is a part of the fix, i have tested if only update timeout to 10 > sec, the test maybe still failed as the new created thread of > UdpEchoRequest doesn't get chance to run. so I workaround the thread. 
This does look a little odd. I'm don't see why it is necessary. If there is a still a timing issue should there be some other form of synchronization between threads? -Chris. > > Thanks, > Eric > From yiming.wang at oracle.com Wed Sep 11 03:44:39 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Wed, 11 Sep 2013 18:44:39 +0800 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523040B5.9030804@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <523040B5.9030804@oracle.com> Message-ID: <52304997.8070609@oracle.com> On 2013/9/11 18:06, Daniel Fuchs wrote: > On 9/11/13 11:47 AM, Eric Wang wrote: >> >> On 2013/9/11 17:23, Alan Bateman wrote: >>> On 11/09/2013 09:58, Eric Wang wrote: >>>> Hi Alan, >>>> >>>> Sorry for late. I have re-fixed this failure, Can you please help to >>>> review? >>>> I executed the tests on the host jsn-vm49.us for thousands times and >>>> found the test failed as setting SO_TIMEOUT for 5 seconds is not >>>> enough to wait response sent by a new created thread of >>>> UdpEchoRequest. It may caused by thread schedule as there's maybe >>>> more than 3 threads executing at sametime or full GC as lots of >>>> UdpEchoRequest created in runtime. >>>> >>>> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >>>> create a new thread of UdpEchoRequest to send response. I have run >>>> the fix for 20000 times, it works fine. >>>> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >>>> >>>> >>> Thanks for confirming that the 5 second timeout is insufficient, that >>> part is clear now. >>> >>> The webrev also updates TestServers so that start runs the task >>> directly. Is this meant to be part of this change? >>> >>> -Alan. 
>>> >> Yes, it is a part of the fix, i have tested if only update timeout to 10 >> sec, the test maybe still failed as the new created thread of >> UdpEchoRequest doesn't get chance to run. so I workaround the thread. > > Hi Eric, > > This looks very strange - I had to look twice to convince me that > the UDP packet would not be sent twice. > > Did you double check whether there were other tests using the > UdpEchoServer - and whether such a change might affect them > too? > > > best regards, > > -- daniel > > >> >> Thanks, >> Eric >> > I have searched the test repo using keyword "UdpEchoServer", I didn't find other test references it. Thanks, Eric From cgdecker at google.com Wed Sep 11 14:36:23 2013 From: cgdecker at google.com (Colin Decker) Date: Wed, 11 Sep 2013 17:36:23 -0400 Subject: Two issues in Files Message-ID: Hi, I noticed a couple of small issues in Files for JDK8. The first, more significant issue, is that Files.readAllBytes(Path) now creates a FileChannel, while FileSystemProvider implementations are not required to support FileChannel. A SeekableByteChannel could be created instead, and at least seems much more likely to be supported than FileChannel. The second issue is with Files.write(Path, byte[], OpenOption...). Instead of passing the whole input byte array to OutputStream.write(byte[]), thus allowing the OutputStream implementation to decide how best to handle it, it loops through writing 8k slices of the array to the OutputStream. Is this intended? It seems wasteful, particularly if an implementation (such as an in-memory implementation) could just write the whole array at once (and might even benefit from knowing up front the total number of bytes being written.) Thanks, Colin -- Colin Decker | Software Engineer | cgdecker at google.com | Java Core Libraries Team -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130911/8fb623db/attachment.html

From Alan.Bateman at oracle.com Thu Sep 12 01:30:58 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 12 Sep 2013 09:30:58 +0100
Subject: Two issues in Files
In-Reply-To:
References:
Message-ID: <52317BC2.2000909@oracle.com>

On 11/09/2013 22:36, Colin Decker wrote:
> Hi,
> I noticed a couple of small issues in Files for JDK8.
>
> The first, more significant issue, is that Files.readAllBytes(Path)
> now creates a FileChannel, while FileSystemProvider implementations
> are not required to support FileChannel. A SeekableByteChannel could
> be created instead, and at least seems much more likely to be
> supported than FileChannel.

I'll create a bug for this; it should be using newSeekableChannel and
then testing if it is a FileChannel before deciding how to get the size.

>
> The second issue is with Files.write(Path, byte[], OpenOption...).
> Instead of passing the whole input byte array to
> OutputStream.write(byte[]), thus allowing the OutputStream
> implementation to decide how best to handle it, it loops through
> writing 8k slices of the array to the OutputStream. Is this intended?
> It seems wasteful, particularly if an implementation (such as an
> in-memory implementation) could just write the whole array at once
> (and might even benefit from knowing up front the total number of
> bytes being written.)
>

I don't know if there is a right answer to this one. If the underlying
write requires using an intermediate buffer outside of the heap (the
common case for the default file system when the VM doesn't support any
means to pin byte[] in the heap) then writing the entire array might be
undesirable. So the slicing was intentional, although 8k might be a bit
small.

-Alan.
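To make the two points in this exchange concrete, here is a rough sketch, not the JDK's implementation. FilesSketch, readAllBytes and writeInSlices are names invented for this note; only Files.newByteChannel, SeekableByteChannel.size/read and OutputStream.write(byte[], int, int) are real API. The first method reads a file through a plain SeekableByteChannel, so it never needs a FileChannel; the second writes an array in fixed-size slices, the pattern the thread is debating.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration; not the code of java.nio.file.Files.
final class FilesSketch {

    /** Reads a whole file using only SeekableByteChannel operations. */
    static byte[] readAllBytes(Path path) throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(path)) {
            long size = ch.size();
            if (size > Integer.MAX_VALUE)
                throw new OutOfMemoryError("Required array size too large");
            ByteBuffer buf = ByteBuffer.allocate((int) size);
            // read() may transfer fewer bytes than requested, so loop until
            // the buffer is full or end-of-stream is reported. Bytes appended
            // to the file after size() was called are not read.
            while (buf.hasRemaining() && ch.read(buf) >= 0) { }
            return Arrays.copyOf(buf.array(), buf.position());
        }
    }

    /**
     * Writes the array in slices of at most sliceSize bytes and returns
     * the number of write calls that were issued.
     */
    static int writeInSlices(OutputStream out, byte[] bytes, int sliceSize)
            throws IOException {
        int calls = 0;
        int rem = bytes.length;
        while (rem > 0) {
            int n = Math.min(rem, sliceSize);
            out.write(bytes, bytes.length - rem, n);
            rem -= n;
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20000];
        Path tmp = Files.createTempFile("files-sketch", ".bin");
        Files.write(tmp, data);
        System.out.println("bytes read: " + readAllBytes(tmp).length);
        System.out.println("write calls: "
                + writeInSlices(new ByteArrayOutputStream(), data, 8192));
        Files.delete(tmp);
    }
}
```

With an in-memory sink such as ByteArrayOutputStream the slicing buys nothing, which is the case Colin raises; the intermediate-buffer concern Alan describes applies to streams backed by native writes.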
From cgdecker at google.com Thu Sep 12 06:54:38 2013 From: cgdecker at google.com (Colin Decker) Date: Thu, 12 Sep 2013 09:54:38 -0400 Subject: Two issues in Files In-Reply-To: <52317BC2.2000909@oracle.com> References: <52317BC2.2000909@oracle.com> Message-ID: On Thu, Sep 12, 2013 at 4:30 AM, Alan Bateman wrote: > On 11/09/2013 22:36, Colin Decker wrote: > >> Hi, >> I noticed a couple of small issues in Files for JDK8. >> >> The first, more significant issue, is that Files.readAllBytes(Path) now >> creates a FileChannel, while FileSystemProvider implementations are not >> required to support FileChannel. A SeekableByteChannel could be created >> instead, and at least seems much more likely to be supported than >> FileChannel. >> > I'll create a bug for this, it should be using newSeekableChannel and then > testing if it is a FileChannel before deciding how to get the size. Thanks. Since SeekableByteChannel has the size() method, it shouldn't even be necessary to check if it's a FileChannel I think. > > > >> The second issue is with Files.write(Path, byte[], OpenOption...). >> Instead of passing the whole input byte array to >> OutputStream.write(byte[]), thus allowing the OutputStream implementation >> to decide how best to handle it, it loops through writing 8k slices of the >> array to the OutputStream. Is this intended? It seems wasteful, >> particularly if an implementation (such as an in-memory implementation) >> could just write the whole array at once (and might even benefit from >> knowing up front the total number of bytes being written.) >> >> I don't know if there is a right answer to this one. If the underlying > write requires using an intermediate buffer outside of the heap (common > case for the default file system when the VM doesn't support any means to > pin byte[] in the heap) then writing the entire array might be undesirable. > So the slicing was intentional although there 8k might be a bit small. 
As I see it, if something like that is needed, it seems like the OutputStream implementation should be doing the slicing. Giving the whole array to the OutputStream gives it the best opportunity to do what works best for the implementation. > > > -Alan. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130912/232528b9/attachment.html From Alan.Bateman at oracle.com Thu Sep 12 07:17:28 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 12 Sep 2013 15:17:28 +0100 Subject: Two issues in Files In-Reply-To: References: <52317BC2.2000909@oracle.com> Message-ID: <5231CCF8.4050809@oracle.com> On 12/09/2013 14:54, Colin Decker wrote: > : > > Thanks. Since SeekableByteChannel has the size() method, it shouldn't > even be necessary to check if it's a FileChannel I think. You're right, I've forgotten that it defines size. > > As I see it, if something like that is needed, it seems like the > OutputStream implementation should be doing the slicing. Giving the > whole array to the OutputStream gives it the best opportunity to do > what works best for the implementation. Maybe but it brings up again the issue that OutputStream.write may fail after having successfully written some bytes. That's not a concern for Files.write of course as it covers this case but this may require re-visiting the OutputStream spec and of course implementation changes to consume less resources when writing big arrays. -Alan. From cgdecker at google.com Thu Sep 12 16:37:48 2013 From: cgdecker at google.com (Colin Decker) Date: Thu, 12 Sep 2013 19:37:48 -0400 Subject: Two issues in Files In-Reply-To: <5231CCF8.4050809@oracle.com> References: <52317BC2.2000909@oracle.com> <5231CCF8.4050809@oracle.com> Message-ID: On Thu, Sep 12, 2013 at 10:17 AM, Alan Bateman wrote: > As I see it, if something like that is needed, it seems like the >> OutputStream implementation should be doing the slicing. 
Giving the whole >> array to the OutputStream gives it the best opportunity to do what works >> best for the implementation. >> > Maybe but it brings up again the issue that OutputStream.write may fail > after having successfully written some bytes. That's not a concern for > Files.write of course as it covers this case but this may require > re-visiting the OutputStream spec and of course implementation changes to > consume less resources when writing big arrays. As for OutputStream.write failing after successfully writing some bytes: that's going to be true regardless of whether you're passing an array to write in slices or all at once, right? It could still fail at any point in that process. And if some OutputStream implementations do work better (creating less memory overhead) when written in slices of array rather than when given a big array up front, that definitely seems like something that might be good to fix in those implementations, given that they can always implement write(byte[]) to do the looping and writing slices itself. And while Files.write doesn't pass the whole byte array to the OutputStream, users can do that directly. Anyway, it's not that big a deal as the performance difference is probably pretty minimal, it just seemed odd. - Colin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130912/62b36ca0/attachment.html From Alan.Bateman at oracle.com Fri Sep 13 07:15:46 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 13 Sep 2013 15:15:46 +0100 Subject: Two issues in Files In-Reply-To: References: <52317BC2.2000909@oracle.com> <5231CCF8.4050809@oracle.com> Message-ID: <52331E12.6090508@oracle.com> On 13/09/2013 00:37, Colin Decker wrote: > : > > As for OutputStream.write failing after successfully writing some > bytes: that's going to be true regardless of whether you're passing an > array to write in slices or all at once, right? 
It could still fail at
> any point in that process.
>
> And if some OutputStream implementations do work better (creating less
> memory overhead) when written in slices of array rather than when
> given a big array up front, that definitely seems like something that
> might be good to fix in those implementations, given that they can
> always implement write(byte[]) to do the looping and writing slices
> itself. And while Files.write doesn't pass the whole byte array to the
> OutputStream, users can do that directly. Anyway, it's not that big a
> deal as the performance difference is probably pretty minimal, it just
> seemed odd.
>

I've created JDK-8024788 on the issue of Files.readAllBytes using
FileChannel.

On the OutputStream.write discussion, we would need to move the slicing
to Channels.newOutputStream, otherwise it would have the poor effects
that I mentioned. You are right that someone using the OutputStream
directly and passing a huge byte[] would do the same thing (there are a
number of OOME bugs on that). For OutputStream.write itself, I think we
should look to clarify the javadoc on this point, maybe to make it clear
that it may throw an I/O exception when some (but not all) bytes have
been written. It could have a warning to say that it could leave the
stream in an inconsistent state.

-Alan.

From chris.w.dennis at gmail.com Fri Sep 13 08:04:53 2013
From: chris.w.dennis at gmail.com (Chris Dennis)
Date: Fri, 13 Sep 2013 11:04:53 -0400
Subject: Bug in interrupt handling in FileChannelImpl.map(…)
Message-ID:

Hi All,

I have discovered what I'm pretty certain is a bug in FileChannelImpl's
interrupt handling. The root cause is that the map(…) method is calling
size() from within its begin()/end(…) block. Due to the way that the
implCloseChannel() implementation interacts with the NativeThreadSet, the
begin()/end() pair is not re-entrant for FileChannelImpl.
If you interrupt the map() operation at the wrong time then you can cause
the thread to deadlock, as the interrupted thread will be waiting in
signalAndWait() for itself to leave.

I believe the patch at the bottom of this email (against jdk8/jdk)
resolves this issue. It uses the same logic in map(…) to get the
underlying file size as is used in the truncate(…) method at the moment.
I also think it would be good to check on each begin call to make sure we
are not re-entrant, which will avoid any future breakage of this type.

I have struggled to reproduce this bug in a small test-case - the only way
I have been able to do so has been by modifying FileChannelImpl to put a
spin-loop before the begin() call in size() to open up the window in
which the interrupt needs to happen.

Let me know if you need clarification or more explanation of anything.

Thanks,

Chris Dennis

diff -r b67c8099ba28 src/share/classes/sun/nio/ch/FileChannelImpl.java
--- a/src/share/classes/sun/nio/ch/FileChannelImpl.java Thu Sep 12 11:09:11 2013 -0700
+++ b/src/share/classes/sun/nio/ch/FileChannelImpl.java Fri Sep 13 10:42:07 2013 -0400
@@ -840,7 +840,15 @@
             ti = threads.add();
             if (!isOpen())
                 return null;
-            if (size() < position + size) { // Extend file size
+
+            long filesize;
+            do {
+                filesize = nd.size(fd);
+            } while ((filesize == IOStatus.INTERRUPTED) && isOpen());
+            if (!isOpen())
+                return null;
+
+            if (filesize < position + size) { // Extend file size
                 if (!writable) {
                     throw new IOException("Channel not open for writing " +
                         "- cannot extend file to required size");

From Alan.Bateman at oracle.com Sat Sep 14 06:29:18 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sat, 14 Sep 2013 14:29:18 +0100
Subject: Bug in interrupt handling in FileChannelImpl.map(…)
In-Reply-To:
References:
Message-ID: <523464AE.9090104@oracle.com>

On 13/09/2013 16:04, Chris Dennis wrote:
> Hi All,
>
> I have discovered what I'm pretty certain is a bug in FileChannelImpl's
> interrupt handling. The root cause is that the map(…) method is calling
> size() from within it's begin()/end(…) block.

Thanks for this, it is indeed a bug and should be using nd.size() rather
than size(). I've created this bug to track it:

8024833: (fc) FileChannel.map does not handle async close/interrupt correctly

-Alan.

From nmaurer at redhat.com Mon Sep 16 08:45:05 2013
From: nmaurer at redhat.com (Norman Maurer)
Date: Mon, 16 Sep 2013 17:45:05 +0200
Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024
Message-ID: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com>

Hi there,

this is my first bug-report here so bear with me if something is missing ;)

During testing netty.io with many concurrent connections one of our users
reported a NullPointerException which was thrown by
sun.nio.ch.EPollArrayWrapper.setUpdateEvents(...). This was observed as
soon as the concurrent connection count was > 64 * 1024.

After more investigation I was able to find the bug in
EPollArrayWrapper.setUpdateEvents(...), which is a regression introduced
by the following change:

http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/rev/017bd924a3c8

The problem here is that eventsHigh.get(key) will be called once the fd is
> 64 * 1024. This may return "null", which is compared to KILLED (which is
of type byte) and so may throw an NPE, because the comparison tries to
unbox the return value (which is of type Byte).

The regression is present in the latest openjdk8 and in openjdk7u40 and
later. It also seems to affect oracle jdk 7u40. All OSes that use epoll
are affected, in my case linux (ubuntu).

Attached you find the proposed fix for openjdk8 and openjdk7 and a
reproducer which can be used.

The fix does two things:
* Eliminate the access to the eventsHigh Map if "force" is true.
* Check for null before trying to compare the stored events value.

Reproducer:

The reproducer will bind a server to the specified port and just accept
new connections.
The clients will connect to the server until 80 * 1024 connections are reached, and then go to sleep. When you use the reproducer without the attached fix the server will fail with:

Exception in thread "main" java.lang.NullPointerException
        at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:178)
        at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:227)
        at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:164)
        at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:132)
        at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:212)
        at java.nio.channels.SelectableChannel.register(SelectableChannel.java:280)
        at BugReproducer.main(BugReproducer.java:39)

Once the fix is applied no exception is thrown anymore.

To run the reproducer you may need to update the local port range. For this use the following command:

# sudo sysctl -w net.ipv4.ip_local_port_range="1024 64000"

Also you may need to increase the ulimit to something big enough. I'm using 1048576 here to test it.

You need two interfaces. If you do not have more than one, you can use a virtual interface. So if you have, for example, eth0 with the IP address 10.0.0.9 you can use something like the following to set up a virtual interface:

# sudo ifconfig eth0:1 10.0.0.10 broadcast 255.255.255.0

Compile the reproducer classes, which you should put in some directory (here we call it /tmp/niobug/):

# cd /tmp/niobug/
# javac -cp . Bug*.java

After the classes are compiled you will need 3 terminals: one for the server and the other 2 to run the clients.

First Terminal:
# java -Djava.net.preferIPv4Stack=true -cp . BugReproducer 8080

Second Terminal:
# java -Djava.net.preferIPv4Stack=true -cp . BugReproducerClient 10.0.0.9 8080

Third Terminal:
# java -Djava.net.preferIPv4Stack=true -cp . BugReproducerClient 10.0.0.10 8080

You will see the concurrent connection count in the First Terminal.
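
As an editorial illustration of the failure mode described in this report (not code from the JDK): comparing a Byte returned by Map.get against a primitive byte forces auto-unboxing, which throws a NullPointerException when the key has no entry. The class name UnboxingSketch is hypothetical, the field names only mimic those mentioned in the report, and the value chosen for KILLED is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the eventsHigh lookup; not the JDK source.
class UnboxingSketch {
    // Assumed sentinel value, for illustration only.
    static final byte KILLED = (byte) -1;

    static final Map<Integer, Byte> eventsHigh = new HashMap<>();

    // Mirrors the failing shape: get() may return null, and comparing a
    // Byte with a primitive byte unboxes the Byte first.
    static boolean unsafeNotKilled(int fd) {
        return eventsHigh.get(fd) != KILLED; // NPE if fd has no entry
    }

    // Null-safe variant: an absent entry counts as "not killed".
    static boolean safeNotKilled(int fd) {
        Byte value = eventsHigh.get(fd);
        return value == null || value != KILLED;
    }

    public static void main(String[] args) {
        int fd = 70_000; // no entry is stored for this descriptor
        System.out.println("safe check: " + safeNotKilled(fd));
        try {
            unsafeNotKilled(fd);
        } catch (NullPointerException e) {
            System.out.println("unsafe check threw NullPointerException");
        }
    }
}
```

The null-safe form is the same shape as the check proposed in the patches discussed in this thread; only the surrounding scaffolding is invented here.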
Thanks,
Norman

---
Norman Maurer
nmaurer at redhat.com

JBoss, by Red Hat

-------------- next part --------------
A non-text attachment was scrubbed...
Name: BugReproducer.java
Type: application/octet-stream
Size: 1753 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130916/9ff39ec1/BugReproducer.java
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BugReproducerClient.java
Type: application/octet-stream
Size: 1927 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130916/9ff39ec1/BugReproducerClient.java
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openjdk7-epollarraywrapper-fix.patch
Type: application/octet-stream
Size: 1238 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130916/9ff39ec1/openjdk7-epollarraywrapper-fix.patch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openjdk8-epollarraywrapper-fix.patch
Type: application/octet-stream
Size: 1238 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130916/9ff39ec1/openjdk8-epollarraywrapper-fix.patch

From Alan.Bateman at oracle.com Mon Sep 16 10:49:01 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 16 Sep 2013 18:49:01 +0100
Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024
In-Reply-To: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com>
References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com>
Message-ID: <5237448D.2050804@oracle.com>

On 16/09/2013 16:45, Norman Maurer wrote:
> Hi there,
>
> this is my first bug-report here so bear with me if something is
> missing ;)
>
> During testing netty.io with many concurrent connections one of our
> users reported a NullPointerException which was thrown by
> sun.nio.ch.EPollArrayWrapper.setUpdateEvents(...). This was observed
> as soon as the concurrent connection count > 64 * 1024.

Thanks for this (and the analysis), embarrassing as this has been in 7u40 for a long time and doesn't seem to have been noticed. Maybe there aren't too many people testing with this number of connections.

I've created this bug to track it:

8024883: (se) SelectableChannel.register throws NPE if fd >= 64k (lnx)

and just did some initial testing with the attached. Your mail mentions attachments but they seem to have been dropped. If you are looking to contribute a patch then can you re-send with the patch inlined?
-Alan

diff --git a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
--- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
+++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
@@ -175,7 +175,8 @@
             }
         } else {
             Integer key = Integer.valueOf(fd);
-            if ((eventsHigh.get(key) != KILLED) || force) {
+            Byte prev = eventsHigh.get(key);
+            if (prev == null || prev != KILLED || force) {
                 eventsHigh.put(key, Byte.valueOf(events));
             }
         }

From nmaurer at redhat.com Mon Sep 16 13:02:02 2013
From: nmaurer at redhat.com (Norman Maurer)
Date: Mon, 16 Sep 2013 22:02:02 +0200
Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024
In-Reply-To: <5237448D.2050804@oracle.com>
References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com>
Message-ID:

Hi Alan,

for sure I want to contribute a patch. Seems like the patches / reproducer are visible in the mail archive[1].

Anyway here are the files included inline:

Patch does:
* Eliminate the access to the eventsHigh Map if "force" is true.
* Check for null before trying to compare the stored events value.

# openjdk7 patch:

# HG changeset patch
# User Norman Maurer
# Date 1379332618 -7200
# Node ID 4dbc07e1f92fa1c1b6c2dd20f20d67702aa91b6c
# Parent 861e489158effbf6a841119206eea2689fcb2a83
Fix NullPointerException which was thrown by EPollArrayWrapper.setUpdateEvents(...) if fd > 64 * 1024 was used.

diff -r 861e489158ef -r 4dbc07e1f92f src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
--- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java Thu Sep 12 17:17:40 2013 -0700
+++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java Mon Sep 16 13:56:58 2013 +0200
@@ -175,12 +175,18 @@
             }
         } else {
             Integer key = Integer.valueOf(fd);
-            if ((eventsHigh.get(key) != KILLED) || force) {
-                eventsHigh.put(key, Byte.valueOf(events));
+            if (force || isEventsHighNotKilled(key)) {
+                Byte value = Byte.valueOf(events);
+                eventsHigh.put(key, value);
             }
         }
     }
 
+    private boolean isEventsHighNotKilled(Integer key) {
+        Byte value = eventsHigh.get(key);
+        return value == null || value != KILLED;
+    }
+
     /**
      * Returns the pending update events for the given file descriptor.
      */

# openjdk8 patch:

# HG changeset patch
# User Norman Maurer
# Date 1379332812 -7200
# Node ID 5023e9319a3815c52654232b21258f4670732c8e
# Parent b67c8099ba28b66ccafa85abc096ac592089ad00
Fix NullPointerException which was thrown by EPollArrayWrapper.setUpdateEvents(...) if fd > 64 * 1024 was used.

diff -r b67c8099ba28 -r 5023e9319a38 src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
--- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java Thu Sep 12 11:09:11 2013 -0700
+++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java Mon Sep 16 14:00:12 2013 +0200
@@ -175,12 +175,18 @@
             }
         } else {
             Integer key = Integer.valueOf(fd);
-            if ((eventsHigh.get(key) != KILLED) || force) {
-                eventsHigh.put(key, Byte.valueOf(events));
+            if (force || isEventsHighNotKilled(key)) {
+                Byte value = Byte.valueOf(events);
+                eventsHigh.put(key, value);
             }
         }
     }
 
+    private boolean isEventsHighNotKilled(Integer key) {
+        Byte value = eventsHigh.get(key);
+        return value == null || value != KILLED;
+    }
+
     /**
      * Returns the pending update events for the given file descriptor.
      */

# BugReproducer.java:

import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

/**
 * @author Norman Maurer
 */
public class BugReproducer {
    public static void main(String args[]) throws Exception {
        int connections = 0;
        Selector selector = Selector.open();
        final ServerSocketChannel channel = ServerSocketChannel.open();
        channel.configureBlocking(false);
        channel.register(selector, SelectionKey.OP_ACCEPT, channel);
        channel.bind(new InetSocketAddress("0.0.0.0", Integer.parseInt(args[0])));
        for (;;) {
            selector.select();
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    ServerSocketChannel ch = (ServerSocketChannel) key.attachment();
                    // Accept everything that is pending, registering each
                    // new connection with the selector.
                    for (;;) {
                        SocketChannel sch = ch.accept();
                        if (sch == null) {
                            break;
                        }
                        sch.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
                        sch.setOption(StandardSocketOptions.SO_REUSEADDR, true);
                        sch.configureBlocking(false);
                        sch.register(selector, SelectionKey.OP_READ);
                        System.out.println(++connections);
                    }
                }
            }
        }
    }
}

# BugReproducerClient.java:

import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

/**
 * @author Norman Maurer
 */
public class BugReproducerClient {
    private static final int MAX_CONNECTIONS = 40 * 1024;

    public static void main(String args[]) throws Exception {
        InetSocketAddress address = new InetSocketAddress(args[0], Integer.parseInt(args[1]));
        int connections = 0;
        Selector selector = Selector.open();
        for (;;) {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false);
            channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
            channel.setOption(StandardSocketOptions.SO_REUSEADDR, true);
            if (!channel.connect(address)) {
                channel.register(selector, SelectionKey.OP_CONNECT);
                selector.select();
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isConnectable()) {
                        key.interestOps((key.interestOps() & ~SelectionKey.OP_CONNECT) | SelectionKey.OP_READ);
                        System.out.println(++connections);
                    }
                }
            } else {
                channel.register(selector, SelectionKey.OP_READ);
                System.out.println(++connections);
            }
            if (MAX_CONNECTIONS == connections) {
                // Keep the connections open for a minute, then exit.
                Thread.sleep(1000 * 60);
                break;
            } else {
                if (connections % 500 == 0) {
                    Thread.sleep(100);
                }
            }
        }
    }
}

Let me know if you need anything else.

[1] http://mail.openjdk.java.net/pipermail/nio-dev/2013-September/002284.html

---
Norman Maurer
nmaurer at redhat.com

JBoss, by Red Hat

On 16.09.2013 at 19:49, Alan Bateman wrote:

> On 16/09/2013 16:45, Norman Maurer wrote:
>>
>> Hi there,
>>
>> this is my first bug-report here so bear with me if something is missing ;)
>>
>> During testing netty.io with many concurrent connections one of our users reported a NullPointerException which was thrown by sun.nio.ch.EPollArrayWrapper.setUpdateEvents(...). This was observed as soon as the concurrent connection count > 64 * 1024.
> Thanks for this (and the analysis), embarrassing as this has been in 7u40 for a long time and doesn't seem to have been noticed. Maybe there aren't too many people testing with this number of connections.
>
> I've created this bug to track it:
> 8024883: (se) SelectableChannel.register throws NPE if fd >= 64k (lnx)
>
> and just did some initial testing with the attached. Your mail mentions attachments but they seem to have been dropped. If you are looking to contribute a patch then can you re-send with the patch inlined?
>
> -Alan
>
>
> diff --git a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> --- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> +++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> @@ -175,7 +175,8 @@
>              }
>          } else {
>              Integer key = Integer.valueOf(fd);
> -            if ((eventsHigh.get(key) != KILLED) || force) {
> +            Byte prev = eventsHigh.get(key);
> +            if (prev == null || prev != KILLED || force) {
>                  eventsHigh.put(key, Byte.valueOf(events));
>              }
>          }

From chris.hegarty at oracle.com Mon Sep 16 13:04:19 2013
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Mon, 16 Sep 2013 21:04:19 +0100
Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024
In-Reply-To: <5237448D.2050804@oracle.com>
References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com>
Message-ID: <52376443.2040106@oracle.com>

On 16/09/2013 18:49, Alan Bateman wrote:
> ....
> diff --git a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> --- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> +++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java
> @@ -175,7 +175,8 @@
>              }
>          } else {
>              Integer key = Integer.valueOf(fd);
> -            if ((eventsHigh.get(key) != KILLED) || force) {
> +            Byte prev = eventsHigh.get(key);
> +            if (prev == null || prev != KILLED || force) {
>                  eventsHigh.put(key, Byte.valueOf(events));
>              }
>          }

I know this is not a request for review, but the above changes look like they should resolve the NPE.
-Chris

From Alan.Bateman at oracle.com Tue Sep 17 03:10:44 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 17 Sep 2013 11:10:44 +0100
Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024
In-Reply-To:
References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com>
Message-ID: <52382AA4.3050404@oracle.com>

On 16/09/2013 21:02, Norman Maurer wrote:
> Hi Alan,
>
> for sure I want to contribute a patch. Seems like the patches /
> reproducer are visible in the mail archive[1].
>
> Anyway here are the files included inline:

Attachments are usually stripped by the OpenJDK mail servers so I'm surprised they are there.

On the test case (or reproducer): as configuration is required to allow >64k connections, a simpler way to demonstrate it is to initially consume 64k file descriptors (by opening files or creating SocketChannels). That's what I did when I saw your mail in order to duplicate it quickly.

We need to decide whether to attempt to include a test case with this patch. I'm in two minds on this as I don't know how often the test will be run in environments where the maximum number of file descriptors is unlimited or high enough. An alternative (for a future patch perhaps) is to make MAX_UPDATE_ARRAY_SIZE configurable so that the spilling can be tested without requiring the maximum number of file descriptors to be increased.

Your patch is essentially the same as what I tested with yesterday. I've changed it to the attached (mostly to make it consistent with the eventsLow code). If you are okay with this then I will push it listing you as contributor.

-Alan.
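
The quicker route to high descriptor numbers mentioned in this message, holding many descriptors open before registering a channel, might look like the sketch below. This is an editorial illustration and not the test that was actually used: the class and method names are hypothetical, and holding 64 * 1024 descriptors needs a raised per-process limit, so the count is a parameter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: hold `count` descriptors open, then register a
// fresh channel, which should then receive a high descriptor number.
class HighFdSketch {
    static boolean registerAfterHolding(int count) {
        List<FileChannel> held = new ArrayList<>();
        Path file = null;
        try {
            file = Files.createTempFile("highfd", ".tmp");
            for (int i = 0; i < count; i++) {
                // Each open channel pins one file descriptor.
                held.add(FileChannel.open(file, StandardOpenOption.READ));
            }
            try (Selector selector = Selector.open();
                 SocketChannel sc = SocketChannel.open()) {
                sc.configureBlocking(false);
                // On an affected epoll-based build, this is the call that
                // was reported to throw once the descriptor reaches 64k.
                SelectionKey key = sc.register(selector, SelectionKey.OP_READ);
                return key.isValid();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (FileChannel fc : held) {
                try { fc.close(); } catch (IOException ignored) { }
            }
            if (file != null) {
                try { Files.deleteIfExists(file); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        // 64 * 1024 needs a raised descriptor limit; default to a small count.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 32;
        System.out.println("registered: " + registerAfterHolding(count));
    }
}
```

Run with a small count, the sketch only checks that the mechanics work; reproducing the reported failure would require passing a count of at least 64 * 1024 on an affected build.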
diff --git a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java --- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java +++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java @@ -164,6 +164,16 @@ } /** + * Returns {@code true} if updates for the given key (file + * descriptor) are killed. + */ + private boolean isEventsHighKilled(Integer key) { + assert key >= MAX_UPDATE_ARRAY_SIZE; + Byte value = eventsHigh.get(key); + return (value != null && value == KILLED); + } + + /** * Sets the pending update events for the given file descriptor. This * method has no effect if the update events is already set to KILLED, * unless {@code force} is {@code true}. @@ -175,7 +185,7 @@ } } else { Integer key = Integer.valueOf(fd); - if ((eventsHigh.get(key) != KILLED) || force) { + if (!isEventsHighKilled(key) || force) { eventsHigh.put(key, Byte.valueOf(events)); } } -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130917/9309f067/attachment.html From nmaurer at redhat.com Tue Sep 17 04:54:24 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Tue, 17 Sep 2013 13:54:24 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <52382AA4.3050404@oracle.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> Message-ID: Hi Alan, just a tiny thing but why not check if force is true before try to access the Map (like in my proposed patches)? Not sure it gives much gain in terms of performance but it can't harm at all? About the test-case, I think it's important to have a test cover the usage of the eventsHigh Map. Otherwise it's just "too easy" to break things and it may be take some time to get noticed as only "a few" users are affected. 
I think you could even "adjust" the MAX_UPDATE_ARRAY_SIZE with reflection if you not want to expose it. WDYT ? --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat Am 17.09.2013 um 12:10 schrieb Alan Bateman : > On 16/09/2013 21:02, Norman Maurer wrote: >> >> Hi Alan, >> >> for sure I want to contribute a patch. Seems like the patches / reproducer are visible in the mail archive[1]. >> >> Anyway here are the files included inline: > Attachments are usually stripped by the OpenJDK mail servers so I'm surprised they are there. > > On the test case (or reproducer): as there is configuration required to allow >64k connections then a simpler way to demonstrate it is to initially consume 64k file descriptors (by opening files or create SocketChannels). That's what I did when I saw your mail in order to duplicate it quickly. We need to decide whether to attempt to include a test case with this patch. I'm in two minds on this as I don't know how often the test will be run in environments where the maximum number of file descriptors is unlimited or high enough. An alternative (for a future patch perhaps) is to make MAX_UPDATE_ARRAY_SIZE configurable so that the spilling can tested without requiring the maximum number of file descriptors to be increased. > > Your patch is essentially the same as what I tested with yesterday. I've changed to the attached (to make it consistent with the eventsLow code mostly). If you are okay with this then I will push it listing you as contributor. > > -Alan. > > > diff --git a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java > --- a/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java > +++ b/src/solaris/classes/sun/nio/ch/EPollArrayWrapper.java > @@ -164,6 +164,16 @@ > } > > /** > + * Returns {@code true} if updates for the given key (file > + * descriptor) are killed. 
> + */ > + private boolean isEventsHighKilled(Integer key) { > + assert key >= MAX_UPDATE_ARRAY_SIZE; > + Byte value = eventsHigh.get(key); > + return (value != null && value == KILLED); > + } > + > + /** > * Sets the pending update events for the given file descriptor. This > * method has no effect if the update events is already set to KILLED, > * unless {@code force} is {@code true}. > @@ -175,7 +185,7 @@ > } > } else { > Integer key = Integer.valueOf(fd); > - if ((eventsHigh.get(key) != KILLED) || force) { > + if (!isEventsHighKilled(key) || force) { > eventsHigh.put(key, Byte.valueOf(events)); > } > } > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130917/503c764a/attachment.html From Alan.Bateman at oracle.com Tue Sep 17 05:08:53 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 17 Sep 2013 13:08:53 +0100 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> Message-ID: <52384655.6040001@oracle.com> On 17/09/2013 12:54, Norman Maurer wrote: > Hi Alan, > > just a tiny thing but why not check if force is true before try to > access the Map (like in my proposed patches)? Not sure it gives much > gain in terms of performance but it can't harm at all? It's so that it is consistent with the eventsLow check. Performance-wise I don't see any difference as updated aren't forced when changing the registration. If you want to check then it's okay but I'd prefer that the order be consistent for both eventsLow and eventsHigh. > > About the test-case, I think it's important to have a test cover the > usage of the eventsHigh Map. Otherwise it's just "too easy" to break > things and it may be take some time to get noticed as only "a few" > users are affected. 
I think you could even "adjust" the > MAX_UPDATE_ARRAY_SIZE with reflection if you not want to expose it. Using reflection to change is a good idea (better than using a system property as I was initially thinking). Do you want to contribute a test? The simplest would be to re-run one of the existing stress tests with this set to a small value. -Alan. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130917/29c87b49/attachment.html From nmaurer at redhat.com Tue Sep 17 05:12:10 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Tue, 17 Sep 2013 14:12:10 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <52384655.6040001@oracle.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> Message-ID: Am 17.09.2013 um 14:08 schrieb Alan Bateman : > On 17/09/2013 12:54, Norman Maurer wrote: >> >> Hi Alan, >> >> just a tiny thing but why not check if force is true before try to access the Map (like in my proposed patches)? Not sure it gives much gain in terms of performance but it can't harm at all? > It's so that it is consistent with the eventsLow check. Performance-wise I don't see any difference as updated aren't forced when changing the registration. If you want to check then it's okay but I'd prefer that the order be consistent for both eventsLow and eventsHigh. Ok I think it should be ok then. Thanks for clarify. > >> >> About the test-case, I think it's important to have a test cover the usage of the eventsHigh Map. Otherwise it's just "too easy" to break things and it may be take some time to get noticed as only "a few" users are affected. I think you could even "adjust" the MAX_UPDATE_ARRAY_SIZE with reflection if you not want to expose it. 
> Using reflection to change is a good idea (better than using a system property as I was initially thinking). Do you want to contribute a test? The simplest would be to re-run one of the existing stress tests with this set to a small value.

Sure, why not. This would be a good thing to complete the fix and prove it. Can you point me to one of the tests you are referring to?

>
> -Alan.

---
Norman Maurer
nmaurer at redhat.com

JBoss, by Red Hat

From johannes.rudolph at googlemail.com Tue Sep 17 05:42:44 2013
From: johannes.rudolph at googlemail.com (Johannes Rudolph)
Date: Tue, 17 Sep 2013 14:42:44 +0200
Subject: Fwd: OP_CONNECT, connect, and finishConnect fail
In-Reply-To:
References:
Message-ID:

Hi there,

This is basically a follow-up to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6371630

The problem is that under Linux, for outgoing connections, bad things happen if you select for OP_CONNECT before even attempting the connect.

The sequence of calls is this:

* SocketChannel.open
* ch.configureBlocking(false)
* ch.register(selector, OP_CONNECT)
* selector.select() instantly returns and reports the channel as connected (it doesn't matter if you do the actual select call after the connection attempt)
* ch.connect(unrespondingHostAddress) which returns false
* ch.finishConnect() which now always returns true; the OS-level socket itself, however, never received any response from the peer for its SYN packet
* ch.isConnected() returns true
* if the host would eventually establish the connection the socket would be usable (as OP_READ and OP_WRITE still work as intended), so the main problem is that a connection is reported as established when, in fact, it may never make any progress with connection establishment

Of course, you could argue that registering for OP_CONNECT before calling connect is a user error, but it is neither forbidden by the documentation nor in any way prevented at runtime. All of the later behavior of `finishConnect` makes no sense at all. Also, the actual call sequence can be much more complicated in a common multi-threaded setting, so the calls registering the channel with the selector and the connection attempt may be executed concurrently, making this bug even harder to find.

Here's a standalone example exhibiting the behavior: https://gist.github.com/jrudolph/6535400

We discovered the problem here: https://www.assembla.com/spaces/akka/tickets/3602-io--tcp-connection-establishment-always-succeeds-even-if-endpoint-never-answers

Cheers,
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

From Alan.Bateman at oracle.com Tue Sep 17 06:08:15 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 17 Sep 2013 14:08:15 +0100
Subject: Fwd: OP_CONNECT, connect, and finishConnect fail
In-Reply-To:
References:
Message-ID: <5238543F.1090104@oracle.com>

On 17/09/2013 13:42, Johannes Rudolph wrote:
> :
>
> Of course, you could argue that registering for OP_CONNECT before
> calling connect is a user error but is neither forbidden by the
> documentation nor in any way prevented at runtime. All of the later
> behavior of `finishConnect` makes no sense at all. Also the actual
> call-sequence can usually be much more complicated in a common
> multi-threaded setting so the actual calls registering the channel to
> the selector and the connection attempt may be executed concurrently
> for some reasons making this bug even harder to find.
>
This topic has been discussed a number of times and we'd like to update the javadoc to make it clear how to use this API.
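
A commonly used ordering, sketched here as an editorial illustration rather than as wording from the javadoc, is to call connect() first and register for OP_CONNECT only if it returned false. The class name, the helper names and the timeout handling below are assumptions made for the example; the loopback server exists only so the sketch can run on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical helper showing the connect-before-register ordering.
class ConnectOrderSketch {
    static SocketChannel connectNonBlocking(SocketAddress remote, long timeoutMillis) {
        SocketChannel ch = null;
        try (Selector selector = Selector.open()) {
            ch = SocketChannel.open();
            ch.configureBlocking(false);
            // 1. Initiate the connection first.
            boolean connected = ch.connect(remote);
            if (!connected) {
                // 2. Only now express interest in OP_CONNECT.
                SelectionKey key = ch.register(selector, SelectionKey.OP_CONNECT);
                while (!connected) {
                    if (selector.select(timeoutMillis) == 0) {
                        throw new IOException("connect did not complete in time");
                    }
                    selector.selectedKeys().clear();
                    // 3. finishConnect() completes the attempt or throws on failure.
                    connected = ch.finishConnect();
                }
                key.cancel();
            }
            return ch;
        } catch (IOException e) {
            if (ch != null) {
                try { ch.close(); } catch (IOException ignored) { }
            }
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained check against a throwaway server on the loopback address.
    static boolean connectsToLocalServer() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel ch = connectNonBlocking(server.getLocalAddress(), 5000)) {
                return ch.isConnected();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectsToLocalServer());
    }
}
```

With this ordering the selector is never asked about OP_CONNECT on a channel that has no connection attempt in progress, which is the situation the report describes.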
I assume you don't have an issue when you register for OP_CONNECT after you have initiate the connect and after you have checked that it returned false (to indicate that the connection wasn't established immediately). -Alan. From nmaurer at redhat.com Tue Sep 17 11:11:40 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Tue, 17 Sep 2013 20:11:40 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> Message-ID: Sorry forgot to say that I'm ok with your proposed fix now. So feel free to push and add me as contributor. Hopefully I will have a test for you tomorrow. Bye, Norman --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat Am 17.09.2013 um 14:12 schrieb Norman Maurer : > Am 17.09.2013 um 14:08 schrieb Alan Bateman : > >> On 17/09/2013 12:54, Norman Maurer wrote: >>> >>> Hi Alan, >>> >>> just a tiny thing but why not check if force is true before try to access the Map (like in my proposed patches)? Not sure it gives much gain in terms of performance but it can't harm at all? >> It's so that it is consistent with the eventsLow check. Performance-wise I don't see any difference as updated aren't forced when changing the registration. If you want to check then it's okay but I'd prefer that the order be consistent for both eventsLow and eventsHigh. > > Ok I think it should be ok then. Thanks for clarify. > >> >>> >>> About the test-case, I think it's important to have a test cover the usage of the eventsHigh Map. Otherwise it's just "too easy" to break things and it may be take some time to get noticed as only "a few" users are affected. I think you could even "adjust" the MAX_UPDATE_ARRAY_SIZE with reflection if you not want to expose it. >> Using reflection to change is a good idea (better than using a system property as I was initially thinking). 
Do you want to contribute a test? The simplest would be to re-run one of the existing stress tests with this set to a small value. > > Sure why not, this would be good thing to complete the fix and proof it. Can you point me to one of the tests you are refer to ? > >> >> -Alan. > > > --- > Norman Maurer > nmaurer at redhat.com > > JBoss, by Red Hat > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130917/8730f9ac/attachment.html From cowwoc at bbs.darktech.org Tue Sep 17 21:36:38 2013 From: cowwoc at bbs.darktech.org (cowwoc) Date: Wed, 18 Sep 2013 00:36:38 -0400 Subject: Bug #7063249 Message-ID: <52392DD6.5060700@bbs.darktech.org> Hi Alan, Just reminding you that we originally agreed to try to get this into JDK7, and having failed that, into JDK8. I see that http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7063249 is still marked as unresolved and that we have under a month to get it into JDK8. Can you please let me know what needs to be done to get this into JDK8? Thank you, Gili From yiming.wang at oracle.com Wed Sep 18 01:14:36 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Wed, 18 Sep 2013 16:14:36 +0800 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <5230463B.8060402@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <5230463B.8060402@oracle.com> Message-ID: <523960EC.4080005@oracle.com> Hi Chris, Yes, It looks a bit odd, i tried to cut down thread number of UdpEchoRequest to make response sent before socket timeout. Essentially, it is a timing issue. so the new fix below is to update change to SO_TIMEOUT value from 5 seconds to 15. I have tested on jsn-vm49.us for 20000 times and all passed. 
http://cr.openjdk.java.net/~ewang/8015762/webrev.02/test/java/nio/channels/DatagramChannel/AdaptDatagramSocket.java.sdiff.html Can you please help to review? Thanks, Eric On 2013/9/11 18:30, Chris Hegarty wrote: > On 11/09/2013 10:47, Eric Wang wrote: >> >> On 2013/9/11 17:23, Alan Bateman wrote: >>> On 11/09/2013 09:58, Eric Wang wrote: >>>> Hi Alan, >>>> >>>> Sorry for late. I have re-fixed this failure, Can you please help to >>>> review? >>>> I executed the tests on the host jsn-vm49.us for thousands times and >>>> found the test failed as setting SO_TIMEOUT for 5 seconds is not >>>> enough to wait response sent by a new created thread of >>>> UdpEchoRequest. It may caused by thread schedule as there's maybe >>>> more than 3 threads executing at sametime or full GC as lots of >>>> UdpEchoRequest created in runtime. >>>> >>>> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >>>> create a new thread of UdpEchoRequest to send response. I have run >>>> the fix for 20000 times, it works fine. >>>> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >>>> >>>> >>> Thanks for confirming that the 5 second timeout is insufficient, that >>> part is clear now. >>> >>> The webrev also updates TestServers so that start runs the task >>> directly. Is this meant to be part of this change? >>> >>> -Alan. >>> >> Yes, it is a part of the fix, i have tested if only update timeout to 10 >> sec, the test maybe still failed as the new created thread of >> UdpEchoRequest doesn't get chance to run. so I workaround the thread. > > This does look a little odd. I'm don't see why it is necessary. If > there is a still a timing issue should there be some other form of > synchronization between threads? > > -Chris. 
> >> >> Thanks, >> Eric >> From chris.hegarty at oracle.com Wed Sep 18 01:35:58 2013 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Wed, 18 Sep 2013 09:35:58 +0100 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523960EC.4080005@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <5230463B.8060402@oracle.com> <523960EC.4080005@oracle.com> Message-ID: <523965EE.60708@oracle.com> This looks ok to me ( since we are not expecting a timeout anyway, the change to 15 secs should not have any negative impact ). -Chris. On 09/18/2013 09:14 AM, Eric Wang wrote: > Hi Chris, > > Yes, It looks a bit odd, i tried to cut down thread number of > UdpEchoRequest to make response sent before socket timeout. > Essentially, it is a timing issue. so the new fix below is to update > change to SO_TIMEOUT value from 5 seconds to 15. I have tested on > jsn-vm49.us for 20000 times and all passed. > http://cr.openjdk.java.net/~ewang/8015762/webrev.02/test/java/nio/channels/DatagramChannel/AdaptDatagramSocket.java.sdiff.html > > > > Can you please help to review? > Thanks, > Eric > On 2013/9/11 18:30, Chris Hegarty wrote: >> On 11/09/2013 10:47, Eric Wang wrote: >>> >>> On 2013/9/11 17:23, Alan Bateman wrote: >>>> On 11/09/2013 09:58, Eric Wang wrote: >>>>> Hi Alan, >>>>> >>>>> Sorry for late. I have re-fixed this failure, Can you please help to >>>>> review? >>>>> I executed the tests on the host jsn-vm49.us for thousands times and >>>>> found the test failed as setting SO_TIMEOUT for 5 seconds is not >>>>> enough to wait response sent by a new created thread of >>>>> UdpEchoRequest. It may caused by thread schedule as there's maybe >>>>> more than 3 threads executing at sametime or full GC as lots of >>>>> UdpEchoRequest created in runtime. 
>>>>> >>>>> The fix is to change the SO_TIMEOUT from 5 seconds to 10 and not >>>>> create a new thread of UdpEchoRequest to send response. I have run >>>>> the fix for 20000 times, it works fine. >>>>> http://cr.openjdk.java.net/~ewang/8015762/webrev.01/ >>>>> >>>>> >>>> Thanks for confirming that the 5 second timeout is insufficient, that >>>> part is clear now. >>>> >>>> The webrev also updates TestServers so that start runs the task >>>> directly. Is this meant to be part of this change? >>>> >>>> -Alan. >>>> >>> Yes, it is a part of the fix, i have tested if only update timeout to 10 >>> sec, the test maybe still failed as the new created thread of >>> UdpEchoRequest doesn't get chance to run. so I workaround the thread. >> >> This does look a little odd. I'm don't see why it is necessary. If >> there is a still a timing issue should there be some other form of >> synchronization between threads? >> >> -Chris. >> >>> >>> Thanks, >>> Eric >>> > From Alan.Bateman at oracle.com Wed Sep 18 01:51:05 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 18 Sep 2013 09:51:05 +0100 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523960EC.4080005@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <5230463B.8060402@oracle.com> <523960EC.4080005@oracle.com> Message-ID: <52396979.10806@oracle.com> On 18/09/2013 09:14, Eric Wang wrote: > Hi Chris, > > Yes, It looks a bit odd, i tried to cut down thread number of > UdpEchoRequest to make response sent before socket timeout. > Essentially, it is a timing issue. so the new fix below is to update > change to SO_TIMEOUT value from 5 seconds to 15. I have tested on > jsn-vm49.us for 20000 times and all passed. 
> http://cr.openjdk.java.net/~ewang/8015762/webrev.02/test/java/nio/channels/DatagramChannel/AdaptDatagramSocket.java.sdiff.html > > > > I'm okay with this too. -Alan. From Alan.Bateman at oracle.com Wed Sep 18 02:05:38 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 18 Sep 2013 10:05:38 +0100 Subject: Bug #7063249 In-Reply-To: <52392DD6.5060700@bbs.darktech.org> References: <52392DD6.5060700@bbs.darktech.org> Message-ID: <52396CE2.8060400@oracle.com> On 18/09/2013 05:36, cowwoc wrote: > Hi Alan, > > Just reminding you that we originally agreed to try to get this > into JDK7, and having failed that, into JDK8. I see that > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7063249 is still > marked as unresolved and that we have under a month to get it into > JDK8. Can you please let me know what needs to be done to get this > into JDK8? To my knowledge, there hasn't been any further discussion on this since it was discussed here (in July 2011, just after JDK 7 was released). As it currently stands then if there isn't a timeout then an asynchronous I/O operation completes when the I/O operation completes (or the channel is closed). When there is a timeout specified then the I/O operation may fail early with InterruptedByTimeoutException. So this was a discussion about translating familiar usages of timeout (on synchronous methods) to how those timeouts should work with asynchronous operations. You brought up that a timeout <= 0 should mean the I/O operation completes immediately and I don't think we fully established whether this was the right thing to do, at least for the case where the I/O operation cannot complete immediately. So I believe I suggested this topic needed further consideration and 7063249 was the reminder to re-examine the topic. Unfortunately I have not had cycles myself to explore this further since then (sorry about, just way too many other things going on). -Alan. 
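For readers following the 7063249 thread, the current behaviour Alan describes (a positive timeout may cause the operation to fail early with InterruptedByTimeoutException) can be sketched roughly as below. This is only an illustration under stated assumptions: the class name AsyncReadTimeout and the helper readWithTimeout are made up for this note, the peer is a loopback connection that never writes, and only a positive timeout is exercised because the meaning of timeout <= 0 is exactly what is under discussion.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.channels.InterruptedByTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK or of any webrev in this thread.
public class AsyncReadTimeout {

    /**
     * Starts an asynchronous read with the given (positive) timeout against a
     * loopback peer that never writes. Returns the Throwable passed to
     * failed(), or null if the read completed or nothing was reported in time.
     */
    static Throwable readWithTimeout(long timeout, TimeUnit unit) throws Exception {
        try (AsynchronousServerSocketChannel listener = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0))) {
            InetSocketAddress address = (InetSocketAddress) listener.getLocalAddress();
            Future<AsynchronousSocketChannel> accepted = listener.accept();
            try (AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
                client.connect(address).get();
                try (AsynchronousSocketChannel peer = accepted.get()) {
                    final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();
                    final CountDownLatch done = new CountDownLatch(1);
                    // The call itself returns immediately; the outcome is delivered to the handler.
                    client.read(ByteBuffer.allocate(16), timeout, unit, null,
                            new CompletionHandler<Integer, Void>() {
                                @Override
                                public void completed(Integer bytesRead, Void attachment) {
                                    done.countDown();
                                }

                                @Override
                                public void failed(Throwable exc, Void attachment) {
                                    failure.set(exc);
                                    done.countDown();
                                }
                            });
                    // Wait for the handler before the channels are closed.
                    done.await(10, TimeUnit.SECONDS);
                    return failure.get();
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable failure = readWithTimeout(200, TimeUnit.MILLISECONDS);
        // Per the API documentation this is expected to be an InterruptedByTimeoutException.
        System.out.println(failure instanceof InterruptedByTimeoutException);
    }
}
```

Note that read(...) does not block in either case; the timeout only decides when and how the handler is notified.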
From nmaurer at redhat.com Wed Sep 18 03:20:20 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Wed, 18 Sep 2013 12:20:20 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> Message-ID: Hi Alan, I tried to find one of those "existing stress-tests" but was not able yet. Can you just tell me where I can find one so I can actually write a test-case based on one of them. Bye, Norman --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat Am 17.09.2013 um 20:11 schrieb Norman Maurer : > Sorry forgot to say that I'm ok with your proposed fix now. So feel free to push and add me as contributor. Hopefully I will have a test for you tomorrow. > > Bye, > Norman > > --- > Norman Maurer > nmaurer at redhat.com > > JBoss, by Red Hat > > > > Am 17.09.2013 um 14:12 schrieb Norman Maurer : > >> Am 17.09.2013 um 14:08 schrieb Alan Bateman : >> >>> On 17/09/2013 12:54, Norman Maurer wrote: >>>> >>>> Hi Alan, >>>> >>>> just a tiny thing but why not check if force is true before try to access the Map (like in my proposed patches)? Not sure it gives much gain in terms of performance but it can't harm at all? >>> It's so that it is consistent with the eventsLow check. Performance-wise I don't see any difference as updated aren't forced when changing the registration. If you want to check then it's okay but I'd prefer that the order be consistent for both eventsLow and eventsHigh. >> >> Ok I think it should be ok then. Thanks for clarify. >> >>> >>>> >>>> About the test-case, I think it's important to have a test cover the usage of the eventsHigh Map. Otherwise it's just "too easy" to break things and it may be take some time to get noticed as only "a few" users are affected. I think you could even "adjust" the MAX_UPDATE_ARRAY_SIZE with reflection if you not want to expose it. 
>>> Using reflection to change is a good idea (better than using a system property as I was initially thinking). Do you want to contribute a test? The simplest would be to re-run one of the existing stress tests with this set to a small value. >> >> Sure why not, this would be good thing to complete the fix and proof it. Can you point me to one of the tests you are refer to ? >> >>> >>> -Alan. >> >> >> --- >> Norman Maurer >> nmaurer at redhat.com >> >> JBoss, by Red Hat >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/14148264/attachment.html From Alan.Bateman at oracle.com Wed Sep 18 05:16:42 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 18 Sep 2013 13:16:42 +0100 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> Message-ID: <523999AA.5090005@oracle.com> On 18/09/2013 11:20, Norman Maurer wrote: > Hi Alan, > > I tried to find one of those "existing stress-tests" but was not able > yet. Can you just tell me where I can find one so I can actually write > a test-case based on one of them. > Yeah, the Selector tests in the jdk/test tree are a bit of mixed bag tests for various long fixed issues. One thing that I remembered after our exchange is that the Selector implementations that we have for Solaris use the same approach for queuing updates. This came about when we re-worked the /dev/poll Selector to fix a number of performance and reliability issues. It's also in the relatively new port based Selector which I'd like to see made the default on Solaris at some point. So while only the epoll Selector has the problem (due to the kill logic) then I think it would be best if we provide a way to test all of the Selectors. 
To that end, I've added a special property to select the limit that we can use for testing. Your suggestion to use reflection would work too, as would having a test that adds to sun.nio.ch. The first approach seems the simplest so I've put a webrev here with the changes: http://cr.openjdk.java.net/~alanb/8024883/webrev/ It means we can run jtreg with -vmoption:-Dsun.nio.ch.maxUpdateArraySize=N and it will run all of the tests with this setting. This is the same approach when you want to test with an alternative Selector implementation. To ensure that there is at least some coverage in normal test runs then I've modified the test description of two tests so that they re-run with this property set to a smallish value. Let me know if you are agree with this. I realize this is extending the scope of your original bug report and patch a bit. -Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/5daca6f1/attachment-0001.html From nmaurer at redhat.com Wed Sep 18 05:25:39 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Wed, 18 Sep 2013 14:25:39 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <523999AA.5090005@oracle.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> Message-ID: <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> Am 18.09.2013 um 14:16 schrieb Alan Bateman : > On 18/09/2013 11:20, Norman Maurer wrote: >> >> Hi Alan, >> >> I tried to find one of those "existing stress-tests" but was not able yet. Can you just tell me where I can find one so I can actually write a test-case based on one of them. >> > Yeah, the Selector tests in the jdk/test tree are a bit of mixed bag tests for various long fixed issues. 
> > One thing that I remembered after our exchange is that the Selector implementations that we have for Solaris use the same approach for queuing updates. This came about when we re-worked the /dev/poll Selector to fix a number of performance and reliability issues. It's also in the relatively new port based Selector which I'd like to see made the default on Solaris at some point. > > So while only the epoll Selector has the problem (due to the kill logic) then I think it would be best if we provide a way to test all of the Selectors. To that end, I've added a special property to select the limit that we can use for testing. Your suggestion to use reflection would work too, as would having a test that adds to sun.nio.ch. The first approach seems the simplest so I've put a webrev here with the changes: > > http://cr.openjdk.java.net/~alanb/8024883/webrev/ > > It means we can run jtreg with -vmoption:-Dsun.nio.ch.maxUpdateArraySize=N and it will run all of the tests with this setting. This is the same approach when you want to test with an alternative Selector implementation. To ensure that there is at least some coverage in normal test runs then I've modified the test description of two tests so that they re-run with this property set to a smallish value. > > Let me know if you are agree with this. I realize this is extending the scope of your original bug report and patch a bit. > > -Alan > > > Looks good? Thanks for reviewing my bug report and the patches. Have this tested also for other Selectors makes a lot of sense. --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/f1f195f7/attachment.html From sean.coffey at oracle.com Wed Sep 18 06:05:33 2013 From: sean.coffey at oracle.com (=?windows-1252?Q?Se=E1n_Coffey?=) Date: Wed, 18 Sep 2013 14:05:33 +0100 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> Message-ID: <5239A51D.90703@oracle.com> Looks good to me also. Thanks for jumping on this Norman, Alan. regards, Sean. On 18/09/13 13:25, Norman Maurer wrote: > > > Am 18.09.2013 um 14:16 schrieb Alan Bateman >: > >> On 18/09/2013 11:20, Norman Maurer wrote: >>> Hi Alan, >>> >>> I tried to find one of those "existing stress-tests" but was not >>> able yet. Can you just tell me where I can find one so I can >>> actually write a test-case based on one of them. >>> >> Yeah, the Selector tests in the jdk/test tree are a bit of mixed bag >> tests for various long fixed issues. >> >> One thing that I remembered after our exchange is that the Selector >> implementations that we have for Solaris use the same approach for >> queuing updates. This came about when we re-worked the /dev/poll >> Selector to fix a number of performance and reliability issues. It's >> also in the relatively new port based Selector which I'd like to see >> made the default on Solaris at some point. >> >> So while only the epoll Selector has the problem (due to the kill >> logic) then I think it would be best if we provide a way to test all >> of the Selectors. To that end, I've added a special property to >> select the limit that we can use for testing. Your suggestion to use >> reflection would work too, as would having a test that adds to >> sun.nio.ch . 
The first approach seems the simplest >> so I've put a webrev here with the changes: >> >> http://cr.openjdk.java.net/~alanb/8024883/webrev/ >> >> It means we can run jtreg with >> -vmoption:-Dsun.nio.ch.maxUpdateArraySize=N and it will run all of >> the tests with this setting. This is the same approach when you want >> to test with an alternative Selector implementation. To ensure that >> there is at least some coverage in normal test runs then I've >> modified the test description of two tests so that they re-run with >> this property set to a smallish value. >> >> Let me know if you are agree with this. I realize this is extending >> the scope of your original bug report and patch a bit. >> >> -Alan >> >> >> > > Looks good? Thanks for reviewing my bug report and the patches. Have > this tested also for other Selectors makes a lot of sense. > > > --- > Norman Maurer > nmaurer at redhat.com > > JBoss, by Red Hat > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/2f75e504/attachment.html From Alan.Bateman at oracle.com Wed Sep 18 06:30:08 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 18 Sep 2013 14:30:08 +0100 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> Message-ID: <5239AAE0.6040605@oracle.com> On 18/09/2013 13:25, Norman Maurer wrote: > : > > Looks good? Thanks for reviewing my bug report and the patches. Have > this tested also for other Selectors makes a lot of sense. > I've pushed this to the jdk8/tl forest [1]. Thanks again for the bug report and patch. 
I've created a backport issue [2] in the bug database to track getting this into jdk7u. -Alan [1] hg.openjdk.java.net/jdk8/tl/jdk/rev/e92635d6834c [2] https://bugs.openjdk.java.net/browse/JDK-8024989 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/f03098d2/attachment.html From chris.w.dennis at gmail.com Wed Sep 18 06:53:06 2013 From: chris.w.dennis at gmail.com (Chris Dennis) Date: Wed, 18 Sep 2013 09:53:06 -0400 Subject: Bug in interrupt handling in FileChannelImpl.map(=?UTF-8?B?4oCm?=) In-Reply-To: <523464AE.9090104@oracle.com> Message-ID: I'm interested in pushing this issue to a conclusion, and I'm happy to contribute a fix. What forest should I be generating a patch against? On 9/14/13 9:29 AM, "Alan Bateman" wrote: >On 13/09/2013 16:04, Chris Dennis wrote: >> Hi All, >> >> I have discovered what I'm pretty certain is a bug in FileChannelImpl's >> interrupt handling. The root cause is that the map(?) method is calling >> size() from within it's begin()/end(?) block. >Thanks for this, it is indeed a bug and should be using nd.size() rather >than size(). > >I've created this bug to track it: > 8024833: (fc) FileChannel.map does not handle async close/interrupt >correctly > >-Alan. From Alan.Bateman at oracle.com Wed Sep 18 06:59:13 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 18 Sep 2013 14:59:13 +0100 Subject: Bug in interrupt handling in =?UTF-8?B?RmlsZUNoYW5uZWxJbXBsLg==?= =?UTF-8?B?bWFwKOKApik=?= In-Reply-To: References: Message-ID: <5239B1B1.5030205@oracle.com> On 18/09/2013 14:53, Chris Dennis wrote: > I'm interested in pushing this issue to a conclusion, and I'm happy to > contribute a fix. What forest should I be generating a patch against? > The "How to contribute" is the starting point. I see there is a Chris Dennis on the OCA list but I don't know if that is you. The forest that we push core area changes to is jdk8/tl. 
We try to include a test with all changes although hear I suspect it may be too troublesome to write a test that duplicates this without having side effects. -Alan [1] http://openjdk.java.net/contribute/ From chris.w.dennis at gmail.com Wed Sep 18 07:04:07 2013 From: chris.w.dennis at gmail.com (Chris Dennis) Date: Wed, 18 Sep 2013 10:04:07 -0400 Subject: Bug in interrupt handling in FileChannelImpl.map(=?UTF-8?B?4oCm?=) In-Reply-To: <5239B1B1.5030205@oracle.com> Message-ID: The Chris Dennis on the OCA list is me, I've just switched my openjdk mail subscriptions to my gmail address due to work email quotas. I'll prepare a patch against jdk8/tl, and look in to the possibility of creating a test that attempts to reproduce this (although as you say it may be impractical). Thanks, Chris On 9/18/13 9:59 AM, "Alan Bateman" wrote: >On 18/09/2013 14:53, Chris Dennis wrote: >> I'm interested in pushing this issue to a conclusion, and I'm happy to >> contribute a fix. What forest should I be generating a patch against? >> >The "How to contribute" is the starting point. I see there is a Chris >Dennis on the OCA list but I don't know if that is you. > >The forest that we push core area changes to is jdk8/tl. We try to >include a test with all changes although hear I suspect it may be too >troublesome to write a test that duplicates this without having side >effects. > >-Alan > >[1] http://openjdk.java.net/contribute/ > > From cowwoc at bbs.darktech.org Wed Sep 18 10:17:34 2013 From: cowwoc at bbs.darktech.org (cowwoc) Date: Wed, 18 Sep 2013 10:17:34 -0700 (PDT) Subject: Bug #7063249 In-Reply-To: <52396CE2.8060400@oracle.com> References: <52392DD6.5060700@bbs.darktech.org> <52396CE2.8060400@oracle.com> Message-ID: <1379524654898-7575193.post@n2.nabble.com> Alan Bateman-2 wrote > So this was a discussion about translating familiar usages of timeout > (on synchronous methods) to how those timeouts should work with > asynchronous operations. 
You brought up that a timeout <= 0 should mean > the I/O operation completes immediately and I don't think we fully > established whether this was the right thing to do, at least for the > case where the I/O operation cannot complete immediately. So I believe > I suggested this topic needed further consideration and 7063249 was the > reminder to re-examine the topic. Unfortunately I have not had cycles > myself to explore this further since then (sorry about, just way too > many other things going on). Hi Alan, Regarding whether timeout <= 0 returning immediately is the right thing to do, I'd like to remind you of the following: Doug explaining why the behavior is desirable: http://mail.openjdk.java.net/pipermail/nio-discuss/2009-June/000239.html You agreeing with him and committing to making the change: http://mail.openjdk.java.net/pipermail/nio-discuss/2009-June/000240.html I provide additional reasons here: http://nio-dev.3157472.n2.nabble.com/AsynchronousSocketChannel-still-throws-unspecified-exception-td6471557.html#a6579825 The bug report was filed to remind you to make this change, not because this "needed further consideration". Here is what happened... We had a prolonged discussion about changing the meaning of timeouts, in the end you agreed to make the change, as the months went on you forgot about it, and by the time I reminded you it was too late to merge into JDK7. At that point you asked me to file a bug report as a reminder to integrate this into JDK8 shortly after the JDK7 release. I filed the bug report under the assumption that you would set "target milestone" to JDK8 (as we had agreed) but it looks like this never happened. So... with all that history out of the way, how do we go about ensuring this gets integrated into JDK8? Thanks, Gili -- View this message in context: http://nio-dev.3157472.n2.nabble.com/Bug-7063249-tp7575180p7575193.html Sent from the nio-dev mailing list archive at Nabble.com. 
From Alan.Bateman at oracle.com Wed Sep 18 10:57:49 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 18 Sep 2013 18:57:49 +0100
Subject: Bug #7063249
In-Reply-To: <1379524654898-7575193.post@n2.nabble.com>
References: <52392DD6.5060700@bbs.darktech.org> <52396CE2.8060400@oracle.com> <1379524654898-7575193.post@n2.nabble.com>
Message-ID: <5239E99D.7040904@oracle.com>

On 18/09/2013 18:17, cowwoc wrote:
> :
>
> Regarding whether timeout <= 0 returning immediately is the right thing to
> do,

All these methods return immediately (they aren't synchronous, they never block). The issue that we did not come to a conclusion on was whether there should be API support for having I/O operations that complete immediately when there aren't any bytes transferred. That needs further consideration to see if it makes sense; the API issue is somewhat secondary to that.

> I'd like to remind you of the following:
>
> Doug explaining why the behavior is desirable:
> http://mail.openjdk.java.net/pipermail/nio-discuss/2009-June/000239.html
>
> You agreeing with him and committing to making the change:
> http://mail.openjdk.java.net/pipermail/nio-discuss/2009-June/000240.html

I think this was actually 6878369, which was pushed soon after the discussion.

I'm sorry that we didn't get time to explore further since then. Unfortunately I personally do not have the cycles to think about it for JDK 8; there are just way too many other things going on.

-Alan.
From nmaurer at redhat.com Wed Sep 18 11:23:36 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Wed, 18 Sep 2013 20:23:36 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <5239AAE0.6040605@oracle.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> <5239AAE0.6040605@oracle.com> Message-ID: <3111FB94-E914-45AF-A65C-473C8252643A@redhat.com> Thanks, you want a backport patch ? --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat Am 18.09.2013 um 15:30 schrieb Alan Bateman : > On 18/09/2013 13:25, Norman Maurer wrote: >> >> : >> >> Looks good? Thanks for reviewing my bug report and the patches. Have this tested also for other Selectors makes a lot of sense. >> > I've pushed this to the jdk8/tl forest [1]. Thanks again for the bug report and patch. > > I've created a backport issue [2] in the bug database to track getting this into jdk7u. > > -Alan > > [1] hg.openjdk.java.net/jdk8/tl/jdk/rev/e92635d6834c > [2] https://bugs.openjdk.java.net/browse/JDK-8024989 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130918/c7dde6dd/attachment.html From cowwoc at bbs.darktech.org Wed Sep 18 13:55:57 2013 From: cowwoc at bbs.darktech.org (cowwoc) Date: Wed, 18 Sep 2013 13:55:57 -0700 (PDT) Subject: Bug #7063249 In-Reply-To: <5239E99D.7040904@oracle.com> References: <52392DD6.5060700@bbs.darktech.org> <52396CE2.8060400@oracle.com> <1379524654898-7575193.post@n2.nabble.com> <5239E99D.7040904@oracle.com> Message-ID: <1379537757625-7575196.post@n2.nabble.com> Alan Bateman-2 wrote > On 18/09/2013 18:17, cowwoc wrote: >> : >> >> Regarding whether timeout<= 0 returning immediately is the right thing to >> do, > All these methods return immediately (they aren't synchronous, they > never block). The issue that that we did not come to a conclusion was > whether there should be API support for having I/O operating that > complete immediately when there aren't any bytes transferred. That > needs further consideration to see if it make sense, the API issue is > somewhat secondary to that. Alan, That's not what I'm saying. What I'm saying is that we already discussed and agreed to change the API so that "timeout <= 0 should result in the asynchronous read/write operations completing immediately". Here is what I've managed to piece together by examining the dates of the mailing list and bug reports: 1. In http://mail.openjdk.java.net/pipermail/nio-discuss/2009-June/000240.html Doug wrote that timeout <= 0 must complete immediately because an API that does otherwise "can surprisingly misbehave if timeLimit - elapsed happens to be 0." and "I agree that this should be changed. Any method taking a TimeUnit should obey j.u.c conventions. Which means separating out the no-timeout case as another method." According to http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html j.u.c conventions state that timeout "less than or equal to zero to mean not to wait at all". 
In other words, timeout <= 0 means the operation should complete immediately. What Doug was saying is that the "no-timeout" case (where operations block forever) should be separated into another method and that the method *with* a timeout parameter should complete immediately if the timeout is <= 0. In response, you wrote: "I'll fix this shortly so that <= 0 means no-timeout". Doug and I interpreted this as you agreeing but it sounds like you misunderstood and implemented the exact opposite behavior.

2. Two months later someone filed 6878369 stating that operations with timeout < 0 "do [...] not have an associated timeout." Notice the critical mistakes here:

   a. This bug report discusses timeout < 0 instead of timeout <= 0.
   b. Its description is a direct contradiction of what we agreed to on the mailing list. We agreed that the operation should complete immediately. Instead, this bug report says that we agreed that the operation should "have no timeout".

3. Two years later I pulled down the latest JDK7 build (a month before the release) and noticed that #1 had not been implemented. When I brought it to your attention in http://nio-dev.3157472.n2.nabble.com/AsynchronousSocketChannel-still-throws-unspecified-exception-tp6471557p6473226.html you wrote: "It's unfortunate that this got forgotten but I think we can fix this early in jdk8." I filed 7063249 to make sure this does not get forgotten.

I hope you can understand why this can be frustrating. I repeatedly tried to steer the API in the right direction and repeatedly it ended up going in the exact opposite direction. I understand that you are too busy to consider this issue, but then the question becomes:

1. Who else at Oracle can work with me to get this into JDK8?
2. How do we make sure that this doesn't happen again?

Thank you,
Gili

--
View this message in context: http://nio-dev.3157472.n2.nabble.com/Bug-7063249-tp7575180p7575196.html
Sent from the nio-dev mailing list archive at Nabble.com.
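The java.util.concurrent convention referred to above (a timeout less than or equal to zero means "do not wait at all") can be illustrated with a small sketch. The class name TimeoutConvention and the helper pollWithBudget are invented for this illustration; the sketch uses ArrayBlockingQueue from j.u.c rather than the asynchronous channel API, since the channel behaviour is the point in dispute. It mirrors the "timeLimit - elapsed" pattern Doug mentioned, where the remaining time can legitimately reach zero or go negative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class illustrating the j.u.c timeout convention.
public class TimeoutConvention {

    /**
     * Polls the queue with whatever remains of an overall time budget.
     * The remaining time may be zero or negative; under j.u.c conventions
     * that means "check once and do not wait", not "wait forever".
     */
    static String pollWithBudget(BlockingQueue<String> queue, long budgetNanos, long elapsedNanos)
            throws InterruptedException {
        long remaining = budgetNanos - elapsedNanos; // may be <= 0
        return queue.poll(remaining, TimeUnit.NANOSECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1);

        // Budget exactly used up: the empty queue is checked once, no waiting.
        System.out.println(pollWithBudget(queue, 1000, 1000)); // prints "null"

        // Budget overrun (negative remaining time): same outcome.
        System.out.println(pollWithBudget(queue, 1000, 5000)); // prints "null"

        // An element that is already available is still returned.
        queue.add("ready");
        System.out.println(pollWithBudget(queue, 1000, 5000)); // prints "ready"
    }
}
```

Under this convention a caller can pass the computed remaining time straight through without special-casing zero, which is the property being argued for the timed read/write methods.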
From Alan.Bateman at oracle.com Thu Sep 19 00:37:56 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 19 Sep 2013 08:37:56 +0100 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <3111FB94-E914-45AF-A65C-473C8252643A@redhat.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> <5239AAE0.6040605@oracle.com> <3111FB94-E914-45AF-A65C-473C8252643A@redhat.com> Message-ID: <523AA9D4.4070702@oracle.com> On 18/09/2013 19:23, Norman Maurer wrote: > Thanks, > > you want a backport patch ? The process for getting changes into jdk7u is on the jdk7u project page [1]. In areas when the jdk7u and 8 code is the same then we usually just use hg export + hg import, and in the approval request we can just point to the jdk8 changeset. Do you want do this? Sean Coffey (one of the jdk7u maintainers) is supportive of getting into jdk7u-dev quickly although we should probably give it a few days in jdk8 to see if anything comes out of the woodwork. -Alan [1] http://openjdk.java.net/projects/jdk7u/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130919/533b5013/attachment.html From nmaurer at redhat.com Thu Sep 19 02:58:20 2013 From: nmaurer at redhat.com (Norman Maurer) Date: Thu, 19 Sep 2013 11:58:20 +0200 Subject: Regression in EPollArrayWrapper causes NPE when fd > 64 * 1024 In-Reply-To: <523AA9D4.4070702@oracle.com> References: <9DC59A1F-9EFB-418C-89C9-4FDE26177B90@redhat.com> <5237448D.2050804@oracle.com> <52382AA4.3050404@oracle.com> <52384655.6040001@oracle.com> <523999AA.5090005@oracle.com> <88FD79D2-E002-418D-A35A-2521C1508F1E@redhat.com> <5239AAE0.6040605@oracle.com> <3111FB94-E914-45AF-A65C-473C8252643A@redhat.com> <523AA9D4.4070702@oracle.com> Message-ID: <2D04301C-B74E-4FE5-A1E5-D57CEC578583@redhat.com> Am 19.09.2013 um 09:37 schrieb Alan Bateman : > On 18/09/2013 19:23, Norman Maurer wrote: >> >> Thanks, >> >> you want a backport patch ? > The process for getting changes into jdk7u is on the jdk7u project page [1]. In areas when the jdk7u and 8 code is the same then we usually just use hg export + hg import, and in the approval request we can just point to the jdk8 changeset. Do you want do this? Sean Coffey (one of the jdk7u maintainers) is supportive of getting into jdk7u-dev quickly although we should probably give it a few days in jdk8 to see if anything comes out of the woodwork. > > -Alan > > [1] http://openjdk.java.net/projects/jdk7u/ Sure? Let me wait 2 days and then write an email to jdk7u-dev and ask for review. Bye, Norman --- Norman Maurer nmaurer at redhat.com JBoss, by Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130919/8cb13dcb/attachment.html From martinrb at google.com Thu Sep 19 10:17:27 2013 From: martinrb at google.com (Martin Buchholz) Date: Thu, 19 Sep 2013 10:17:27 -0700 Subject: Classload of sun.nio.ch.Net fails - regression in jdk7u25 Message-ID: Hi, this is a bug report. 
Here is a tiny program that does class loading:

public class LoadClass {
    public static void main(String[] args) throws Throwable {
        for (String className : args)
            Class.forName(className, true, null);
    }
}

If I run this against 1.7.0_21, it succeeds, but if I run it against 1.7.0_25 I get

java LoadClass sun.nio.ch.Net
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.Net.isExclusiveBindAvailable()I
    at sun.nio.ch.Net.isExclusiveBindAvailable(Native Method)
    at sun.nio.ch.Net.<clinit>(Net.java:58)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)

sun.nio.ch.Net clearly tries to prevent this from happening by calling Util.load(), but the problem is that the static block is called too late. Those static blocks calling Util.load() need to be at the top of each source file instead of the bottom, to prevent such failures.

Regression was introduced with this changeset:

changeset: 6272:8dd8266a2f4b
user: khazra
date: Thu Mar 14 13:54:32 2013 -0700
summary: 7170730: Improve Windows network stack support.

I think Oracle testing folks should regularly run the above little program against every single class in the JDK (although it might be too expensive to run in a jtreg test). The fix is obvious, but I can provide a webrev if desired.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130919/6abe6824/attachment.html

From Alan.Bateman at oracle.com Thu Sep 19 17:30:24 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 20 Sep 2013 01:30:24 +0100
Subject: Classload of sun.nio.ch.Net fails - regression in jdk7u25
In-Reply-To:
References:
Message-ID: <523B9720.8070007@oracle.com>

Martin,

I'm curious how you ran into this. I agree this should be fixed, it's just not immediately obvious which APIs would cause Net to be loaded before other classes that would cause the native library to be loaded.
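(Editorial aside: the point about where the static block sits can be illustrated without any native code. Java runs static field initializers and static blocks in textual order, so a load call placed at the bottom of a class runs after any initializer above it. The sketch below is only an illustration under that assumption; the class names, the `loaded` flags and the `probe` helper are hypothetical stand-ins and are not taken from sun.nio.ch.Net or Util.load().)

```java
// Minimal sketch: static initializers run top to bottom within a class.
// All names here are hypothetical; "loaded" simulates "native library loaded".
final class StaticInitOrderDemo {

    // Simulates a native call that fails if the library is not loaded yet.
    static String probe(boolean libraryLoaded) {
        return libraryLoaded ? "ok" : "UnsatisfiedLinkError (simulated)";
    }

    // Load step at the bottom: the field initializer above it runs first.
    static final class LoadAtBottom {
        private static boolean loaded;                // false by default
        static final String RESULT = probe(loaded);   // evaluated before the block below
        static { loaded = true; }                     // too late for RESULT
    }

    // Load step at the top: it runs before anything that depends on it.
    static final class LoadAtTop {
        private static boolean loaded;
        static { loaded = true; }                     // runs first
        static final String RESULT = probe(loaded);   // sees loaded == true
    }

    static String resultWithLoadAtBottom() { return LoadAtBottom.RESULT; }

    static String resultWithLoadAtTop() { return LoadAtTop.RESULT; }

    public static void main(String[] args) {
        System.out.println("load at bottom: " + resultWithLoadAtBottom());
        System.out.println("load at top:    " + resultWithLoadAtTop());
    }
}
```

Running `main` should show the simulated failure only for the class whose load step comes last, which is the ordering problem described in the report above.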
I'm not aware of any bug reports so I'm curious if you have the original stack trace (not from the Class.forName). -Alan From martinrb at google.com Thu Sep 19 18:21:43 2013 From: martinrb at google.com (Martin Buchholz) Date: Thu, 19 Sep 2013 18:21:43 -0700 Subject: Classload of sun.nio.ch.Net fails - regression in jdk7u25 In-Reply-To: <523B9720.8070007@oracle.com> References: <523B9720.8070007@oracle.com> Message-ID: On Thu, Sep 19, 2013 at 5:30 PM, Alan Bateman wrote: > Martin, > > I'm curious how you ran into this. I'm not aware of any way to trigger this using just public APIs. We sometimes fiddle with the internals of the JDK networking implementation. So it's possible that no one will run into this in a "strictly conforming" java program. > I agree this should be fixed, it's just not immediately obviously which > APIs would cause Net to be loaded before other classes that would cause the > native library to be loaded. I'm not aware of any bug reports so I'm > curious if you have the original stack trace (not from the Class.forName). > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130919/43058be9/attachment.html From philippe.marschall at gmail.com Fri Sep 20 03:32:30 2013 From: philippe.marschall at gmail.com (Philippe Marschall) Date: Fri, 20 Sep 2013 12:32:30 +0200 Subject: MacOS file system changes between 7u10 and 7u40? Message-ID: Hi Have there been any changes to the way the default file system handles Unicode normalization between 7u10 and 7u40? I'm suddenly seeing code behave differently and I don't remember reading anything in the changelogs. I'm seeing two differences: First Path#toRealPath() return values seem to be NFC instead if NFD. This is a bit confusing because AFAIK MacOS stores in NFD. 
This code works in 7u10 but fails in 7u40

FileSystem fileSystem = FileSystems.getDefault();
String aUmlaut = "\u00C4";
Path aPath = fileSystem.getPath(aUmlaut);
String normalized = Normalizer.normalize(aUmlaut, Form.NFD);
Path nPath = fileSystem.getPath(normalized);

Path createdFile = null;
try {
    createdFile = Files.createFile(aPath);
    assertEquals(1, createdFile.getFileName().toString().length());
    assertEquals(1, createdFile.toAbsolutePath().getFileName().toString().length());
    assertEquals(2, createdFile.toRealPath().getFileName().toString().length()); // failure is here
} finally {
    if (createdFile != null) {
        Files.delete(createdFile);
    }
}

Second Path#equals now seems to normalize paths. This code works in 7u10 but fails in 7u40

FileSystem fileSystem = FileSystems.getDefault();
String aUmlaut = "\u00C4";
String normalized = Normalizer.normalize(aUmlaut, Form.NFD);
assertEquals(1, aUmlaut.length());
assertEquals(2, normalized.length());
Path aPath = fileSystem.getPath("/" + aUmlaut);
Path nPath = fileSystem.getPath("/" + normalized);
assertEquals(1, aPath.getName(0).toString().length());
assertThat(aPath, not(equalTo(nPath)));

Cheers
Philippe

From sean.coffey at oracle.com Fri Sep 20 06:39:01 2013
From: sean.coffey at oracle.com (Seán Coffey)
Date: Fri, 20 Sep 2013 14:39:01 +0100
Subject: RFR : 8012326 Deadlock occurs when Charset.availableCharsets() is called by several threads at the same time
Message-ID: <523C4FF5.8020908@oracle.com>

I'd like to port this from jdk8 to jdk7u-dev.

https://bugs.openjdk.java.net/browse/JDK-8012326

The backport is similar to JDK 8 fix with the exception of ISO2022_JP_2 & MSISO2022JP classes. The static variables in those classes (DEC02{12|08}/ENC02{12|08}) came in via the JDK-6653797 fix which is only in jdk8. It doesn't appear to be applicable to JDK 7u as a result.
(we're not loading JIS_X_0208() class in ISO2022_JP initialization for jdk7u from what I see)

In summary, the jdk8 webrev looks applicable to jdk7u with exception of ISO2022_JP_2.java / MSISO2022JP.java changes.

webrev : http://cr.openjdk.java.net/~coffeys/webrev.8012326.jdk7u/webrev/

regards,
Sean.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130920/bfd725bb/attachment.html

From chris.w.dennis at gmail.com Fri Sep 20 07:14:37 2013
From: chris.w.dennis at gmail.com (Chris Dennis)
Date: Fri, 20 Sep 2013 10:14:37 -0400
Subject: Bug in interrupt handling in FileChannelImpl.map(…)
In-Reply-To:
Message-ID:

Alan,

Attached is my proposed fix for this issue. I've sneakily snuck in a second minor fix, it seems to me we should be checking for !isOpen() after the truncate call (figured we could stretch the title of the bug to cover this too?). The test I've checked in was the best I could do at short notice, it doesn't fail very often, but if you ramp up the cycle count or run it repeatedly I can get it to fail eventually (can take thousands of cycles). It's not going to catch any regression immediately but with enough aggregated runs it should eventually.

Thanks,

Chris

On 9/18/13 10:04 AM, "Chris Dennis" wrote:

>The Chris Dennis on the OCA list is me, I've just switched my openjdk mail
>subscriptions to my gmail address due to work email quotas. I'll prepare a
>patch against jdk8/tl, and look in to the possibility of creating a test
>that attempts to reproduce this (although as you say it may be
>impractical).
>
>Thanks,
>
>Chris
>
>On 9/18/13 9:59 AM, "Alan Bateman" wrote:
>
>>On 18/09/2013 14:53, Chris Dennis wrote:
>>> I'm interested in pushing this issue to a conclusion, and I'm happy to
>>> contribute a fix. What forest should I be generating a patch against?
>>>
>>The "How to contribute" is the starting point.
I see there is a Chris
>>Dennis on the OCA list but I don't know if that is you.
>>
>>The forest that we push core area changes to is jdk8/tl. We try to
>>include a test with all changes although here I suspect it may be too
>>troublesome to write a test that duplicates this without having side
>>effects.
>>
>>-Alan
>>
>>[1] http://openjdk.java.net/contribute/
>>
>>
>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8024833.patch
Type: application/octet-stream
Size: 7542 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130920/95e5e2ac/8024833.patch

From Alan.Bateman at oracle.com Fri Sep 20 07:51:16 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 20 Sep 2013 15:51:16 +0100
Subject: MacOS file system changes between 7u10 and 7u40?
In-Reply-To:
References:
Message-ID: <523C60E4.4000404@oracle.com>

On 20/09/2013 11:32, Philippe Marschall wrote:
> Hi
>
> Have there been any changes to the way the default file system handles
> Unicode normalization between 7u10 and 7u40? I'm suddenly seeing code
> behave differently and I don't remember reading anything in the
> changelogs.

I assume you are looking for this one:

7130915: File.equals does not give expected results when path contains Non-English characters on Mac OS X

It was back-ported to 7uX last year [1]. There were also several threads about this topic on macosx-port-dev at the time, much of it about the initial port of JDK 7 to Mac OS X not matching Apple's JDK 6.

So I'm curious if you are just observing different behavior or whether this has broken something. Were you using java.text.Normalizer to work around issues?
-Alan

[1] http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/rev/03b9d0ba2488

From martinrb at google.com Fri Sep 20 10:22:39 2013
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 20 Sep 2013 10:22:39 -0700
Subject: Classload of sun.nio.ch.Net fails - regression in jdk7u25
In-Reply-To:
References: <523B9720.8070007@oracle.com>
Message-ID:

So, I did the experiment I suggested:

jar tf $jdk/jre/lib/rt.jar | sed -n 's/\.class$//p' | sed 's/\//./g' | while read class; do j java LoadClass $class; done

and got 148 UnsatisfiedLinkErrors (a.k.a. "a rich crop of bugs")

Here are the ones Alan himself may want to fix:

Exception in thread "main" java.lang.UnsatisfiedLinkError: java.net.SocketInputStream.init()V
    at java.net.SocketInputStream.init(Native Method)
    at java.net.SocketInputStream.<clinit>(SocketInputStream.java:47)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: java.net.SocketOutputStream.init()V
    at java.net.SocketOutputStream.init(Native Method)
    at java.net.SocketOutputStream.<clinit>(SocketOutputStream.java:46)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.EPoll.eventSize()I
    at sun.nio.ch.EPoll.eventSize(Native Method)
    at sun.nio.ch.EPoll.<clinit>(EPoll.java:53)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.EPollArrayWrapper.sizeofEPollEvent()I
    at sun.nio.ch.EPollArrayWrapper.sizeofEPollEvent(Native Method)
    at sun.nio.ch.EPollArrayWrapper.<clinit>(EPollArrayWrapper.java:67)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError:
sun.nio.ch.FileKey.initIDs()V
    at sun.nio.ch.FileKey.initIDs(Native Method)
    at sun.nio.ch.FileKey.<clinit>(FileKey.java:73)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.KQueue.keventSize()I
    at sun.nio.ch.KQueue.keventSize(Native Method)
    at sun.nio.ch.KQueue.<clinit>(KQueue.java:50)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.Net.isExclusiveBindAvailable()I
    at sun.nio.ch.Net.isExclusiveBindAvailable(Native Method)
    at sun.nio.ch.Net.<clinit>(Net.java:58)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.SctpNet.init()V
    at sun.nio.ch.SctpNet.init(Native Method)
    at sun.nio.ch.SctpNet.<clinit>(SctpNet.java:368)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)
Exception in thread "main" java.lang.UnsatisfiedLinkError: sun.nio.ch.Net.isExclusiveBindAvailable()I
    at sun.nio.ch.Net.isExclusiveBindAvailable(Native Method)
    at sun.nio.ch.Net.<clinit>(Net.java:58)
    at sun.nio.ch.SocketOptionRegistry$LazyInitialization.options(SocketOptionRegistry.java:61)
    at sun.nio.ch.SocketOptionRegistry$LazyInitialization.<clinit>(SocketOptionRegistry.java:57)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at LoadClass.main(LoadClass.java:4)

On Thu, Sep 19, 2013 at 6:21 PM, Martin Buchholz wrote:

>
>
>
> On Thu, Sep 19, 2013 at 5:30 PM, Alan Bateman wrote:
>
>> Martin,
>>
>> I'm curious how you ran into this.
>
>
> I'm not aware of any way to trigger this using just public APIs.
> We sometimes fiddle with the internals of the JDK networking
> implementation.
> So it's possible that no one will run into this in a "strictly conforming" > java program. > > >> I agree this should be fixed, it's just not immediately obviously which >> APIs would cause Net to be loaded before other classes that would cause the >> native library to be loaded. I'm not aware of any bug reports so I'm >> curious if you have the original stack trace (not from the Class.forName). >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130920/dfca6a85/attachment.html From philippe.marschall at gmail.com Sat Sep 21 00:41:44 2013 From: philippe.marschall at gmail.com (Philippe Marschall) Date: Sat, 21 Sep 2013 09:41:44 +0200 Subject: MacOS file system changes between 7u10 and 7u40? In-Reply-To: <523C60E4.4000404@oracle.com> References: <523C60E4.4000404@oracle.com> Message-ID: On Fri, Sep 20, 2013 at 4:51 PM, Alan Bateman wrote: > On 20/09/2013 11:32, Philippe Marschall wrote: >> >> Hi >> >> Have there been any changes to the way the default file system handles >> Unicode normalization between 7u10 and 7u40? I'm suddenly seeing code >> behave differently and I don't remember reading anything in the >> changelogs. > > I assume you are looking for this one: > > 7130915: File.equals does not give expected results when path contains > Non-English characters on Mac OS X > > It was back-ported to 7uX last year [1]. There were also several threads > about this topic on macosx-port-dev at the time, much of it about the > initial port of JDK 7 to Mac OS X not matching Apple's JDK 6. > > So I'm curious if you are just obviously different behavior or whether this > has broken something. Were you using java.text.Normalizer to work around > issues? I have this in-memory file system [1] that tries to simulate MacOS file system for testing purposes. 
As part of the test suite I run the same code against the default file system and the in-memory file system to ensure the behavior matches. These tests fail now. So it's not a big deal, I'll have to change the in-memory implementation and tests to match the new MacOS behavior. I hope to talk about it at JavaOne (CON4114) but at the moment it's just an alternate session. [1] https://github.com/marschall/memoryfilesystem Cheers Philippe From Alan.Bateman at oracle.com Sat Sep 21 05:05:44 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sat, 21 Sep 2013 05:05:44 -0700 Subject: MacOS file system changes between 7u10 and 7u40? In-Reply-To: References: <523C60E4.4000404@oracle.com> Message-ID: <523D8B98.8010901@oracle.com> On 21/09/2013 00:41, Philippe Marschall wrote: > : > I have this in-memory file system [1] that tries to simulate MacOS > file system for testing purposes. As part of the test suite I run the > same code against the default file system and the in-memory file > system to ensure the behavior matches. These tests fail now. So it's > not a big deal, I'll have to change the in-memory implementation and > tests to match the new MacOS behavior. Thanks for the explanation (I was surprised by the code fragments that you included in the mail as it looks like it was second guessing how the underlying file system was implemented). > > I hope to talk about it at JavaOne (CON4114) but at the moment it's > just an alternate session. > > [1] https://github.com/marschall/memoryfilesystem > This looks interesting. One idea is to come along to I/O BOF on Tuesday evening (BOF7945) then maybe you could talk about it for a few minutes. -Alan From mik3hall at gmail.com Sat Sep 21 06:23:42 2013 From: mik3hall at gmail.com (Michael Hall) Date: Sat, 21 Sep 2013 08:23:42 -0500 Subject: MacOS file system changes between 7u10 and 7u40? 
In-Reply-To: References: <523C60E4.4000404@oracle.com> Message-ID: <14DE2248-2E62-4FE3-9342-6399765B4963@gmail.com> On Sep 21, 2013, at 2:41 AM, Philippe Marschall wrote: > I'll have to change the in-memory implementation and > tests to match the new MacOS behavior. Meaning match the new openjdk default filesystem path encoding, matching the Java 6 path encoding. There is no actual MacOS itself behavior change. Correct? Michael Hall trz nio.2 for OS X http://www195.pair.com/mik3hall/index.html#trz HalfPipe Java 6/7 shell app http://www195.pair.com/mik3hall/index.html#halfpipe AppConverter convert Apple jvm to openjdk apps http://www195.pair.com/mik3hall/index.html#appconverter -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130921/ccae729a/attachment.html From yiming.wang at oracle.com Sun Sep 22 03:21:26 2013 From: yiming.wang at oracle.com (Eric Wang) Date: Sun, 22 Sep 2013 18:21:26 +0800 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <52396979.10806@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <5230463B.8060402@oracle.com> <523960EC.4080005@oracle.com> <52396979.10806@oracle.com> Message-ID: <523EC4A6.7000202@oracle.com> On 2013/9/18 16:51, Alan Bateman wrote: > On 18/09/2013 09:14, Eric Wang wrote: >> Hi Chris, >> >> Yes, It looks a bit odd, i tried to cut down thread number of >> UdpEchoRequest to make response sent before socket timeout. >> Essentially, it is a timing issue. so the new fix below is to update >> change to SO_TIMEOUT value from 5 seconds to 15. I have tested on >> jsn-vm49.us for 20000 times and all passed. 
>> http://cr.openjdk.java.net/~ewang/8015762/webrev.02/test/java/nio/channels/DatagramChannel/AdaptDatagramSocket.java.sdiff.html >> >> >> >> > I'm okay with this too. > > -Alan. > Thanks Alan, Are you also OK to be my sponsor? Regards, Eric From Alan.Bateman at oracle.com Sun Sep 22 04:46:24 2013 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sun, 22 Sep 2013 04:46:24 -0700 Subject: Review request for bug 8015762: java/nio/channels/DatagramChannel/AdaptDatagramSocket.java fails intermittently In-Reply-To: <523EC4A6.7000202@oracle.com> References: <520A188D.2070808@oracle.com> <520A9D30.8020107@oracle.com> <523030C0.8020102@oracle.com> <523036A6.3040009@oracle.com> <52303C29.5040307@oracle.com> <5230463B.8060402@oracle.com> <523960EC.4080005@oracle.com> <52396979.10806@oracle.com> <523EC4A6.7000202@oracle.com> Message-ID: <523ED890.4020501@oracle.com> On 22/09/2013 03:21, Eric Wang wrote: > : > > Thanks Alan, Are you also OK to be my sponsor? Chris (thanks) already pushed it for you a few days ago, see: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/b3a506a30fda From philippe.marschall at gmail.com Sun Sep 22 14:17:19 2013 From: philippe.marschall at gmail.com (Philippe Marschall) Date: Sun, 22 Sep 2013 14:17:19 -0700 Subject: MacOS file system changes between 7u10 and 7u40? In-Reply-To: <14DE2248-2E62-4FE3-9342-6399765B4963@gmail.com> References: <523C60E4.4000404@oracle.com> <14DE2248-2E62-4FE3-9342-6399765B4963@gmail.com> Message-ID: On Sat, Sep 21, 2013 at 6:23 AM, Michael Hall wrote: > On Sep 21, 2013, at 2:41 AM, Philippe Marschall wrote: > > I'll have to change the in-memory implementation and > tests to match the new MacOS behavior. > > > Meaning match the new openjdk default filesystem path encoding, matching the > Java 6 path encoding. I haven't checked what the "old" java.io.File API on Java 6 does. > There is no actual MacOS itself behavior change. > Correct? Yes MacOS behaves still the same (NFD). 
However the Java default file system implementation of the new file system API on MacOS now returns NFC instead of NFD. Cheers Philippe From mik3hall at gmail.com Sun Sep 22 16:31:40 2013 From: mik3hall at gmail.com (Michael Hall) Date: Sun, 22 Sep 2013 18:31:40 -0500 Subject: MacOS file system changes between 7u10 and 7u40? In-Reply-To: References: <523C60E4.4000404@oracle.com> <14DE2248-2E62-4FE3-9342-6399765B4963@gmail.com> Message-ID: <72DAEBB7-7F61-4B46-A81D-023652C6D324@gmail.com> On Sep 22, 2013, at 4:17 PM, Philippe Marschall wrote: > On Sat, Sep 21, 2013 at 6:23 AM, Michael Hall wrote: >> On Sep 21, 2013, at 2:41 AM, Philippe Marschall wrote: >> >> I'll have to change the in-memory implementation and >> tests to match the new MacOS behavior. >> >> >> Meaning match the new openjdk default filesystem path encoding, matching the >> Java 6 path encoding. > > I haven't checked what the "old" java.io.File API on Java 6 does. I thought Alan said something about matching Java 6 and 'old' java.io.File was and is all that is provided there. At least you don't have to emulate MacRoman it sounds like. > >> There is no actual MacOS itself behavior change. >> Correct? > > Yes MacOS behaves still the same (NFD). However the Java default file > system implementation of the new file system API on MacOS now returns > NFC instead of NFD. Not sure what the distinction is there, I should check. Emulating a platform filesystem is an interesting idea. Thanks. Michael Hall trz nio.2 for OS X http://www195.pair.com/mik3hall/index.html#trz HalfPipe Java 6/7 shell app http://www195.pair.com/mik3hall/index.html#halfpipe AppConverter convert Apple jvm to openjdk apps http://www195.pair.com/mik3hall/index.html#appconverter -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20130922/5bebe6b7/attachment.html

From philippe.marschall at gmail.com Sun Sep 22 17:28:22 2013
From: philippe.marschall at gmail.com (Philippe Marschall)
Date: Sun, 22 Sep 2013 17:28:22 -0700
Subject: MacOS file system changes between 7u10 and 7u40?
In-Reply-To: <72DAEBB7-7F61-4B46-A81D-023652C6D324@gmail.com>
References: <523C60E4.4000404@oracle.com> <14DE2248-2E62-4FE3-9342-6399765B4963@gmail.com> <72DAEBB7-7F61-4B46-A81D-023652C6D324@gmail.com>
Message-ID:

On Sun, Sep 22, 2013 at 4:31 PM, Michael Hall wrote:
> On Sep 22, 2013, at 4:17 PM, Philippe Marschall wrote:
>
> On Sat, Sep 21, 2013 at 6:23 AM, Michael Hall wrote:
>
> On Sep 21, 2013, at 2:41 AM, Philippe Marschall wrote:
>
>
> I'll have to change the in-memory implementation and
>
> tests to match the new MacOS behavior.
>
>
>
> Meaning match the new openjdk default filesystem path encoding, matching the
>
> Java 6 path encoding.
>
>
> I haven't checked what the "old" java.io.File API on Java 6 does.
>
>
> I thought Alan said something about matching Java 6 and 'old' java.io.File
> was and is all that is provided there.
> At least you don't have to emulate MacRoman it sounds like.
>
>
> There is no actual MacOS itself behavior change.
>
> Correct?
>
>
> Yes MacOS behaves still the same (NFD). However the Java default file
> system implementation of the new file system API on MacOS now returns
> NFC instead of NFD.
>
>
> Not sure what the distinction is there, I should check.

This is mainly about Latin-1 backwards compatibility in Unicode. Some (most) Latin-1 characters that are not in ASCII can be represented in two ways in Unicode. NFC as one code point: e.g. ä. NFD as two code points, one of them being a combining diacritical mark: e.g. a followed by U+0308. MacOS AFAIK stores file names in NFD.

> Emulating a platform filesystem is an interesting idea.

It's targeted at testing file code without actually going to disk.
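(Editorial aside: to make the NFC/NFD distinction from this thread concrete, here is a minimal sketch using java.text.Normalizer, the same class the test fragments earlier in the thread rely on. NFC is the composed form, a single code point for Ä; NFD is the decomposed form, a base letter plus a combining mark. The class and method names below are hypothetical and chosen only for illustration.)

```java
import java.text.Normalizer;

// Minimal sketch of the two normalization forms discussed in the thread.
// NormalizationForms, nfc and nfd are hypothetical names for illustration.
final class NormalizationForms {

    // Composed form: base letter and diacritic merged into one code point where possible.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Decomposed form: base letter followed by combining marks.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String composed = "\u00C4";          // LATIN CAPITAL LETTER A WITH DIAERESIS
        String decomposed = nfd(composed);   // 'A' followed by U+0308 COMBINING DIAERESIS

        System.out.println(composed.length());                    // 1
        System.out.println(decomposed.length());                  // 2
        System.out.println(composed.equals(decomposed));          // false
        System.out.println(composed.equals(nfc(decomposed)));     // true
    }
}
```

The length difference (1 versus 2) is what the assertions in the original report check, and the plain `equals` mismatch is why comparing names across forms needs normalization on one side.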
Much like testing database code with an in-memory database.

Cheers
Philippe

From Alan.Bateman at oracle.com Sun Sep 22 20:29:06 2013
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Sun, 22 Sep 2013 20:29:06 -0700
Subject: Bug in interrupt handling in FileChannelImpl.map(…)
In-Reply-To:
References:
Message-ID: <523FB582.9080101@oracle.com>

On 20/09/2013 07:14, Chris Dennis wrote:
> Alan,
>
> Attached is my proposed fix for this issue. I've sneakily snuck in a
> second minor fix, it seems to me we should be checking for !isOpen() after
> the truncate call (figured we could stretch the title of the bug to cover
> this too?). The test I've checked in was the best I could do at short
> notice, it doesn't fail very often, but if you ramp up the cycle count or
> run it repeatedly I can get it to fail eventually (can take thousands of
> cycles). It's not going to catch any regression immediately but with
> enough aggregated runs it should eventually.
>
> Thanks,
>
> Chris
>
Thanks Chris. The update to FileChannel looks okay and I will sponsor the change. I need to study the test, just to be satisfied on its reliability. It's okay that it won't detect the issue reliably, it's really just to check that we won't have any false positives or else issues on Windows where jtreg can't clean up after the test due to the mapped regions. I'll get back to you soon on this (JavaOne keeping many of us busy this week).

-Alan.

From chris.w.dennis at gmail.com Mon Sep 23 06:33:09 2013
From: chris.w.dennis at gmail.com (Chris Dennis)
Date: Mon, 23 Sep 2013 09:33:09 -0400
Subject: Bug in interrupt handling in FileChannelImpl.map(…)
In-Reply-To: <523FB582.9080101@oracle.com>
Message-ID:

Sure, no problem. There are a couple of oddities to note in the test…

1. It doesn't actually end up mapping anything. I'm intentionally trying to extend the file from zero length to length one, when the file is read-only.
This way I pass through the nested size call, but don't actually end up mapping anything. Seemed to me like this was the simplest way to test exactly this bug and nothing else, and also avoid all the windows specific mapping behaviors causing any problems. 2. I'm not closing the file in a finally block because when the code is broken the locking will block the close call, hence I'm only closing on a successful run. Thanks, Chris On 9/22/13 11:29 PM, "Alan Bateman" wrote: >On 20/09/2013 07:14, Chris Dennis wrote: >> Alan, >> >> Attached is my proposed fix for this issue. I've sneakily snuck in a >> second minor fix, it seems to me we should be checking for !isOpen() >>after >> the truncate call (figured we could stretch the title of the bug to >>cover >> this too?). The test I've checked in was the best I could do at short >> notice, it doesn?t fail very often, but if you ramp up the cycle count >>or >> run it repeatedly I can get it to fail eventually (can take thousands of >> cycles). It's not going to catch any regression immediately but with >> enough aggregated runs it should eventually. >> >> Thanks, >> >> Chris >> >Thanks Chris. The update to FileChannel looks okay and I will sponsor >the change. I need to study the test, just to be satisfied on its >reliability. It's okay that it won't detect the issue reliability, it's >really just to check that we won't have any false positives or else >issues on Windows where jtreg can't clean up after the test due to the >mapped regions. I'll get back to you soon on this (JavaOne keeping many >of us busy this week). > >-Alan. From xueming.shen at oracle.com Thu Sep 26 21:44:39 2013 From: xueming.shen at oracle.com (Xueming Shen) Date: Thu, 26 Sep 2013 21:44:39 -0700 Subject: MacOS file system changes between 7u10 and 7u40? 
In-Reply-To: References: Message-ID: <52450D37.2020904@oracle.com> What we did in 7130915 is to normalize the native macos file name (in NFD) back into NFC for the File and Path, but passing the File/Path Java path name (in NFC) into macos's file system API directly (as it appears those APIs just work fine with NFC file name, though it stores them in NFD internally). This serves the purpose of having Java file name (File and Path, and their String representation) in NFC form consistently, cross all platforms. As the consequence, we can have a reasonable implement of file name equals() without involving string normalization, which is expensive. In "normal" use scenario, you should not have a file/path name (in String) passing around in NFD form. -Sherman On 9/20/13 3:32 AM, Philippe Marschall wrote: > Hi > > Have there been any changes to the way the default file system handles > Unicode normalization between 7u10 and 7u40? I'm suddenly seeing code > behave differently and I don't remember reading anything in the > changelogs. I'm seeing two differences: > > First Path#toRealPath() return values seem to be NFC instead if NFD. > This is a bit confusing because AFAIK MacOS stores in NFD. This code > works in 7u10 but fails in 7u40 > > FileSystem fileSystem = FileSystems.getDefault(); > String aUmlaut = "\u00C4"; > Path aPath = fileSystem.getPath(aUmlaut); > String normalized = Normalizer.normalize(aUmlaut, Form.NFD); > Path nPath = fileSystem.getPath(normalized); > > Path createdFile = null; > try { > createdFile = Files.createFile(aPath); > assertEquals(1, createdFile.getFileName().toString().length()); > assertEquals(1, > createdFile.toAbsolutePath().getFileName().toString().length()); > assertEquals(2, > createdFile.toRealPath().getFileName().toString().length()); // > failure is here > } finally { > if (createdFile != null) { > Files.delete(createdFile); > } > } > > Second Path#equals now seems to normalize paths. 
This code works in > 7u10 but fails in 7u40 > > FileSystem fileSystem = FileSystems.getDefault(); > String aUmlaut = "\u00C4"; > String normalized = Normalizer.normalize(aUmlaut, Form.NFD); > assertEquals(1, aUmlaut.length()); > assertEquals(2, normalized.length()); > Path aPath = fileSystem.getPath("/" + aUmlaut); > Path nPath = fileSystem.getPath("/" + normalized); > assertEquals(1, aPath.getName(0).toString().length()); > assertThat(aPath, not(equalTo(nPath))); > > Cheers > Philippe From johannes.rudolph at googlemail.com Thu Sep 12 13:45:38 2013 From: johannes.rudolph at googlemail.com (Johannes Rudolph) Date: Thu, 12 Sep 2013 13:45:38 -0000 Subject: OP_CONNECT, connect, and finishConnect fail Message-ID: Hi there, I don't know what the right channels are to report bugs for the (Open)JDK. The bug database seems to be completely unmaintained and the last time I tried to file a bug I never got any answer and the report never showed up. This is basically a follow up to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6371630 The problem is that under Linux for outgoing connections bad things happen if you select for OP_CONNECT before even attempting the connect. 
The sequence of calls is this:

* SocketChannel.open
* ch.configureBlocking(false)
* ch.register(selector, OP_CONNECT)
* selector.select() instantly returns and reports the channel as connected (it doesn't matter if you do the actual select call after the connection attempt)
* ch.connect(unrespondingHostAddress) which returns false
* ch.finishConnect() which now always returns true, the OS-level socket itself, however, never received any response from the peer for its SYN packet
* ch.isConnected() returns true
* if the host would eventually establish the connection the socket would be usable (as OP_READ and OP_WRITE still work as intended), so the main problem is that a connection is reported as established when, in fact, it may never make any progress with connection establishment

Of course, you could argue that registering for OP_CONNECT before calling connect is a user error, but it is neither forbidden by the documentation nor in any way prevented at runtime. All of the later behavior of `finishConnect` makes no sense at all. Also the actual call-sequence can usually be much more complicated in a common multi-threaded setting so the actual calls registering the channel to the selector and the connection attempt may be executed concurrently for some reason, making this bug even harder to find.

Here's a standalone example exhibiting the behavior: https://gist.github.com/jrudolph/6535400

We discovered the problem here: https://www.assembla.com/spaces/akka/tickets/3602-io--tcp-connection-establishment-always-succeeds-even-if-endpoint-never-answers

Cheers,
Johannes

-------------- next part --------------
An HTML attachment was scrubbed...
URL: