From cowwoc at bbs.darktech.org Thu Jan 3 03:25:02 2008 From: cowwoc at bbs.darktech.org (cowwoc) Date: Wed, 2 Jan 2008 19:25:02 -0800 (PST) Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes Message-ID: <14591177.post@talk.nabble.com> Hi, Bug URL: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4128333 I'd like to start a discussion on how we can possibly solve this bug in a backwards-compatible way. Here is my personal proposal but I'd love to hear your own ideas! 1) Add a new method to DataInputStream/DataOutputStream for encoding/decoding longer Strings (ideally this new encoding should have no fixed limit). I think this should be done independently of Serialization as this is needed by other clients. 2) Add a method to ObjectOutputStream to enable the new encoding format (which is not backwards compatible). The default would be to use the old encoding format but developers of new applications would be encouraged to use the new format. I recommend ObjectOutputStream.setMinimumVersion(enum). The default would be ObjectOutputStream.setMinimumVersion(JDK1_1) which indicates the format is backwards-compatible to JDK 1.1 but we would add ObjectOutputStream.setMinimumVersion(JDK1_7) for the new file format. Please let me know what you think! Gili -- View this message in context: http://www.nabble.com/Fixing-bug--4128333%3A-Serializing-strings-restricted-to-64k-bytes-tp14591177p14591177.html Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com. 
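A minimal sketch of the limit discussed above and of one possible length-prefixed encoding. The helper names writeLongString and readLongString are hypothetical, not JDK methods, and the format (a 4-byte length followed by standard UTF-8, not the modified UTF-8 that writeUTF uses) is only an assumption for illustration. It still caps the encoded length at Integer.MAX_VALUE bytes, so it does not meet the "no fixed limit" goal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LongStringDemo {
    // Hypothetical helper (not a JDK method): 4-byte length prefix, then standard UTF-8 bytes.
    static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Hypothetical counterpart that reads back what writeLongString wrote.
    static String readLongString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Returns true if writeUTF refuses the string because its encoded form exceeds 65535 bytes.
    static boolean writeUtfRejects(String s) {
        try {
            new DataOutputStream(new ByteArrayOutputStream()).writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the string with the length-prefixed format and reads it back from memory.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeLongString(out, s);
            out.flush();
            return readLongString(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a string of the given length consisting of one repeated character.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String big = repeat('x', 70000);
        System.out.println("writeUTF rejects 70000 chars: " + writeUtfRejects(big));
        System.out.println("round trip intact: " + roundTrip(big).equals(big));
    }
}
```

Because the wire format differs from writeUTF, existing readers calling readUTF would not understand it, which is why a separate method pair (rather than a change to writeUTF) is sketched here.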
From cowwoc at bbs.darktech.org Thu Jan 3 03:32:29 2008 From: cowwoc at bbs.darktech.org (cowwoc) Date: Wed, 2 Jan 2008 19:32:29 -0800 (PST) Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes In-Reply-To: <14591177.post@talk.nabble.com> References: <14591177.post@talk.nabble.com> Message-ID: <14591181.post@talk.nabble.com> I see now that ObjectOutputStream.html#useProtocolVersion() already exists: http://java.sun.com/javase/6/docs/api/java/io/ObjectOutputStream.html#useProtocolVersion(int) so the only thing we'd need to do in step 2 is add ObjectStreamConstants.PROTOCOL_VERSION_3. -- View this message in context: http://www.nabble.com/Fixing-bug--4128333%3A-Serializing-strings-restricted-to-64k-bytes-tp14591177p14591181.html Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com. From peter.jones at sun.com Fri Jan 4 16:04:25 2008 From: peter.jones at sun.com (Peter Jones) Date: Fri, 4 Jan 2008 11:04:25 -0500 Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes In-Reply-To: <14591177.post@talk.nabble.com> References: <14591177.post@talk.nabble.com> Message-ID: <20080104160425.GA10619@east> If your concern is primarily about Java object serialization, note that it has supported serializing strings with UTF-8 encoding larger than 64KB since J2SE 1.3: http://bugs.sun.com/view_bug.do?bug_id=4217676 I presume that 4128333 remains open for Data{Input,Output}Stream only. -- Peter On Wed, Jan 02, 2008 at 07:25:02PM -0800, cowwoc wrote: > > Hi, > > Bug URL: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4128333 > > I'd like to start a discussion on how we can possibly solve this bug in a > backwards-compatible way. Here is my personal proposal but I'd love to hear > your own ideas! > > 1) Add a new method to DataInputStream/DataOutputStream for > encoding/decoding longer Strings (ideally this new encoding should have no > fixed limit). 
I think this should be done independently of Serialization as > this is needed by other clients. > > 2) Add a method to ObjectOutputStream to enable the new encoding format > (which is not backwards compatible). The default would be to use the old > encoding format but developers of new applications would be encouraged to > use the new format. I recommend ObjectOutputStream.setMinimumVersion(enum). > The default would be ObjectOutputStream.setMinimumVersion(JDK1_1) which > indicates the format is backwards-compatible to JDK 1.1 but we would add > ObjectOutputStream.setMinimumVersion(JDK1_7) for the new file format. > > Please let me know what you think! > Gili From jackieict at gmail.com Sat Jan 5 12:01:35 2008 From: jackieict at gmail.com (zhang Jackie) Date: Sat, 5 Jan 2008 20:01:35 +0800 Subject: RMI benchmark Message-ID: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com> Hi, everyone! Recently ,I want to have a performance comparision on RMI and my own version with little changes. Can you give me some microbenchmarks and some other suites used for estimate the performance of RMI? I googled the keyword "RMI benchmark", but cant get one. -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.warburton at gmail.com Sat Jan 5 12:16:24 2008 From: richard.warburton at gmail.com (Richard Warburton) Date: Sat, 5 Jan 2008 12:16:24 +0000 Subject: RMI benchmark In-Reply-To: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com> References: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com> Message-ID: <749b5dd60801050416r18a25611lfc4bf10c372cbd7a@mail.gmail.com> On Jan 5, 2008 12:01 PM, zhang Jackie wrote: > Hi, everyone! > Recently ,I want to have a performance comparision on RMI and my own > version with little changes. Can you give me some microbenchmarks and some > other suites used for estimate the performance of RMI? I googled the keyword > "RMI benchmark", but cant get one. 
The KaRMI system was originally meant to be a faster drop-in replacement for RMI, and used to have some benchmarks associated with it. Unfortunately I can no longer find them. It is available at:
http://svn.ipd.uni-karlsruhe.de/trac/javaparty/wiki/KaRMI

Richard Warburton

From linuxhippy at gmail.com Mon Jan 7 00:11:24 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 7 Jan 2008 01:11:24 +0100
Subject: Performance regression in java.util.zip.Deflater
In-Reply-To: <476B0ABA.6030102@sun.com>
References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com> <476ADDAF.2070409@sun.com> <194f62550712201336y3380808bv3726d891873be277@mail.gmail.com> <476AEDCD.6080504@sun.com> <194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com> <476B0ABA.6030102@sun.com>
Message-ID: <194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com>

Hello again,

I implemented two prototypes of the striding to see how they perform and how complex the code would be. Both prototypes implement the striding on the Java side (calling a JNI method for each stride), which I plan to change to minimize overhead and hide the striding (unless Sun would like to have it in Java).

The first prototype uses two direct ByteBuffers through which it copies the data to/from the input/output arrays; this way the whole input/output data is copied only once. The second prototype uses striding (1kb chunks) in the critical section. I also did some measurements to see how long the critical section is held in the worst case.

Buffers / 2k/1k stride size: (input-buffer: 2k, output-buffer 1k)
1.) Compress 50mb with level=0 / 100byte-output-array: 603ms
2.) Compress 50mb with level=1 / 100byte-output-array: 277ms
3.) Compress 50mb with level=9 / 1kb output-array: 784ms

Critical / 1k stride size: (no copying)
1.) Compress 50mb with level=0 / 100byte-output-array: 720ms
2.) Compress 50mb with level=1 / 100byte-output-array: 270ms
3.) Compress 50mb with level=9 / 1kb output-array: 778ms

The first two measurements are worst-case scenarios which measure the overhead of striding when the output buffer is way too small. Here the copying approach is even faster at level=0 (maybe GetPrimitiveArrayCritical has more overhead than GetDirectBufferAddress). Measurement 3.) shows a real-world example with high compression, where the copying overhead should not be high; it does show up, however (only a few percent).

I did many more measurements (I don't remember exactly what I measured, it was some time ago) and my conclusion was that, especially for slightly larger buffers (e.g. 8k/4k), the copying overhead is really low (oprofile also showed ~2-5% in memcpy). Because the non-copying critical-section approach has to use small strides, the two are almost equally fast; in real-world use cases the non-copying approach was a few ms faster.

However, there is one thing about the copying solution I don't like: it's quite complex, whereas the critical-section approach is quite clean.

I benchmarked how long the critical section is held with compression level 9 and incompressible data (which I assume is a worst case) and 1kb strides, in µs:
530
351
615
339
2
1
2
292
3470 (worst case over all runs)
256
341

So on my Core2Duo (2GHz) I see worst cases of about 3ms including JNI overhead with 1kb strides. Making the strides smaller won't help, as zlib waits until it has enough data to compress (that's why there are 2µs calls, which I assume are only used to move data inside zlib's compression buffer).

On the hotspot-runtime list I started a thread about "how evil" GetPrimitiveArrayCritical is; they said it only blocks the GC. I don't know whether 3ms is problematic. However, keeping in mind that Deflater is quite slow anyway, the copying overhead is not relevant, I guess.

So to sum it up, I would recommend for Deflater either the non-copying/critical solution or a copying solution, both of which work in strides.
The copying solution would allocate the stride-buffers in deflater_init(), and free it on deflater_end(), doing the looping and copying on the native side. However for inflater, which is a lot faster (and has more predictable pause-times) I would not recommend a copying approach. The remaining question seems to be how long tolerable pauses are, and ideas? I would be interested in some ideas and feedback. What do you think would be a good solution? Thank you in advance, lg Clemens PS: The striding+GetPrimitive... is even used by NIO for copying java-arrays into direct-ByteBuffers: while (length > 0) { size = (length > MBYTE ? MBYTE : length); GETCRITICAL(bytes, env, dst); memcpy(bytes + dstPos, (void *)srcAddr, size); RELEASECRITICAL(bytes, env, dst, 0); ................ From Alan.Bateman at Sun.COM Mon Jan 7 08:35:36 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Mon, 07 Jan 2008 08:35:36 +0000 Subject: Performance regression in java.util.zip.Deflater In-Reply-To: <194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com> References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com> <476ADDAF.2070409@sun.com> <194f62550712201336y3380808bv3726d891873be277@mail.gmail.com> <476AEDCD.6080504@sun.com> <194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com> <476B0ABA.6030102@sun.com> <194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com> Message-ID: <4781E458.1010907@sun.com> Clemens Eisserer wrote: > : > > PS: The striding+GetPrimitive... is even used by NIO for copying > java-arrays into direct-ByteBuffers: > while (length > 0) { > size = (length > MBYTE ? MBYTE : length); > GETCRITICAL(bytes, env, dst); > memcpy(bytes + dstPos, (void *)srcAddr, size); > RELEASECRITICAL(bytes, env, dst, 0); > ................ > Yes, NIO uses JNI critical sections when copying to/from arrays, but as a FYI, we hope to eliminate this native code soon. 
The replacement uses the Unsafe interface to do the copying and will be much faster than the current native implementation. To allow for safepoint polling (in the VM) it also copies very large arrays/buffers in strides. -Alan. From peter.jones at sun.com Mon Jan 7 16:26:00 2008 From: peter.jones at sun.com (Peter Jones) Date: Mon, 7 Jan 2008 11:26:00 -0500 Subject: RMI benchmark In-Reply-To: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com> References: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com> Message-ID: <20080107162559.GA1873@east> On Sat, Jan 05, 2008 at 08:01:35PM +0800, zhang Jackie wrote: > Hi, everyone! > Recently ,I want to have a performance comparision on RMI and my own > version with little changes. Can you give me some microbenchmarks and some > other suites used for estimate the performance of RMI? I googled the keyword > "RMI benchmark", but cant get one. You can find a microbenchmark suite for RMI, as well as object serialization, in the "test" tree of the jdk7/jdk repository-- relative to the jdk7 forest, look here: jdk/test/java/rmi/reliability/benchmark There isn't much documentation there, but the script here: jdk/test/java/rmi/reliability/scripts/create_benchmark_jars.ksh shows how to create two JAR files, rmibench.jar and serialbench.jar from the sources. Running either JAR with "-h" prints a usage message. Each has a default config file that can be altered to customize which of the microbenchmarks to execute, how many repetitions of each to run, how much warmup to do, etc. The RMI suite can be run all in one VM or in two separate VMs, a "client" and a "server", possibly on different hosts. 
-- Peter

From cowwoc at bbs.darktech.org Mon Jan 7 18:57:19 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 7 Jan 2008 10:57:19 -0800 (PST)
Subject: Proposal for improving performance of TreeMap and others
Message-ID: <14673283.post@talk.nabble.com>

I noticed that TreeMap (and maybe other classes) requires a user to either pass in a Comparator or ensure that all keys implement Comparable. The TreeMap code then uses a utility method whenever it needs to compare two keys:

    /**
     * Compares two keys using the correct comparison method for this TreeMap.
     */
    final int compare(Object k1, Object k2) {
        return comparator == null ? ((Comparable<? super K>) k1).compareTo((K) k2)
                                  : comparator.compare((K) k1, (K) k2);
    }

The problem with the above method is that it checks whether comparator is null once per comparison instead of once when the TreeMap is constructed. Instead I propose that this check only take place once in the constructors and the rest of the code assume that a comparator exists. If a comparator is not provided then you can simply define one as follows:

    comparator = new Comparator<K>() {
        @SuppressWarnings("unchecked")
        public int compare(K first, K second) {
            return ((Comparable<K>) first).compareTo(second);
        }
    };

This solution should be backwards compatible while improving performance. At least, that's my guess. There is always the chance that the JIT is smart enough to optimize away this comparison but I'd rather not rely on JIT implementation details. I also believe the resulting code is more readable.

What do you think?
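The proposal can be tried outside the JDK with a small sketch like the one below. The class NaturalOrderDemo and its methods are hypothetical names used only for illustration; the sketch passes an explicit natural-ordering comparator to an ordinary TreeMap rather than modifying TreeMap itself.

```java
import java.util.Comparator;
import java.util.TreeMap;

class NaturalOrderDemo {
    // Hypothetical stand-in for the comparator the proposal would install
    // when the caller supplies none: it simply delegates to Comparable.
    @SuppressWarnings("unchecked")
    static <K> Comparator<K> naturalOrder() {
        return new Comparator<K>() {
            public int compare(K first, K second) {
                return ((Comparable<K>) first).compareTo(second);
            }
        };
    }

    // Builds a small map; a null comparator means TreeMap uses natural ordering.
    static TreeMap<String, Integer> build(Comparator<String> comparator) {
        TreeMap<String, Integer> map = new TreeMap<String, Integer>(comparator);
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("fig", 2);
        return map;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> implicit = build(null);
        TreeMap<String, Integer> explicit = build(NaturalOrderDemo.<String>naturalOrder());
        System.out.println("same iteration order: " + implicit.toString().equals(explicit.toString()));
        System.out.println("implicit comparator(): " + implicit.comparator());
        System.out.println("explicit comparator() is null: " + (explicit.comparator() == null));
    }
}
```

One visible difference is that TreeMap.comparator() is documented to return null for natural ordering, so an internal default comparator would presumably have to stay hidden from that accessor to remain backwards compatible.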
From linuxhippy at gmail.com Mon Jan 7 19:56:39 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 7 Jan 2008 20:56:39 +0100
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14673283.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com>
Message-ID: <194f62550801071156m7648135du57331ea91e26fa80@mail.gmail.com>

> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
>
> What do you think?

Performance-wise there's no (real) difference, I guess, but I have to agree that the code is more readable and cleaner. On the other hand, it's one more class that has to be shipped and loaded at startup.
I like your approach, I just don't know the real pros and cons...

lg Clemens

From Thomas.Hawtin at Sun.COM Mon Jan 7 19:59:53 2008
From: Thomas.Hawtin at Sun.COM (Thomas Hawtin)
Date: Mon, 07 Jan 2008 19:59:53 +0000
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14673283.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com>
Message-ID: <478284B9.6020603@Sun.COM>

cowwoc wrote:
> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:

I'm not going to comment about performance, but there is a problem with serialisation. TreeMap.comparator is final (and non-transient). TreeMaps serialised with earlier versions will be deserialised with a null comparator. So, comparator would either need to be made non-final or sun.misc.Unsafe used.
For the serialisation case, it would be necessary to change writeObject to use putFields rather than defaultWriteObject (not very nice, but not half as nasty as I originally thought). Tom Hawtin From Martin.Buchholz at Sun.COM Mon Jan 7 21:04:30 2008 From: Martin.Buchholz at Sun.COM (Martin Buchholz) Date: Mon, 07 Jan 2008 13:04:30 -0800 Subject: Proposal for improving performance of TreeMap and others In-Reply-To: <14673283.post@talk.nabble.com> References: <14673283.post@talk.nabble.com> Message-ID: <478293DE.9090600@sun.com> The authors of TreeMap have thought about eliding comparator null checks: /** * Version of getEntry using comparator. Split off from getEntry * for performance. (This is not worth doing for most methods, * that are less dependent on comparator performance, but is * worthwhile here.) */ final Entry getEntryUsingComparator(Object key) { K k = (K) key; Comparator cpr = comparator; if (cpr != null) { Entry p = root; while (p != null) { int cmp = cpr.compare(k, p.key); if (cmp < 0) p = p.left; else if (cmp > 0) p = p.right; else return p; } } return null; } As to whether using an explicit Comparator for the "natural ordering" is a performance improvement, this depends very much on the implementation of the JIT and the degree of polymorphism of the call site, and on the prevalance of TreeMaps using "natural ordering". At the very least, a null check is very cheap, so it is unlikely that the proposed change will be a significant performance improvement, while, on the other hand, there is a good chance that it will decrease performance for TreeMaps using "natural ordering". Aside: It's probably a good idea for the comparator for "natural ordering" to be available via some static method. Martin cowwoc wrote: > I noticed that TreeMap (and maybe other classes) require a user to either > pass in a Comparator or ensure that all keys must implement Comparable. 
The > TreeMap code then uses a utility method whenever it needs to compare two > keys: > > > /** > * Compares two keys using the correct comparison method for this TreeMap. > */ > final int compare(Object k1, Object k2) { > return comparator == null ? ((Comparable) k1) > .compareTo((K) k2) : comparator.compare((K) k1, (K) k2); > } > > The problem with the above method is that it checks whether comparator is > null once per comparison instead of once when the TreeMap is constructed. > Instead I propose that this check only take place once in the constructors > and the rest of the code assume that a comparator exists. If a comparator is > not provided then you can simply define one as follows: > > comparator = new Comparator() > { > @SuppressWarnings("unchecked") > public int compare(K first, K second) > { > return ((Comparable) first).compareTo(second); > } > }); > > This solution should be backwards compatible while improving performance. At > least, that's my guess. There is always the chance that the JIT is smart > enough to optimize away this comparison but I'd rather not rely on JIT > implementation details. I also believe the resulting code is more readable. > > What do you think? From nradov at axolotl.com Mon Jan 7 21:31:58 2008 From: nradov at axolotl.com (Nick Radov) Date: Mon, 7 Jan 2008 13:31:58 -0800 Subject: core classes still need to be declared final? Message-ID: Is it still necessary for the core Java classes such as java.lang.Integer to be declared final? I understand that may have been necessary in the early days for performance reasons, but modern JVMs no longer provide much of a performance benefit for final classes. For certain applications it would really be helpful to be able to subclass some of those core classes. For example, one application I'm working on deals with integer values that must be between 0 and 9999 inclusive. 
I would like to be able to create a custom Integer subclass which enforces that limit in the constructor, but currently that isn't possible. While I could create a new class that acts as a wrapper around Integer, the syntax would be much more awkward and that would also make it much more difficult to interface with other third-party classes.

Nick Radov | Research and Development Manager | Axolotl Corp
www.axolotl.com, d: 408.920.0800 x116, f: 408.920.0880
160 West Santa Clara St., Suite 1000, San Jose, CA, 95113

THE MARKET LEADER IN HEALTH INFORMATION EXCHANGE - PROVIDING PATIENT INFORMATION WHEN AND WHERE IT IS NEEDED.

The information contained in this e-mail transmission may contain confidential information. It is intended for the use of the addressee. If you are not the intended recipient, any disclosure, copying, or distribution of this information is strictly prohibited. If you receive this message in error, please inform the sender immediately and remove any record of this message.

From cowwoc at bbs.darktech.org Mon Jan 7 21:35:32 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 07 Jan 2008 16:35:32 -0500
Subject: core classes still need to be declared final?
In-Reply-To:
References:
Message-ID: <47829B24.2050506@bbs.darktech.org>

My understanding is that this has nothing to do with performance. Certain classes, such as String, are declared final for security reasons. In the case of Integer I would suggest using composition. It's not as nice but it'll work.

Gili

Nick Radov wrote:
>
> Is it still necessary for the core Java classes such as
> java.lang.Integer to be declared final? I understand that may have been
> necessary in the early days for performance reasons, but modern JVMs no
> longer provide much of a performance benefit for final classes. For
> certain applications it would really be helpful to be able to subclass
> some of those core classes.
> > For example, one application I'm working on deals with integer values > that must be between 0 and 9999 inclusive. I would like to be able to > create a custom Integer subclass which enforces that limit in the > constructor, but currently that isn't possible. While I could create a > new class that acts as a wrapper around Integer, the syntax would be > much more awkward and that would also make it much more difficult to > interface with other third-party classes. > > *Nick Radov | Research and Development Manager | Axolotl Corp* > www.axolotl.com , d: 408.920.0800 x116, f: > 408.920.0880 > 160 West Santa Clara St., Suite 1000, San Jose, CA, 95113 > > THE MARKET LEADER IN HEALTH INFORMATION EXCHANGE ? PROVIDING PATIENT > INFORMATION WHEN AND WHERE IT IS NEEDED. > > /The information contained in this e-mail transmission may contain > confidential information. It is intended for the use of the addressee. > If you are not the intended recipient, any disclosure, copying, or > distribution of this information is strictly prohibited. If you receive > this message in error, please inform the sender immediately and remove > any record of this message./ From cowwoc at bbs.darktech.org Mon Jan 7 22:00:33 2008 From: cowwoc at bbs.darktech.org (cowwoc) Date: Mon, 7 Jan 2008 14:00:33 -0800 (PST) Subject: Proposal for improving performance of TreeMap and others In-Reply-To: <478293DE.9090600@sun.com> References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> Message-ID: <14676918.post@talk.nabble.com> I guess you're right. It is probably as likely that the JIT will optimize away the null check as it is that it will optimize away the NullPointerException check. One exception, though, is when production systems run using -Xverify:none. In such a case, wouldn't my approach run faster? 
I still think that my proposed code is somehow more consistent/cleaner on a design-level but I guess that's just me :) As an aside, are there standard benchmarks for testing the impact of this change? I'd love to know whether it actually produces any performance difference in practice. Gili Martin Buchholz wrote: > > The authors of TreeMap have thought about > eliding comparator null checks: > > > /** > * Version of getEntry using comparator. Split off from getEntry > * for performance. (This is not worth doing for most methods, > * that are less dependent on comparator performance, but is > * worthwhile here.) > */ > final Entry getEntryUsingComparator(Object key) { > K k = (K) key; > Comparator cpr = comparator; > if (cpr != null) { > Entry p = root; > while (p != null) { > int cmp = cpr.compare(k, p.key); > if (cmp < 0) > p = p.left; > else if (cmp > 0) > p = p.right; > else > return p; > } > } > return null; > } > > As to whether using an explicit Comparator for the "natural ordering" > is a performance improvement, this depends very much on > the implementation of the JIT and the degree of polymorphism of > the call site, and on the prevalance of TreeMaps using "natural > ordering". At the very least, a null check is very cheap, so it is > unlikely that the proposed change will be a significant performance > improvement, while, on the other hand, there is a good chance that > it will decrease performance for TreeMaps using "natural ordering". > > Aside: It's probably a good idea for the comparator for > "natural ordering" to be available via some static method. > > Martin > > > cowwoc wrote: >> I noticed that TreeMap (and maybe other classes) require a user to either >> pass in a Comparator or ensure that all keys must implement Comparable. >> The >> TreeMap code then uses a utility method whenever it needs to compare two >> keys: >> >> >> /** >> * Compares two keys using the correct comparison method for this >> TreeMap. 
>> */ >> final int compare(Object k1, Object k2) { >> return comparator == null ? ((Comparable) k1) >> .compareTo((K) k2) : comparator.compare((K) k1, (K) k2); >> } >> >> The problem with the above method is that it checks whether comparator is >> null once per comparison instead of once when the TreeMap is constructed. >> Instead I propose that this check only take place once in the >> constructors >> and the rest of the code assume that a comparator exists. If a comparator >> is >> not provided then you can simply define one as follows: >> >> comparator = new Comparator() >> { >> @SuppressWarnings("unchecked") >> public int compare(K first, K second) >> { >> return ((Comparable) first).compareTo(second); >> } >> }); >> >> This solution should be backwards compatible while improving performance. >> At >> least, that's my guess. There is always the chance that the JIT is smart >> enough to optimize away this comparison but I'd rather not rely on JIT >> implementation details. I also believe the resulting code is more >> readable. >> >> What do you think? > > -- View this message in context: http://www.nabble.com/Proposal-for-improving-performance-of-TreeMap-and-others-tp14673283p14676918.html Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com. From linuxhippy at gmail.com Mon Jan 7 22:38:11 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Mon, 7 Jan 2008 23:38:11 +0100 Subject: Proposal for improving performance of TreeMap and others In-Reply-To: <14676918.post@talk.nabble.com> References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> <14676918.post@talk.nabble.com> Message-ID: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com> Hi cowwoc, > I guess you're right. It is probably as likely that the JIT will optimize > away the null check as it is that it will optimize away the > NullPointerException check. One exception, though, is when production > systems run using -Xverify:none. 
In such a case, wouldn't my approach run
> faster?

I don't think it will optimize the null check away; however, it is so cheap that it most likely will not carry any weight compared to all the other operations happening there. It's maybe 5 instructions compared to thousands or even more.
-Xverify:none only disables bytecode verification at class-loading time and has no influence (as far as I know) on the performance of the generated code.

> I still think that my proposed code is somehow more consistent/cleaner on a
> design-level but I guess that's just me :)

I also like it more, it's cleaner in my opinion :)

> As an aside, are there standard benchmarks for testing the impact of this
> change? I'd love to know whether it actually produces any performance
> difference in practice.

In my experience I would rather guess that you won't notice the change; the noise will be higher.

lg Clemens

From Martin.Buchholz at Sun.COM Mon Jan 7 22:51:58 2008
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Mon, 07 Jan 2008 14:51:58 -0800
Subject: core classes still need to be declared final?
In-Reply-To:
References:
Message-ID: <4782AD0E.6000206@sun.com>

Subclassability is a problem with "value-oriented" computing. If security or extreme reliability is a concern, then existing APIs that took Integers or Strings as arguments would have to make defensive copies on import or export, as they have to do with arrays today. Since existing classes depend on the immutability of Integers and Strings, these must forever remain non-subclassable, at least by untrusted application code.

Inheritance is the one cornerstone of object-oriented computing that has disappointed us, now that we have gained experience with it, since it seriously constrains the evolution of superclasses. Prefer composition to inheritance. Especially so with immutable "value" types.

For the particular case of range-restricted integers, I have some sympathy. It would be nice if the platform offered such things.
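A minimal sketch of the composition approach for the 0 to 9999 case, assuming a hypothetical class name BoundedInt and a factory method instead of a public constructor. It is an immutable wrapper around an int, not a subtype of Integer, so it cannot be passed where an Integer is required without an explicit conversion.

```java
final class BoundedInt implements Comparable<BoundedInt> {
    static final int MIN = 0;
    static final int MAX = 9999;

    private final int value;

    private BoundedInt(int value) {
        this.value = value;
    }

    // Factory that enforces the range once, at creation time.
    static BoundedInt of(int value) {
        if (value < MIN || value > MAX) {
            throw new IllegalArgumentException(
                    "value must be between " + MIN + " and " + MAX + ": " + value);
        }
        return new BoundedInt(value);
    }

    int intValue() {
        return value;
    }

    // Explicit conversion for third-party APIs that expect an Integer.
    Integer toInteger() {
        return Integer.valueOf(value);
    }

    @Override
    public int compareTo(BoundedInt other) {
        return Integer.compare(value, other.value);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BoundedInt && ((BoundedInt) obj).value == value;
    }

    @Override
    public int hashCode() {
        return value;
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}
```

Because the class is final and its only field is final, code that receives a BoundedInt can rely on the range check having happened, which is the same immutability argument made above for Integer and String.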
Martin

Nick Radov wrote:
>
> Is it still necessary for the core Java classes such as
> java.lang.Integer to be declared final? I understand that may have been
> necessary in the early days for performance reasons, but modern JVMs no
> longer provide much of a performance benefit for final classes. For
> certain applications it would really be helpful to be able to subclass
> some of those core classes.
>
> For example, one application I'm working on deals with integer values
> that must be between 0 and 9999 inclusive. I would like to be able to
> create a custom Integer subclass which enforces that limit in the
> constructor, but currently that isn't possible. While I could create a
> new class that acts as a wrapper around Integer, the syntax would be
> much more awkward and that would also make it much more difficult to
> interface with other third-party classes.
>
> *Nick Radov | Research and Development Manager | Axolotl Corp*
> www.axolotl.com , d: 408.920.0800 x116, f:
> 408.920.0880
> 160 West Santa Clara St., Suite 1000, San Jose, CA, 95113

From forax at univ-mlv.fr Mon Jan 7 23:38:34 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Tue, 08 Jan 2008 00:38:34 +0100
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> <14676918.post@talk.nabble.com> <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
Message-ID: <4782B7FA.60303@univ-mlv.fr>

Clemens Eisserer wrote:
> Hi cowwoc,
>
>
>> I guess you're right. It is probably as likely that the JIT will optimize
>> away the null check as it is that it will optimize away the
>> NullPointerException check. One exception, though, is when production
>> systems run using -Xverify:none. In such a case, wouldn't my approach run
>> faster?
>>
> I don't think it will optimize the null-check away,

HotSpot has removed explicit null checks and installed a signal handler instead since v2 (around 2000/01, if my memory serves me well).

> however it is so
> cheap that it most likely will not weight at all, compared to all the
> other operations happening there. Its maybe 5 instructions compared to
> thousands or even more.
> -Xverify:none only disables bytecode verification at class-loading
> time and has no influence (as far as I know) on the performance of the
> generated code.
>

Yes, and there is an option to remove null checks that is only available on debug VMs.

...

>
> lg Clemens
>

Rémi

From cowwoc at bbs.darktech.org Mon Jan 7 23:51:52 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 7 Jan 2008 15:51:52 -0800 (PST)
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> <14676918.post@talk.nabble.com> <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
Message-ID: <14679084.post@talk.nabble.com>

Something very weird is going on. I tried profiling a minimal testcase and there is a considerable amount of "missing time". I am using a dev build of Netbeans 6.1 and it says:

MyComparator.compare(Object, Object) 19670ms
\-> MyComparator.compare(Integer, Integer) 10229ms
\-> Self Time 3001ms
\-> Integer.compareTo(Integer) 1575ms
\-> Self Time 3788ms

I spot at least three problems:

1) The individual item times do not add up to the total (but they do for other stack-traces).
2) Comparator.compare() self-time consumes more CPU than Integer.compareTo() even though it only invokes a method while the latter does actual computation.
3) Why is extra time consumed moving from MyComparator.compare(Object, Object) to (Integer, Integer)? It looks like Generics is doing something at runtime which consumes a large amount of CPU.
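On point 3, the compare(Object, Object) frame is most likely the bridge method that javac generates for a class implementing Comparator<Integer>: it casts both arguments and delegates to compare(Integer, Integer). The sketch below makes that visible through reflection. The body of MyComparator is an assumption, since the profiled test case is not shown in the thread, and the sketch says nothing about why the profiler attributes so much time to that frame.

```java
import java.lang.reflect.Method;
import java.util.Comparator;

class BridgeDemo {
    // Assumed shape of the profiled class; only its name appears in the profiler output.
    static class MyComparator implements Comparator<Integer> {
        public int compare(Integer first, Integer second) {
            return first.compareTo(second);
        }
    }

    // Counts the declared compare methods that are (or are not) compiler-generated bridges.
    static int countCompareMethods(boolean bridge) {
        int count = 0;
        for (Method method : MyComparator.class.getDeclaredMethods()) {
            if (method.getName().equals("compare") && method.isBridge() == bridge) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Lists both compare methods and marks which one the compiler generated.
        for (Method method : MyComparator.class.getDeclaredMethods()) {
            System.out.println(method + " bridge=" + method.isBridge());
        }
    }
}
```

The bridge itself performs only two casts and a call, so a large self time reported for it is more plausibly a measurement artifact than real work done by generics at runtime.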
Gili Clemens Eisserer wrote: > > Hi cowwoc, > >> I guess you're right. It is probably as likely that the JIT will optimize >> away the null check as it is that it will optimize away the >> NullPointerException check. One exception, though, is when production >> systems run using -Xverify:none. In such a case, wouldn't my approach run >> faster? > I don't think it will optimize the null-check away, however it is so > cheap that it most likely will not weight at all, compared to all the > other operations happening there. Its maybe 5 instructions compared to > thousands or even more. > -Xverify:none only disables bytecode verification at class-loading > time and has no influence (as far as I know) on the performance of the > generated code. > >> I still think that my proposed code is somehow more consistent/cleaner on >> a >> design-level but I guess that's just me :) > I also like it more, its cleaner in my opinion :) > >> As an aside, are there standard benchmarks for testing the impact of this >> change? I'd love to know whether it actually produces any performance >> difference in practice. >>From my experience i would rather guess that you won't notice the > change, noise will be higher. > > lg Clemens > > -- View this message in context: http://www.nabble.com/Proposal-for-improving-performance-of-TreeMap-and-others-tp14673283p14679084.html Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com. From linuxhippy at gmail.com Tue Jan 8 15:10:20 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Tue, 8 Jan 2008 16:10:20 +0100 Subject: [PATCH] Performance bug in String(byte[],int,int,Charset) In-Reply-To: <474992D4.3010908@univ-mlv.fr> References: <47474D15.4060504@gmail.com> <474992D4.3010908@univ-mlv.fr> Message-ID: <194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com> Hello again, > By the way, using clone() seams better than Arrays.copyOf() here. > > byte[] b = ba.clone(); Why? 
I remember that I've seen some benchmarks where array.clone() was way slower than creating a new array and using System.arraycopy() (which is exactly what copyOf does). However this may have changed ;) lg Clemens From cowwoc at bbs.darktech.org Tue Jan 8 16:23:50 2008 From: cowwoc at bbs.darktech.org (cowwoc) Date: Tue, 08 Jan 2008 11:23:50 -0500 Subject: Proposal for improving performance of TreeMap and others In-Reply-To: <478391A0.3020901@sun.com> References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> <14676918.post@talk.nabble.com> <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com> <14679084.post@talk.nabble.com> <478391A0.3020901@sun.com> Message-ID: <4783A396.5020001@bbs.darktech.org> That's good news, I guess ;) because in my minimal testcase that had nothing to do with TreeMap it looked like using a Comparator to wrap natural ordering degraded performance by an order of magnitude... which is really bad :) If the same isn't true for the actual TreeMap this change might be worth considering for its code-cleanup potential. Gili charlie hunt wrote: > It's likely what you are observing in #2 & #3 and possibly in #1 also is > an artifact of inlining and possibly other JIT (dynamic) compiler > optimizations. > > You might consider re-running your experiment with inlining disabled, > -XX:-Inlining. > > Or, alternatively try running your experiment (with inlining enabled) > with Sun Studio Collector / Analyzer. Then, when viewing the results in > the Analyzer, filter (View > Filter Data), the samples so that you are > looking at a portion of samples after the code is warmed up. And, also > look at the results in machine view mode (View > Set Data Presentation > > Formats > View Mode > Machine). NOTE: In machine mode you can also > view the generated assembly code for each method. So, you can really > get down to the specifics of what's being executed. 
> > Fwiw, I did a comparison run of a TreeMap with your suggested changes > including removing "if (comparator == null)" checks with one of our > favorite SPEC benchmarks which does a pretty good job at exercising > TreeMap.compare(). Even with 18 degrees of freedom I found the changes > to have no significant improvement. I didn't look at, or compare the > generated assembly code for both versions TreeMap.compare(). Though > that might be kind of interesting. > > So, from a performance perspective, it appears this SPEC benchmark shows > no change in performance. > > hths, > > charlie ... > > cowwoc wrote: >> Something very weird is going on. I tried profiling a minimal testcase >> and >> there is a considerable amount of "missing time". I am using a dev >> build of >> Netbeans 6.1 and it says: >> >> MyComparator.compare(Object, Object) 19670ms >> \-> MyComparator.compare(Integer, Integer) 10229ms >> \-> Self Time 3001ms >> \-> Integer.compareTo(Integer) 1575ms >> \-> Self Time 3788ms >> >> I spot at least three problems: >> >> 1) The individual item times do not add up to the total (but they do for >> other stack-traces). >> 2) Comparator.compare() self-time consumes more CPU than >> Integer.compareTo() >> even though it only invokes a method while the latter does actual >> computation. >> 3) Why is extra time consumed moving from MyComparator.compare(Object, >> Object) to (Integer, Integer)? It looks like Generics is doing >> something at >> runtime which consumes a large amount of cpu. >> >> Gili >> >> >> Clemens Eisserer wrote: >> >>> Hi cowwoc, >>> >>> >>>> I guess you're right. It is probably as likely that the JIT will >>>> optimize >>>> away the null check as it is that it will optimize away the >>>> NullPointerException check. One exception, though, is when production >>>> systems run using -Xverify:none. In such a case, wouldn't my >>>> approach run >>>> faster? 
>>>> >>> I don't think it will optimize the null-check away, however it is so >>> cheap that it most likely will not weight at all, compared to all the >>> other operations happening there. Its maybe 5 instructions compared to >>> thousands or even more. >>> -Xverify:none only disables bytecode verification at class-loading >>> time and has no influence (as far as I know) on the performance of the >>> generated code. >>> >>> >>>> I still think that my proposed code is somehow more >>>> consistent/cleaner on >>>> a >>>> design-level but I guess that's just me :) >>>> >>> I also like it more, its cleaner in my opinion :) >>> >>> >>>> As an aside, are there standard benchmarks for testing the impact of >>>> this >>>> change? I'd love to know whether it actually produces any performance >>>> difference in practice. >>>> >>> >From my experience i would rather guess that you won't notice the >>> change, noise will be higher. >>> >>> lg Clemens >>> >>> >>> >> >> From charlie.hunt at sun.com Tue Jan 8 15:07:12 2008 From: charlie.hunt at sun.com (charlie hunt) Date: Tue, 08 Jan 2008 09:07:12 -0600 Subject: Proposal for improving performance of TreeMap and others In-Reply-To: <14679084.post@talk.nabble.com> References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com> <14676918.post@talk.nabble.com> <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com> <14679084.post@talk.nabble.com> Message-ID: <478391A0.3020901@sun.com> It's likely what you are observing in #2 & #3 and possibly in #1 also is an artifact of inlining and possibly other JIT (dynamic) compiler optimizations. You might consider re-running your experiment with inlining disabled, -XX:-Inlining. Or, alternatively try running your experiment (with inlining enabled) with Sun Studio Collector / Analyzer. Then, when viewing the results in the Analyzer, filter (View > Filter Data), the samples so that you are looking at a portion of samples after the code is warmed up. 
And, also look at the results in machine view mode (View > Set Data Presentation > Formats > View Mode > Machine). NOTE: In machine mode you can also view the generated assembly code for each method. So, you can really get down to the specifics of what's being executed. Fwiw, I did a comparison run of a TreeMap with your suggested changes including removing "if (comparator == null)" checks with one of our favorite SPEC benchmarks which does a pretty good job at exercising TreeMap.compare(). Even with 18 degrees of freedom I found the changes to have no significant improvement. I didn't look at, or compare the generated assembly code for both versions TreeMap.compare(). Though that might be kind of interesting. So, from a performance perspective, it appears this SPEC benchmark shows no change in performance. hths, charlie ... cowwoc wrote: > Something very weird is going on. I tried profiling a minimal testcase and > there is a considerable amount of "missing time". I am using a dev build of > Netbeans 6.1 and it says: > > MyComparator.compare(Object, Object) 19670ms > \-> MyComparator.compare(Integer, Integer) 10229ms > \-> Self Time 3001ms > \-> Integer.compareTo(Integer) 1575ms > \-> Self Time 3788ms > > I spot at least three problems: > > 1) The individual item times do not add up to the total (but they do for > other stack-traces). > 2) Comparator.compare() self-time consumes more CPU than Integer.compareTo() > even though it only invokes a method while the latter does actual > computation. > 3) Why is extra time consumed moving from MyComparator.compare(Object, > Object) to (Integer, Integer)? It looks like Generics is doing something at > runtime which consumes a large amount of cpu. > > Gili > > > Clemens Eisserer wrote: > >> Hi cowwoc, >> >> >>> I guess you're right. It is probably as likely that the JIT will optimize >>> away the null check as it is that it will optimize away the >>> NullPointerException check. 
One exception, though, is when production >>> systems run using -Xverify:none. In such a case, wouldn't my approach run >>> faster? >>> >> I don't think it will optimize the null-check away, however it is so >> cheap that it most likely will not weight at all, compared to all the >> other operations happening there. Its maybe 5 instructions compared to >> thousands or even more. >> -Xverify:none only disables bytecode verification at class-loading >> time and has no influence (as far as I know) on the performance of the >> generated code. >> >> >>> I still think that my proposed code is somehow more consistent/cleaner on >>> a >>> design-level but I guess that's just me :) >>> >> I also like it more, its cleaner in my opinion :) >> >> >>> As an aside, are there standard benchmarks for testing the impact of this >>> change? I'd love to know whether it actually produces any performance >>> difference in practice. >>> >> >From my experience i would rather guess that you won't notice the >> change, noise will be higher. >> >> lg Clemens >> >> >> > > From Martin.Buchholz at Sun.COM Wed Jan 9 06:01:11 2008 From: Martin.Buchholz at Sun.COM (Martin Buchholz) Date: Tue, 08 Jan 2008 22:01:11 -0800 Subject: [PATCH] Performance bug in String(byte[],int,int,Charset) In-Reply-To: <194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com> References: <47474D15.4060504@gmail.com> <474992D4.3010908@univ-mlv.fr> <194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com> Message-ID: <47846327.5060901@sun.com> The slowness of array.clone() has been fixed as of jdk6 and 5.0u6. 
6428387: array clone() much slower than Arrays.copyOf
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6428387

The rest of this message is the latest version of my private
microbenchmark to measure the fix:

import java.util.*;

public class ArrayCopyMicroBenchmark {
    abstract static class Job {
        private final String name;
        Job(String name) { this.name = name; }
        String name() { return name; }
        abstract void work() throws Throwable;
    }

    private static void collectAllGarbage() {
        try {
            for (int i = 0; i < 2; i++) {
                System.gc();
                System.runFinalization();
                Thread.sleep(10);
            }
        } catch (InterruptedException e) { throw new Error(e); }
    }

    /**
     * Runs each job for long enough that all the runtime compilers
     * have had plenty of time to warm up, i.e. get around to
     * compiling everything worth compiling.
     * Returns array of average times per job per run.
     */
    private static long[] time0(Job ... jobs) throws Throwable {
        final long warmupNanos = 10L * 1000L * 1000L * 1000L;
        long[] nanoss = new long[jobs.length];
        for (int i = 0; i < jobs.length; i++) {
            collectAllGarbage();
            long t0 = System.nanoTime();
            long t;
            int j = 0;
            do { jobs[i].work(); j++; }
            while ((t = System.nanoTime() - t0) < warmupNanos);
            nanoss[i] = t/j;
        }
        return nanoss;
    }

    private static void time(Job ... jobs) throws Throwable {

        long[] warmup = time0(jobs); // Warm up run
        long[] nanoss = time0(jobs); // Real timing run

        final String nameHeader = "Method";
        int nameWidth = nameHeader.length();
        for (Job job : jobs)
            nameWidth = Math.max(nameWidth, job.name().length());

        final String millisHeader = "Millis";
        int millisWidth = millisHeader.length();
        for (long nanos : nanoss)
            millisWidth =
                Math.max(millisWidth,
                         String.format("%d", nanos/(1000L * 1000L)).length());

        final String ratioHeader = "Ratio";
        int ratioWidth = ratioHeader.length();

        String format = String.format("%%-%ds %%%dd %%.3f%%n",
                                      nameWidth, millisWidth);
        String headerFormat = String.format("%%-%ds %%-%ds %%-%ds%%n",
                                            nameWidth, millisWidth, ratioWidth);
        System.out.printf(headerFormat, "Method", "Millis", "Ratio");

        // Print out absolute and relative times, calibrated against first job
        for (int i = 0; i < jobs.length; i++) {
            long millis = nanoss[i]/(1000L * 1000L);
            double ratio = (double)nanoss[i] / (double)nanoss[0];
            System.out.printf(format, jobs[i].name(), millis, ratio);
        }
    }

    private static int intArg(String[] args, int i, int defaultValue) {
        return args.length > i ? Integer.parseInt(args[i]) : defaultValue;
    }

    private static void deoptimize(Object[] a) {
        for (Object x : a)
            if (x == null) throw new Error();
    }

    public static void main(String[] args) throws Throwable {
        final int iterations = intArg(args, 0, 100000);
        final int size = intArg(args, 1, 1000);
        final Object[] array = new Object[size];
        final Random rnd = new Random();
        for (int i = 0; i < array.length; i++)
            array[i] = rnd.nextInt(size);

        time(
            new Job("arraycopy") { void work() {
                Object[] a = array;
                for (int i = 0; i < iterations; i++) {
                    Object[] t = new Object[size];
                    System.arraycopy(a, 0, t, 0, size);
                    a = t;}
                deoptimize(a);}},
            new Job("copyOf") { void work() {
                Object[] a = array;
                for (int i = 0; i < iterations; i++)
                    a = Arrays.copyOf(a, size);
                deoptimize(a);}},
            new Job("clone") { void work() {
                Object[] a = array;
                for (int i = 0; i < iterations; i++)
                    a = a.clone();
                deoptimize(a);}},
            new Job("loop") { void work() {
                Object[] a = array;
                for (int i = 0; i < iterations; i++) {
                    Object[] t = new Object[size];
                    for (int j = 0; j < size; j++)
                        t[j] = a[j];
                    a = t;}
                deoptimize(a);}}
        );
    }
}

Clemens Eisserer wrote:
> Hello again,
>
>>By the way, using clone() seams better than Arrays.copyOf() here.
>>
>>byte[] b = ba.clone();
>
> Why? I remember that I've seen some benchmarks where array.clone() was
> way slower than creating a new array and using System.arraycopy()
> (which is exactly what copyOf does).
However this may have changed ;) > > lg Clemens From linuxhippy at gmail.com Wed Jan 9 11:53:28 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Wed, 9 Jan 2008 12:53:28 +0100 Subject: Performance regression in java.util.zip.Deflater In-Reply-To: <476B1916.2060502@sun.com> References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com> <476ADDAF.2070409@sun.com> <194f62550712201336y3380808bv3726d891873be277@mail.gmail.com> <476AEDCD.6080504@sun.com> <194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com> <476B0ABA.6030102@sun.com> <194f62550712201702n6f44efd5hda27c397e8d1ce96@mail.gmail.com> <476B1916.2060502@sun.com> Message-ID: <194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com> Hi again, I've finished a very early draft of the native stride+copy implementation of Deflater. Its still very early and is not tested a lot (so don't wory about I would think this should go in as is ;) ), but seems to perform quite well. I just post it ... well ... to get some critics and advises ;) I don't like the code as its far too messy in my opinion, maybe somebody has better ideas to clean it up. Furthermore I don't know wether it breaks corner-cases. lg Clemens -------------- next part -------------- A non-text attachment was scrubbed... Name: Deflater.java Type: application/octet-stream Size: 14566 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Deflater.c Type: application/octet-stream Size: 10878 bytes Desc: not available URL: From linuxhippy at gmail.com Wed Jan 9 12:23:41 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Wed, 9 Jan 2008 13:23:41 +0100 Subject: Performance regression in java.util.zip.Deflater In-Reply-To: <194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com> References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com> <476ADDAF.2070409@sun.com> <194f62550712201336y3380808bv3726d891873be277@mail.gmail.com> <476AEDCD.6080504@sun.com> <194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com> <476B0ABA.6030102@sun.com> <194f62550712201702n6f44efd5hda27c397e8d1ce96@mail.gmail.com> <476B1916.2060502@sun.com> <194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com> Message-ID: <194f62550801090423s2cd83a1aia4c81541c1e28c04@mail.gmail.com> Sorry, sent the wrong files and found some bugs. Will re-send the updated files soon -sorry for the traffic. lg Clemens 2008/1/9, Clemens Eisserer : > Hi again, > > I've finished a very early draft of the native stride+copy > implementation of Deflater. > Its still very early and is not tested a lot (so don't wory about I > would think this should go in as is ;) ), but seems to perform quite > well. > I just post it ... well ... to get some critics and advises ;) > > I don't like the code as its far too messy in my opinion, maybe > somebody has better ideas to clean it up. Furthermore I don't know > wether it breaks corner-cases. > > lg Clemens > > From linuxhippy at gmail.com Wed Jan 9 15:11:47 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Wed, 9 Jan 2008 16:11:47 +0100 Subject: Early version of striding Deflater Message-ID: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com> Hello again, I've finished an early version of the java.util.zip.Deflater implementation that uses striding. Its in an early stage and quite likely will be buggy. 
It passes FlaterTest and a simple test written by myself, but maybe
acts differently in corner-cases.

I would be happy to receive some comments as well as criticism ;)

lg Clemens

PS: Sorry for the traffic lately.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.c
Type: application/octet-stream
Size: 7184 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.java
Type: application/octet-stream
Size: 11495 bytes
Desc: not available
URL:

From linuxhippy at gmail.com Thu Jan 10 02:41:36 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Thu, 10 Jan 2008 03:41:36 +0100
Subject: 6539727 is no bug
Message-ID: <194f62550801091841h6ff93771ic616ca17ba39dee7@mail.gmail.com>

Hi again,

While going through the bug database I noticed that 6539727 is no bug -
the code is just misusing Deflater.
The old Deflater implementation did/does not return any bytes when
deflateParams() has to be called; on a second call, deflate() is called
again and further data is processed.

The problem is that the reporter expects Deflater to always return
bytes, without calling finished().

lg Clemens

From David.Bristor at Sun.COM Fri Jan 11 20:51:04 2008
From: David.Bristor at Sun.COM (Dave Bristor)
Date: Fri, 11 Jan 2008 12:51:04 -0800
Subject: Early version of striding Deflater
In-Reply-To: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>
References: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>
Message-ID: <4787D6B8.5030805@sun.com>

Hi Clemens,

Thanks for the code drop! I was not yet able to spend much Quality Time
with it. At a glance, it seems like the fix might work. The general ideas
seem sound, and I appreciate your effort in addressing it. So please don't
take the feedback below as anything but encouragement to help get this bug
fixed!

That said...it is a low priority bug (for us, anyway; a P4 out of 5) and we
have bigger fish to fry. I'll attend to it as time permits...

Then too, there are some issues to resolve re the provided code. I know
it's an "Early version", but some changes to it would make it easier to
examine. For example, Deflater.java was completely reformatted. When
diffing the code, every change of indentation, javadoc removal, spaces to
tabs, moving of {, etc. shows up. We don't allow such changes into the JDK
sources. Spaces only, and despite whatever awful "standards" (or not!) are
already in use in the file, please stick to them. Make every change be the
best/smallest one which directly addresses the bug. This makes it easier
for all concerned to examine the relevant changes. There are similar
issues in Deflater.c.

The files provided lack the GNU copyright file headers. My guess is that
they originated in the src.zip of a binary distribution. Regardless of
their source, could you please instead use files from the Mercurial
repository at hg.openjdk.java.net/jdk7/tl?

The above will make it easier to review future changes to the files.

Enough of that boring stuff, here's some feedback on the "interesting"
part of the changes!

In Deflater.java, new method rangeCheck() is used in a couple of places,
but it is not an adequate replacement for the code previously in
Deflater.setDictionary(); it incompatibly changes the error condition
checking semantics (we can't omit the check on strm). We strive to not
introduce incompatible changes, even small ones like this.

Another incompatible change is to the semantics of
Deflater.deflate()...I think. In Deflater.c, it seems that deflateBytes
will use setParams if necessary and then compress...I reference your email
about 6539727 in this regard (You are completely right about that being a
non-bug, BTW and I'll update it shortly: thanks!) I think your solution
would have been the Right Thing to do way back when, but we don't want to
make an incompatible change now.
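
As background to the 6539727 point, here is a minimal sketch of a caller
that does not assume every deflate() call returns bytes, even when the
compression level changes mid-stream. The class and method names
(DeflateLoopDemo, compress, decompress) are hypothetical and appear in
neither the bug report nor the patch; only the public java.util.zip API is
used, and the loop relies on needsInput() and finished() rather than on the
returned byte count.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateLoopDemo {

    /** Compresses two chunks, raising the compression level between them. */
    static byte[] compress(byte[] first, byte[] second) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            deflater.setInput(first);
            while (!deflater.needsInput()) {
                int n = deflater.deflate(buf); // n may legitimately be 0
                out.write(buf, 0, n);
            }

            deflater.setLevel(Deflater.BEST_COMPRESSION);
            deflater.setInput(second);
            deflater.finish();
            // Keep calling deflate() until the stream reports completion;
            // a zero-length result on its own does not mean "done".
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Inflates a complete zlib stream produced by compress(). */
    static byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            inflater.setInput(data);
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] a = "first chunk, first chunk, ".getBytes(StandardCharsets.UTF_8);
        byte[] b = "second chunk, second chunk".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(a, b);
        System.out.println(new String(decompress(packed), StandardCharsets.UTF_8));
    }
}
```

Written this way, the caller behaves the same whether or not a given
deflate() call happens to produce output.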
(I haven't reviewed this thoroughly enough; see my notes re formatting & priorities above.) With a change of this sort, we really do need tests along with a fix. Have you started writing any test cases? Finally, it seems that this solution obviates the need for the striding in DeflaterOutputStream...IIRC, that is some of the original motivation for this work. If you have suggested changes to that class as well, please include them. I appreciate the work you've put in, and again, I hope to not dissuade you. But we have certain standards to which we must adhere, and it's a not a very high priority for us now, so we have to minimize the time we spend on it. Thanks, Dave Clemens Eisserer wrote: > Hello again, > > I've finished an early version of the java.util.zip.Deflater > implementation that uses striding. Its in an early stage and quite > likely will be buggy. > It passes FlaterTest and a simple test written by myself, but maybe > acts differently in corner-cases. > > I would be happy to receive some comments as well as criticism ;) > > lg Clemens > > PS: Sorry for the traffic lately. From linuxhippy at gmail.com Sun Jan 13 23:32:47 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Mon, 14 Jan 2008 00:32:47 +0100 Subject: Early version of striding Deflater In-Reply-To: <4787D6B8.5030805@sun.com> References: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com> <4787D6B8.5030805@sun.com> Message-ID: <194f62550801131532v4a3b443bt550beb6bd34549cb@mail.gmail.com> Hi Dave, Thanks a lot for your reply. To make it short: Of course I understand that this is low-priority (also for me, its a fun-only fix because someone in forums.java.net mentioned it) so don't hurry. Sorry that I wasted your time with my messy files, they were taken from my "playground" thats why they were in such a bad shape - they were only intended to give an idea which "road" I was taking. 
I attached the new files taken from the mercurial repositories and only modified at the affected places. > With a change of this sort, we really do need tests along with a fix. Have > you started writing any test cases? I completly agree - I have some simple test-cases which test more or less only very basic functionality of Deflater and they work well (also FlatterTest passes). I'll write some more tests which test exotic use-cases like changing compression-level, ... during compression. I have some open questions: 1.) Is the seperate structure approach to hold the stride-buffers ok? 2.) Any suggestions for the following names: 1. strm-field in class (defAdr), 2. defAdr-parameter,3. defptr - long_to_ptr of defAdr, 4. def_data - name of the structure 3.) I am not really used to program in C. Are the adress-operations ok which I used to get members of the new struct def_data? Thanks for your patience, lg Clemens Some notes, and changes in ramdom order: * Changed deflate-bytes to the old behaviour to return after the call to deflateParams * Verified that its ok to call deflateParams when there's not enough space in the output-buffer to flush all "old" data out (thanks to Mark Adler) * I changed the method-signiture of the native method compared to original, because some variables were read from JNI-code, whereas they could have been passed simply down using method parameters. I think its "cleaner" to pass it. * Allocation of the stride-buffers together with the z_stream structure. z_stream is really large, so the two stride-buffers should not add that much overhead. However this has the advantage of not mallocing/freeing and also beeing able to fill the input-stride-buffer once for several calls of the native method. * Renamed the strm-adress-parameter to defadr, because it no longer really points to a strm. I did not rename the java field "strm" because I did not have an idea for a proper name. 
* Removed striding from DeflaterOutputStream, (looked how code looked in 1.4.2). -------------- next part -------------- A non-text attachment was scrubbed... Name: Deflater.java Type: application/octet-stream Size: 14312 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Deflater.c Type: application/octet-stream Size: 8251 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: DeflaterOutputStream.java Type: application/octet-stream Size: 5643 bytes Desc: not available URL: From roman.kennke at aicas.com Mon Jan 21 21:12:51 2008 From: roman.kennke at aicas.com (Roman Kennke) Date: Mon, 21 Jan 2008 22:12:51 +0100 Subject: Null-terminated Unicode strings in java.io on Windows Message-ID: <1200949971.6264.48.camel@mercury> Hi, I'm trying to understand a piece of code in java.io . Let me try to explain: When you look into WinNTFileSystem.c in the method getBooleanAttributes(), you see that the file object is converted to a WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename string is extracted from the File object, and passed to pathToNTPath(). This is where it gets interesting. The pathToNTPath() function first converts the string into a jchar* using the macro WITH_UNICODE_STRING. This macro uses GetStringChars() to do this conversion. Now this is where I'm lost. Java strings are not null-terminated, and neither are the jchar* returned by GetStringChars() (which is in itself a long discussed problem in the JNI spec, but that's another story). But back in pathToNTPath() this jchar* is treated just like a null-terminated string, for example, we call wcslen() to determine its length, which relies on the string beeing null-terminated. Now I assume that this works somehow, and I only see the following options: 1. There's something in this picture that I don't see. Maybe the string ends up null-terminated somehow? 2. 
Maybe this works by accident because Hotspot terminates strings with
a null internally?
3. Or this is a serious bug, that for some reason doesn't bomb all the
time. After all, it _does_ bomb in the JamaicaVM, where I'm trying to
port the code to...

Any ideas? I'd be happy to get an explanation for this problem.

Cheers, Roman

--
Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org
aicas Allerton Interworks Computer Automated Systems GmbH
Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany
http://www.aicas.com * Tel: +49-721-663 968-0
USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe
Geschäftsführer: Dr. James J. Hunt

From Alan.Bateman at Sun.COM Mon Jan 21 21:52:09 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Mon, 21 Jan 2008 21:52:09 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1200949971.6264.48.camel@mercury>
References: <1200949971.6264.48.camel@mercury>
Message-ID: <47951409.70805@sun.com>

Roman Kennke wrote:
> Hi,
>
> I'm trying to understand a piece of code in java.io . Let me try to
> explain:
>
> When you look into WinNTFileSystem.c in the method
> getBooleanAttributes(), you see that the file object is converted to a
> WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
> string is extracted from the File object, and passed to pathToNTPath().
>
> This is where it gets interesting. The pathToNTPath() function first
> converts the string into a jchar* using the macro WITH_UNICODE_STRING.
> This macro uses GetStringChars() to do this conversion. Now this is
> where I'm lost. Java strings are not null-terminated, and neither are
> the jchar* returned by GetStringChars() (which is in itself a long
> discussed problem in the JNI spec, but that's another story). But back
> in pathToNTPath() this jchar* is treated just like a null-terminated
> string, for example, we call wcslen() to determine its length, which
> relies on the string beeing null-terminated.
> Now I assume that this
> works somehow, and I only see the following options:
> 1. There's something in this picture that I don't see. Maybe the string
> ends up null-terminated somehow?
> 2. Maybe this works by accident because Hotspot terminates strings with
> a null internally?
> 3. Or this is a serious bug, that for some reason doesn't bomb all the
> time. After all, it _does_ bomb in the JamaicaVM, where I'm trying to
> port the code to...
>
> Any ideas? I'd be happy to get an explanation for this problem.
>
> Cheers, Roman
>

The GetStringChars implementation in HotSpot always returns a copy that
is length+1 and zero terminated. There is a long-standing bug to clarify
the JNI specification on this topic. I believe it should say that the
returned array of Unicode characters is not required to be zero
terminated and that one should use GetStringLength to determine the
length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI
spec and may wish to comment. In any case, I did a quick cscope and
aside from java.io, it only appears to impact a small number of places.

-Alan.

From roman.kennke at aicas.com Mon Jan 21 22:01:04 2008
From: roman.kennke at aicas.com (Roman Kennke)
Date: Mon, 21 Jan 2008 23:01:04 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <47951409.70805@sun.com>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
Message-ID: <1200952864.6264.53.camel@mercury>

Hi Alan,

On Monday, 21.01.2008, at 21:52 +0000, Alan Bateman wrote:
> Roman Kennke wrote:
> > Hi,
> >
> > I'm trying to understand a piece of code in java.io . Let me try to
> > explain:
> >
> > When you look into WinNTFileSystem.c in the method
> > getBooleanAttributes(), you see that the file object is converted to a
> > WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
> > string is extracted from the File object, and passed to pathToNTPath().
> >
> > This is where it gets interesting.
The pathToNTPath() function first > > converts the string into a jchar* using the macro WITH_UNICODE_STRING. > > This macro uses GetStringChars() to do this conversion. Now this is > > where I'm lost. Java strings are not null-terminated, and neither is > > the jchar* returned by GetStringChars() (which is in itself a long > > discussed problem in the JNI spec, but that's another story). But back > > in pathToNTPath() this jchar* is treated just like a null-terminated > > string, for example, we call wcslen() to determine its length, which > > relies on the string being null-terminated. Now I assume that this > > works somehow, and I only see the following options: > > 1. There's something in this picture that I don't see. Maybe the string > > ends up null-terminated somehow? > > 2. Maybe this works by accident because HotSpot terminates strings with > > a null internally? > > 3. Or this is a serious bug, that for some reason doesn't bomb all the > > time. After all, it _does_ bomb in the JamaicaVM, which I'm trying to > > port the code to... > > > > Any ideas? I'd be happy to get an explanation for this problem. > > > > Cheers, Roman > > > The GetStringChars implementation in HotSpot always returns a copy that > is length+1 and zero terminated. There is a long-standing bug to clarify > the JNI specification on this topic. I believe it should say that the > returned array of Unicode characters is not required to be zero > terminated and that one should use GetStringLength to determine the > length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI > spec and may wish to comment. In any case, I did a quick cscope and > aside from java.io, it only appears to impact a small number of places. So this is indeed a bug, right? Do you think it makes sense to go out and fix it? /Roman -- Dipl.-Inform.
(FH) Roman Kennke, Software Engineer, http://kennke.org aicas Allerton Interworks Computer Automated Systems GmbH Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany http://www.aicas.com * Tel: +49-721-663 968-0 USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe Geschäftsführer: Dr. James J. Hunt From Tim.Bell at Sun.COM Mon Jan 21 22:45:06 2008 From: Tim.Bell at Sun.COM (Tim Bell) Date: Mon, 21 Jan 2008 14:45:06 -0800 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1200952864.6264.53.camel@mercury> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> Message-ID: <47952072.2060600@sun.com> Alan Bateman wrote (about GetStringChars): > [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic. I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length. Roman Kennke wrote: > So this is indeed a bug, right? Do you think it makes sense to go out and fix it? I'd start here: 4616318 Spec for JNI's GetStringChars() is incomplete http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318 HTH - Tim From roman.kennke at aicas.com Mon Jan 21 22:57:42 2008 From: roman.kennke at aicas.com (Roman Kennke) Date: Mon, 21 Jan 2008 23:57:42 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <47952072.2060600@sun.com> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> Message-ID: <1200956262.6264.65.camel@mercury> Hi, On Monday, 21.01.2008, at 14:45 -0800, Tim Bell wrote: > Alan Bateman wrote (about GetStringChars): > > > [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic.
I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length. > > Roman Kennke wrote: > > > So this is indeed a bug, right? Do you think it makes sense to go out and fix it? > > I'd start here: > > 4616318 Spec for JNI's GetStringChars() is incomplete > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318 Hmm, I'm not talking about fixing the spec (I've read that bug report while searching for clarification on the spec actually). When the spec doesn't tell _that_ the returned array is zero terminated, I think we should assume that it isn't (and it seems to be the trend that the spec should be clarified by saying that an implementation isn't required to return a zero-terminated array, but this is only speculation). What I'm asking is, should we fix the java.io C code to deal with non-zero-terminated jchar arrays? Unfortunately, this probably means allocating additional buffers, because we really need zero terminated strings here (AFAICS). /Roman -- Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org aicas Allerton Interworks Computer Automated Systems GmbH Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany http://www.aicas.com * Tel: +49-721-663 968-0 USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe Geschäftsführer: Dr. James J.
Hunt From program.spe at home.pl Tue Jan 22 07:35:59 2008 From: program.spe at home.pl (Krzysztof Żelechowski) Date: Tue, 22 Jan 2008 08:35:59 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1200956262.6264.65.camel@mercury> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> Message-ID: <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> On 21-01-2008, Mon at 23:57 +0100, Roman Kennke writes: > Hi, > > On Monday, 21.01.2008, at 14:45 -0800, Tim Bell wrote: > > Alan Bateman wrote (about GetStringChars): > > > > > [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic. I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length. > > > > Roman Kennke wrote: > > > > > So this is indeed a bug, right? Do you think it makes sense to go out and fix it? > > > > I'd start here: > > > > 4616318 Spec for JNI's GetStringChars() is incomplete > > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318 > > Hmm, I'm not talking about fixing the spec (I've read that bug report > while searching for clarification on the spec actually). When the spec > doesn't tell _that_ the returned array is zero terminated, I think we > should assume that it isn't (and it seems to be the trend that the spec > should be clarified by saying that an implementation isn't required to > return a zero-terminated array, but this is only speculation). What I'm > asking is, should we fix the java.io C code to deal with > non-zero-terminated jchar arrays? Unfortunately, this probably means > allocating additional buffers, because we really need zero terminated > strings here (AFAICS).
If the specification gets fixed so that GSC result MUST be z-term, your VM will cease being conformant so it will be fixed and no additional buffers will be needed. Chris From forax at univ-mlv.fr Thu Jan 24 12:05:44 2008 From: forax at univ-mlv.fr (Rémi Forax) Date: Thu, 24 Jan 2008 13:05:44 +0100 Subject: Selector cleanup Message-ID: <47987F18.4040309@univ-mlv.fr> Hi all, I am currently developing a small web server, and I think the code related to selectors can be improved just by changing some small pieces of code. To be crystal clear, I don't want to re-implement all the selector-related stuff, just patch some parts of the current code. There are some allocations in the JDK API that can be removed; the code was badly retrofitted to 1.5 and a lot of fields can be declared final. Some methods/fields still use raw types and don't take advantage of autoboxing. Furthermore, there is some divergence between the Windows and *nix code that I don't understand. For example, WindowsSelectorImpl and PollSelectorImpl both use a pipe to implement wakeup, but WindowsSelectorImpl relies on Pipe and PollSelectorImpl on IOUtil.initPipe(). I think this code should be the same. in WindowsSelectorImpl: - updateSelectedKeys() uses an iterator to traverse the array (an ArrayList). It should use an indexed loop instead to avoid Iterator allocation. - field threads should be declared as an ArrayList because adjustThreadsCount() supposes that it can be iterated using an indexed loop. Furthermore, it can be generified like this: private final ArrayList threads = new ArrayList(); - class FdMap, I don't see why FdMap needs to be a class; all methods can be moved as member methods of WindowsSelectorImpl without problems. Furthermore, the constructor of FdMap is private (get/put/remove too) so the compiler stupidly inserts accessor methods (access$000 etc.).
OK, the main point here: when the code was retrofitted to 1.5, new Integer() was not changed to Integer.valueOf(), which would share small integers and avoid allocation when file descriptor values are small. - In class MapEntry, ski should be declared final. - close(): setting selectedKeys to null doesn't allow the Set to be collected because publicSelectedKeys contains a reference to it. in PollSelectorImpl: - interruptLock should be final. - close(), see WindowsSelectorImpl in EPollSelectorImpl: - like in poll, interruptLock should be final. - HashMap fdToKey should be generified and final. - close(), see WindowsSelectorImpl - implRegister/implDereg - They should use Integer.valueOf() instead of new Integer(). - IOUtil.fdVal() is used spuriously, in implRegister but not in implDereg. - EPollArrayWrapper - updateList is a LinkedList, a doubly linked list that stores Updator objects; I think it's more efficient to add a field next in the Updator object and link the updators by hand in order to avoid creating LinkedList$Entry. - Updator.opcode and Updator.fd should be declared final. - SelectorImpl: key and selectedKeys should be LinkedHashSet instead of Set because they are frequently iterated. Let's discuss this before I submit patches. Rémi From Alan.Bateman at Sun.COM Thu Jan 24 13:11:42 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Thu, 24 Jan 2008 13:11:42 +0000 Subject: Selector cleanup In-Reply-To: <47987F18.4040309@univ-mlv.fr> References: <47987F18.4040309@univ-mlv.fr> Message-ID: <47988E8E.3010403@sun.com> Rémi Forax wrote: > Hi all, I am currently developing a small web server, and I think the code related > to selectors can be improved just by changing some small pieces of code. > To be crystal clear, I don't want to re-implement all the selector-related > stuff, > just patch some parts of the current code. > > There are some allocations in the JDK API that can be removed; > the code was badly retrofitted to 1.5 and a lot of fields can be declared > final.
> Some methods/fields still use raw types and don't take > advantage of autoboxing. You're right. Much of the code here dates back to 1.4 and we haven't gone back to clean-up things like this. > > Furthermore, there is some divergence between the Windows and *nix > code that I don't understand. > For example, WindowsSelectorImpl and PollSelectorImpl both use a pipe to > implement wakeup, but WindowsSelectorImpl relies on Pipe > and PollSelectorImpl on IOUtil.initPipe(). > I think this code should be the same. Ideally we would use a socketpair for the wakeup mechanism but Windows doesn't support it. For this reason, Pipe is implemented as a loopback connection and this works okay for the wakeup mechanism too. One thing to mention is that PollSelectorImpl is only used now when running on the Linux 2.4 kernel (it's not used with the 2.6 kernel and isn't used on Solaris). I just mention this as someday it might become obsolete and we can remove it. > > in WindowsSelectorImpl: > - updateSelectedKeys() uses an iterator to traverse the array > (an ArrayList). It should use an indexed loop instead > to avoid Iterator allocation. > - field threads should be declared as an ArrayList > because adjustThreadsCount() supposes that it can be iterated > using an indexed loop. > Furthermore, it can be generified like this: > private final ArrayList threads = new ArrayList(); These clean-ups seem reasonable. > - class FdMap, > I don't see why FdMap needs to be a class; all methods can be moved > as member methods of WindowsSelectorImpl without problems. > Furthermore, the constructor of FdMap is private (get/put/remove too) > so the compiler stupidly inserts accessor methods (access$000 etc.). > OK, the main point here: when the code was retrofitted to 1.5, > new Integer() was not changed to Integer.valueOf(), > which would share small integers and avoid allocation when file descriptor > values are small.
These are SOCKET types rather than file descriptors and unlikely to be in the range that Integer caches (actually it should be a Long but that is a story for another day). > - In class MapEntry, ski should be declared final. > - close(): setting selectedKeys to null doesn't allow the Set to be > collected > because publicSelectedKeys contains a reference to it. > > in PollSelectorImpl: > - interruptLock should be final. > - close(), see WindowsSelectorImpl > > in EPollSelectorImpl: > - like in poll, interruptLock should be final. > - HashMap fdToKey should be generified and final. > - close(), see WindowsSelectorImpl > - implRegister/implDereg > - They should use Integer.valueOf() instead of new Integer(). > - IOUtil.fdVal() is used spuriously, in implRegister but not > in implDereg. These are integers so there could be some benefit (but probably very hard to measure). > > - EPollArrayWrapper > - updateList is a LinkedList, a doubly linked list that stores > Updator objects; > I think it's more efficient to add a field next in the Updator > object and > link the updators by hand in order to avoid creating LinkedList$Entry. Maybe but probably very hard to measure. > - Updator.opcode and Updator.fd should be declared final. > - SelectorImpl: > key and selectedKeys should be LinkedHashSet instead of Set > because they are frequently iterated. > Let's discuss this before I submit patches. The clean-ups you suggest seem reasonable so I would suggest going ahead and sending a patch. I'm happy to review and work with you to get the clean-ups integrated (once OpenJDK/jdk7 re-opens for changes of course). -Alan. PS: I don't know anything about your "small web server" but the simple server in com.sun.net.httpserver may be useful.
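The Integer.valueOf() point in the exchange above can be illustrated with a minimal sketch. It relies only on the documented behaviour that Integer.valueOf() always caches values from -128 to 127 and may, but need not, cache others. The class name IntegerCacheSketch, the helper sharesInstance and the sample value 100000 are hypothetical, chosen for illustration; nothing here is taken from the selector sources.

```java
public class IntegerCacheSketch {
    // Hypothetical helper: true when two independent valueOf() calls
    // return the very same Integer object (i.e. no new allocation).
    static boolean sharesInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Small values, such as typical Unix file descriptors, fall in the
        // range that Integer.valueOf() is documented to always cache.
        System.out.println("valueOf(5) shared:      " + sharesInstance(5));

        // Larger values, such as a Windows SOCKET handle might be, are outside
        // the guaranteed range; whether they are shared depends on the VM and
        // its settings, so no particular result is assumed here.
        System.out.println("valueOf(100000) shared: " + sharesInstance(100000));
    }
}
```

This is why the benefit would depend on the values being boxed: descriptors inside the cached range would avoid an allocation, while handles outside it would most likely still allocate.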
From forax at univ-mlv.fr Thu Jan 24 15:41:52 2008 From: forax at univ-mlv.fr (Rémi Forax) Date: Thu, 24 Jan 2008 16:41:52 +0100 Subject: Selector cleanup In-Reply-To: <47988E8E.3010403@sun.com> References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com> Message-ID: <4798B1C0.9020907@univ-mlv.fr> Alan Bateman wrote: > Rémi Forax wrote: >> Hi all, I am currently developing a small web server, and I think the code >> related >> to selectors can be improved just by changing some small pieces of code. >> To be crystal clear, I don't want to re-implement all the selector- >> related stuff, >> just patch some parts of the current code. >> >> There are some allocations in the JDK API that can be removed; >> the code was badly retrofitted to 1.5 and a lot of fields can be declared >> final. >> Some methods/fields still use raw types and don't take >> advantage of autoboxing. > You're right. Much of the code here dates back to 1.4 and we haven't > gone back to clean-up things like this. > >> >> Furthermore, there is some divergence between the Windows and *nix >> code that I don't understand. >> For example, WindowsSelectorImpl and PollSelectorImpl both use a pipe to >> implement wakeup, but WindowsSelectorImpl relies on Pipe >> and PollSelectorImpl on IOUtil.initPipe(). >> I think this code should be the same. > Ideally we would use a socketpair for the wakeup mechanism but Windows > doesn't support it. For this reason, Pipe is implemented as a loopback > connection and this works okay for the wakeup mechanism too. One thing > to mention is that PollSelectorImpl is only used now when running on > the Linux 2.4 kernel (it's not used with the 2.6 kernel and isn't used > on Solaris). I just mention this as someday it might become obsolete > and we can remove it. OK. > >> >> in WindowsSelectorImpl: >> - updateSelectedKeys() uses an iterator to traverse the array >> (an ArrayList). It should use an indexed loop instead >> to avoid Iterator allocation.
>> - field threads should be declared as an ArrayList >> because adjustThreadsCount() supposes that it can be iterated >> using an indexed loop. >> Furthermore, it can be generified like this: >> private final ArrayList threads = new ArrayList(); > These clean-ups seem reasonable. > >> - class FdMap, >> I don't see why FdMap needs to be a class; all methods can be moved >> as member methods of WindowsSelectorImpl without problems. >> Furthermore, the constructor of FdMap is private (get/put/remove too) >> so the compiler stupidly inserts accessor methods (access$000 etc.). >> OK, the main point here: when the code was retrofitted to 1.5, >> new Integer() was not changed to Integer.valueOf(), >> which would share small integers and avoid allocation when file descriptor >> values are small. > These are SOCKET types rather than file descriptors and unlikely to be > in the range that Integer caches (actually it should be a Long but > that is a story for another day). OK, no valueOf(); I'm not an expert in the Windows API. But do you agree that class FdMap is not necessary? > >> - In class MapEntry, ski should be declared final. >> - close(): setting selectedKeys to null doesn't allow the Set to be >> collected >> because publicSelectedKeys contains a reference to it. >> >> in PollSelectorImpl: >> - interruptLock should be final. >> - close(), see WindowsSelectorImpl >> >> in EPollSelectorImpl: >> - like in poll, interruptLock should be final. >> - HashMap fdToKey should be generified and final. >> - close(), see WindowsSelectorImpl >> - implRegister/implDereg >> - They should use Integer.valueOf() instead of new Integer(). >> - IOUtil.fdVal() is used spuriously, in implRegister but not >> in implDereg. > These are integers so there could be some benefit (but probably very > hard to measure). Yes, very hard to measure until you spawn 1k threads with one selector each. BTW, if you take a look at EPollArrayWrapper, idleSet already uses boxing.
> >> >> - EPollArrayWrapper >> - updateList is a LinkedList, a doubly linked list that stores >> Updator objects; >> I think it's more efficient to add a field next in the Updator >> object and >> link the updators by hand in order to avoid creating LinkedList$Entry. > Maybe but probably very hard to measure. > >> - Updator.opcode and Updator.fd should be declared final. >> - SelectorImpl: >> key and selectedKeys should be LinkedHashSet instead of Set >> because they are frequently iterated. >> Let's discuss this before I submit patches. > The clean-ups you suggest seem reasonable so I would suggest going > ahead and sending a patch. I will do that. > I'm happy to review and work with you to get the clean-ups integrated > (once OpenJDK/jdk7 re-opens for changes of course). Do you have any idea when OpenJDK will be reopened? > > -Alan. > > PS: I don't know anything about your "small web server" but the simple > server in com.sun.net.httpserver may be useful. My small server is a research project that embeds a non-blocking parser in a web server and claims to have the same performance as Grizzly. I will post a blog entry about it when all the benchmarks are finished. Rémi From Alan.Bateman at Sun.COM Thu Jan 24 16:50:07 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Thu, 24 Jan 2008 16:50:07 +0000 Subject: Selector cleanup In-Reply-To: <4798B1C0.9020907@univ-mlv.fr> References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com> <4798B1C0.9020907@univ-mlv.fr> Message-ID: <4798C1BF.8010205@sun.com> Rémi Forax wrote: > OK, no valueOf(); I'm not an expert in the Windows API. > But do you agree that class FdMap is not necessary? I agree and I assume you will replace it with an embedded Map. I suspect it will be hard to see a difference (with the server VM anyway). > Yes, very hard to measure until you spawn 1k threads with one selector each.
The typical NIO server tends to handle lots of concurrent connections with a relatively small number of threads (one per core for example) and a small number of Selectors. It sounds like your server might be different. Selector creation is relatively expensive so you might run into issues there. > BTW, if you take a look at EPollArrayWrapper, idleSet already uses boxing. The idle set is almost always empty and aside from one case, there shouldn't be any boxing when the set is empty. > Do you have any idea when OpenJDK will be reopened? Mark and others are working hard to make this happen very soon. As I understand it they have some infrastructure work to complete before they can allow changesets to be pushed. > My small server is a research project that embeds a non-blocking > parser in a web server and > claims to have the same performance as Grizzly. I will post a blog > entry about it > when all the benchmarks are finished. I look forward to it. -Alan. From mark at klomp.org Fri Jan 25 13:14:32 2008 From: mark at klomp.org (Mark Wielaard) Date: Fri, 25 Jan 2008 13:14:32 +0000 (UTC) Subject: Null-terminated Unicode strings in java.io on Windows References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> Message-ID: Krzysztof Żelechowski writes: > If the specification gets fixed so that GSC result MUST be z-term, > your VM will cease being conformant > so it will be fixed and no additional buffers will be needed. Eh, that doesn't seem right at all. The specification currently doesn't guarantee that the result is a jchar array that is zero terminated. So you can expect current runtimes not to do this. As Roman said at least JamaicaVM doesn't do this.
I just checked the implementations gcj and jamvm, they both also don't make any such guarantee (cacao does seem to add an extra 0 at the end of the result it returns though). So "clarifying the spec" would break a lot of code of currently conforming implementations. The code relying on this behavior seems to be just buggy and should be fixed imho. Cheers, Mark From forax at univ-mlv.fr Fri Jan 25 13:20:29 2008 From: forax at univ-mlv.fr (Rémi Forax) Date: Fri, 25 Jan 2008 14:20:29 +0100 Subject: Selector cleanup In-Reply-To: <4798C1BF.8010205@sun.com> References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com> <4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com> Message-ID: <4799E21D.6050908@univ-mlv.fr> Alan Bateman wrote: > Rémi Forax wrote: >> OK, no valueOf(); I'm not an expert in the Windows API. >> But do you agree that class FdMap is not necessary? > I agree and I assume you will replace it with an embedded Map. I > suspect it will be hard to see a difference (with the server VM anyway). I'm pretty sure there will be no perf difference, but it will use less memory. > >> Yes, very hard to measure until you spawn 1k threads with one selector >> each. > The typical NIO server tends to handle lots of concurrent connections > with a relatively small number of threads (one per core for > example) and a small number of Selectors. It sounds like your server > might be different. Selector creation is relatively expensive so you > might run into issues there. Selectors are pre-created during startup, so no problem. We have observed that a selector doesn't work well with a lot of keys. That's why I use more threads than one per core. I have found someone else saying the same thing: see http://blogs.sun.com/oleksiys/entry/multiple_selector_read_threads_in > >> BTW, if you take a look at EPollArrayWrapper, idleSet already uses boxing.
> The idle set is almost always empty and aside from one case, there > shouldn't be any boxing when the set is empty. I don't agree; reading the code, the idle set is used when setInterestOps(0) is called. I'm not sure that case is infrequent. For example, you can find this code in Grizzly:
// disable OP_READ on key before doing anything else
key.interestOps(key.interestOps() & (~SelectionKey.OP_READ));
see http://weblogs.java.net/blog/jfarcand/archive/2006/06/tricks_and_tips.html > >> Do you have any idea when OpenJDK will be reopened? > Mark and others are working hard to make this happen very soon. As I > understand it they have some infrastructure work to complete before > they can allow changesets to be pushed. > > >> My small server is a research project that embeds a non-blocking >> parser in a web server and >> claims to have the same performance as Grizzly. I will post a blog >> entry about it >> when all the benchmarks are finished. > I look forward to it. > > -Alan. Rémi From program.spe at home.pl Fri Jan 25 13:28:10 2008 From: program.spe at home.pl (Krzysztof Żelechowski) Date: Fri, 25 Jan 2008 14:28:10 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri at 13:14 +0000, Mark Wielaard writes: > Krzysztof Żelechowski writes: > > If the specification gets fixed so that GSC result MUST be z-term, > > your VM will cease being conformant > > so it will be fixed and no additional buffers will be needed. > > Eh, that doesn't seem right at all. > The specification currently doesn't guarantee that the result is a jchar array > that is zero terminated. So you can expect current runtimes not to do this.
As > Roman said at least JamaicaVM doesn't do this. I just checked the > implementations gcj and jamvm, they both also don't make any such guarantee > (cacao does seem to add an extra 0 at the end of the result it returns though). > So "clarifying the spec" would break a lot of code of currently conforming > implementations. The code relying on this behavior seems to be just buggy and > should be fixed imho. The specification is buggy in that it does not take into account the operating system interface and makes correct memory management inefficient for the benefit of sparing one byte per buffer where an OS call is not needed. Ridiculous. The developers at Sun found the correct way of interpreting the specification; the other ones followed it blindfolded. It is now time to repent. Cheers, Chris From Alan.Bateman at Sun.COM Fri Jan 25 13:59:25 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Fri, 25 Jan 2008 13:59:25 +0000 Subject: Selector cleanup In-Reply-To: <4799E21D.6050908@univ-mlv.fr> References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com> <4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com> <4799E21D.6050908@univ-mlv.fr> Message-ID: <4799EB3D.5070807@sun.com> Rémi Forax wrote: > : > We have observed that a selector doesn't work well with a lot of keys. Is this just Windows? I ask because the Selector implementations on Solaris and Linux scale very well and there are many people using it on servers that are handling thousands of concurrent connections. >> The idle set is almost always empty and aside from one case, there >> shouldn't be any boxing when the set is empty. > I don't agree; reading the code, the idle set is used when setInterestOps(0) > is called. > I'm not sure that case is infrequent.
> For example, you can find this code in Grizzly: > > // disable OP_READ on key before doing anything else > key.interestOps(key.interestOps() & (~SelectionKey.OP_READ)); > > see > http://weblogs.java.net/blog/jfarcand/archive/2006/06/tricks_and_tips.html > I've only observed it on a few occasions. As it happens that fragment of Grizzly code is what led us to add the idle set as I missed this case in the original implementation (see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933 for details). -Alan. From mark at klomp.org Fri Jan 25 14:40:18 2008 From: mark at klomp.org (Mark Wielaard) Date: Fri, 25 Jan 2008 14:40:18 +0000 (UTC) Subject: Null-terminated Unicode strings in java.io on Windows References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> Message-ID: Hi Roman, Roman Kennke writes: > Hmm, I'm not talking about fixing the spec (I've read that bug report > while searching for clarification on the spec actually). When the spec > doesn't tell _that_ the returned array is zero terminated, I think we > should assume that it isn't (and it seems to be the trend that the spec > should be clarified by saying that an implementation isn't required to > return a zero-terminated array, but this is only speculation). What I'm > asking is, should we fix the java.io C code to deal with > non-zero-terminated jchar arrays? Unfortunately, this probably means > allocating additional buffers, because we really need zero terminated > strings here (AFAICS). If you rewrite WITH_UNICODE_STRING to not use the runtime to allocate and deallocate the jchar array through GetStringChars and ReleaseStringChars but allocate and deallocate the jchar array yourself using GetStringLength (+ 1) and then fill it through GetStringRegion() it looks like you don't really need to allocate any additional buffers.
Cheers, Mark From Alan.Bateman at Sun.COM Fri Jan 25 15:05:13 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Fri, 25 Jan 2008 15:05:13 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1200952864.6264.53.camel@mercury> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> Message-ID: <4799FAA9.70804@sun.com> Roman Kennke wrote: > Hi Alan, > > On Monday, 21.01.2008, at 21:52 +0000, Alan Bateman wrote: > >> : >> The GetStringChars implementation in HotSpot always returns a copy that >> is length+1 and zero terminated. There is a long-standing bug to clarify >> the JNI specification on this topic. I believe it should say that the >> returned array of Unicode characters is not required to be zero >> terminated and that one should use GetStringLength to determine the >> length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI >> spec and may wish to comment. In any case, I did a quick cscope and >> aside from java.io, it only appears to impact a small number of places. >> > > So this is indeed a bug, right? Do you think it makes sense to go out > and fix it? > > This is one of those issues that has gone unnoticed for years because we don't test with other VMs and also the Windows code isn't used when porting to other platforms. So I'd suggest just doing it. Mark Wielaard's mail provides a good suggestion. You'll probably want to check other areas of the code too (src/windows/native/java/lang/ProcessImpl_md.c for example) for other cases. -Alan. From rob.lougher at gmail.com Fri Jan 25 17:08:39 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 17:08:39 +0000 Subject: Null-terminated Unicode strings in java.io on Windows Message-ID: Hi, Apologies if you receive this twice. I sent it via nabble and it's now stuck awaiting moderation so I've subscribed.
On 25-01-2008, Fri at 13:14 +0000, Mark Wielaard writes: > Krzysztof Żelechowski writes: > > If the specification gets fixed so that GSC result MUST be z-term, > > your VM will cease being conformant > > so it will be fixed and no additional buffers will be needed. > > Eh, that doesn't seem right at all. > The specification currently doesn't guarantee that the result is a jchar array > that is zero terminated. So you can expect current runtimes not to do this. As > Roman said at least JamaicaVM doesn't do this. I just checked the > implementations gcj and jamvm, they both also don't make any such guarantee > (cacao does seem to add an extra 0 at the end of the result it returns though). > So "clarifying the spec" would break a lot of code of currently conforming > implementations. The code relying on this behavior seems to be just buggy and > should be fixed imho. The specification is buggy in that it does not take into account the operating system interface and makes correct memory management inefficient for the benefit of sparing one byte per buffer where an OS call is not needed. Ridiculous. The developers at Sun found the correct way of interpreting the specification; the other ones followed it blindfolded. It is now time to repent. Wrong! Requiring null termination will make things more inefficient. This is because Strings within Java are not null-terminated. So to add the null the VM will have to copy the String chars into a new buffer. A more efficient approach is to simply return a pointer to the String chars themselves. However, this will not be null-terminated. The JNI specification allows a VM to either copy the chars or return a direct pointer. The extra isCopy parameter can be used to find out what it did. The point is, if the programmer doesn't need a null-terminated string, not copying is _much_ more efficient. The programmer can always copy and add the null if they need to.
But forcing the VM to null-terminate will require a copy and slow it down in all cases. If I was updating the spec, I would change it so that if a copy is returned it is always null terminated. If it isn't a copy then it may or may not be. It's likely no VMs will need changing, as I suspect the ones that do not null-terminate are returning direct pointers (e.g. JamVM). And I doubt Sun makes a copy because of the null. Giving out direct heap pointers causes problems for VMs that move objects within the heap (e.g. a compacting GC). Either you've got to "pin" the object so it can't move or you always copy. Sun probably chose the latter. In JamVM, I decided to pin the String (it's unpinned in ReleaseStringChars). Rob. P.S. I hope your blindfold has been removed :) When implementing a VM few things are as straight-forward as they may seem. From roman at kennke.org Fri Jan 25 17:17:03 2008 From: roman at kennke.org (Roman Kennke) Date: Fri, 25 Jan 2008 18:17:03 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: Message-ID: <1201281423.6277.86.camel@mercury> Hi, > The specification is buggy > in that it does not take into account the operating system interface > and makes correct memory management inefficient > for the benefit of sparing one byte per buffer > where an OS call is not needed. > Ridiculous. > The developers at Sun > found the correct way of interpreting the specification; > the other ones followed it blindfolded. It is now time to repent. > > > Wrong! Requiring null termination will make things more inefficient. > This is because Strings within Java are not null-terminated. Unless the VM stores all strings with 0-termination internally, which is possible, but arguably more inefficient on another level. > If I was updating the spec, I would change it so that if a copy is > returned it is always null terminated. If it isn't a copy then it may > or may not be.
It's likely no VMs will need changing, as I suspect > the ones that do not null-terminate are returning direct pointers > (e.g. JamVM). Maybe we should all go to the original old bug (gosh! from 2001!) and make some noise? /Roman -- http://kennke.org/blog/ From roman at kennke.org Fri Jan 25 17:19:51 2008 From: roman at kennke.org (Roman Kennke) Date: Fri, 25 Jan 2008 18:19:51 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201281591.6277.88.camel@mercury> Hi, > The specification is buggy > in that it does not take into account the operating system interface > and makes correct memory management inefficient > for the benefit of sparing one byte per buffer > where an OS call is not needed. > Ridiculous. Tom Tromey pointed out another possible problem on IRC: What if the string itself contains the 0? Unlikely, but possible in the Java world. Cheers, Roman -- http://kennke.org/blog/ From program.spe at home.pl Fri Jan 25 17:23:14 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Fri, 25 Jan 2008 18:23:14 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: Message-ID: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri, at 17:08 +0000, Robert Lougher wrote: > Hi, Hi-aye. > > Apologies if you receive this twice. I sent it via nabble and it's > now stuck awaiting moderation so I've subscribed.
> > > > On 25-01-2008, Fri, at 13:14 +0000, Mark Wielaard wrote: > > Krzysztof Żelechowski writes: > > > If the specification gets fixed so that GSC result MUST be z-term, > > > your VM will cease being conformant > > > so it will be fixed and no additional buffers will be needed. > > > > Eh, that doesn't seem right at all. > > The specification currently doesn't guarantee that the result is a jchar array > > that is zero terminated. So you can expect current runtimes not to do this. As > > Roman said at least JamaicaVM doesn't do this. I just checked the > > implementations gcj and jamvm, they both also don't make any such guarantee > > (cacao does seem to add an extra 0 at the end of the result it returns though). > > So "clarifying the spec" would break a lot of code of currently conforming > > implementations. The code relying on this behavior seems to be just buggy and > > should be fixed imho. > > The specification is buggy > in that it does not take into account the operating system interface > and makes correct memory management inefficient > for the benefit of sparing one byte per buffer > where an OS call is not needed. > Ridiculous. > The developers at Sun > found the correct way of interpreting the specification; > the other ones followed it blindfolded. It is now time to repent. > > > Wrong! Requiring null termination will make things more inefficient. > This is because Strings within Java are not null-terminated. They are not z-term in the sense that they may contain zero inside, but nothing more. The implementation is free to affix zero to each and every string buffer and make that zero unavailable to Java as required by the specification. It is an easy thing to do because strings are immutable. > So to > add the null the VM will have to copy the String chars into a new > buffer. > > A more efficient approach is to simply return a pointer to the String > chars themselves. However, this will not be null-terminated.
It depends on the implementation, as described above. > > The JNI specification allows a VM to either copy the chars or return a > direct pointer. The extra isCopy parameter can be used to find out > what it did. > > The point is, if the programmer doesn't need a null-terminated string, > not copying is _much_ more efficient. The programmer can always copy > and add the null if they need to. But forcing the VM to > null-terminate will require a copy and slow it down in all cases. No, it will not, because all string buffers will have an inaccessible zero at the end. > > If I was updating the spec, I would change it so that if a copy is > returned it is always null terminated. If it isn't a copy then it may > or may not be. It's likely no VMs will need changing, as I suspect > the ones that do not null-terminate are returning direct pointers > (e.g. JamVM). If I was updating the spec, I would say that strings are required to be inaccessibly z-term as above if the underlying OS expects them to be in most cases. > > And I doubt Sun makes a copy because of the null. So do I, they apparently need not. > > Rob. > > P.S. I hope your blindfold has been removed :) When implementing a VM > few things are as straight-forward as they may seem. So do I (that my blindfold has been removed).
Chris From program.spe at home.pl Fri Jan 25 17:26:49 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Fri, 25 Jan 2008 18:26:49 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201281591.6277.88.camel@mercury> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> <1201281591.6277.88.camel@mercury> Message-ID: <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri, at 18:19 +0100, Roman Kennke wrote: > Hi, > > > The specification is buggy > > in that it does not take into account the operating system interface > > and makes correct memory management inefficient > > for the benefit of sparing one byte per buffer > > where an OS call is not needed. > > Ridiculous. > > Tom Tromey pointed out another possible problem on IRC: What if the > string itself contains the 0? Unlikely, but possible in the Java world. I understand that parameters passed to the OS are subject to the limitations of the OS. Not containing a zero inside may be just one of them. The Java specification claims nowhere that every string can be used to name every object. Chris From rob.lougher at gmail.com Fri Jan 25 17:30:07 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 17:30:07 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> Message-ID: On 1/25/08, Krzysztof Żelechowski wrote: > > On 25-01-2008, Fri, at 17:08 +0000, Robert Lougher wrote: > > Hi, > > Hi-aye. > > > > > Apologies if you receive this twice. I sent it via nabble and it's > > now stuck awaiting moderation so I've subscribed.
> > > > > > > > On 25-01-2008, Fri, at 13:14 +0000, Mark Wielaard wrote: > > > Krzysztof Żelechowski writes: > > > > If the specification gets fixed so that GSC result MUST be z-term, > > > > your VM will cease being conformant > > > > so it will be fixed and no additional buffers will be needed. > > > > > > Eh, that doesn't seem right at all. > > > The specification currently doesn't guarantee that the result is a jchar array > > > that is zero terminated. So you can expect current runtimes not to do this. As > > > Roman said at least JamaicaVM doesn't do this. I just checked the > > > implementations gcj and jamvm, they both also don't make any such guarantee > > > (cacao does seem to add an extra 0 at the end of the result it returns though). > > > So "clarifying the spec" would break a lot of code of currently conforming > > > implementations. The code relying on this behavior seems to be just buggy and > > > should be fixed imho. > > > > The specification is buggy > > in that it does not take into account the operating system interface > > and makes correct memory management inefficient > > for the benefit of sparing one byte per buffer > > where an OS call is not needed. > > Ridiculous. > > The developers at Sun > > found the correct way of interpreting the specification; > > the other ones followed it blindfolded. It is now time to repent. > > > > > > Wrong! Requiring null termination will make things more inefficient. > > This is because Strings within Java are not null-terminated. > > They are not z-term in the sense that they may contain zero inside, > but nothing more. > The implementation is free > to affix zero to each and every string buffer > and make that zero unavailable to Java as required by the specification. > It is an easy thing to do because strings are immutable. > > > So to > > add the null the VM will have to copy the String chars into a new > > buffer.
> > > > A more efficient approach is to simply return a pointer to the String > > chars themselves. However, this will not be null-terminated. > > It depends on the implementation, as described above. No it doesn't. An implementation would have to be truly stupid to internally null-terminate. How many Strings are in the heap? How many will the programmer access via GetStringChars? The null will be an overhead on all Strings for the benefit of a minuscule percentage. From program.spe at home.pl Fri Jan 25 17:42:23 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Fri, 25 Jan 2008 18:42:23 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri, at 17:30 +0000, Robert Lougher wrote: > No it doesn't. An implementation would have to be truly stupid to > internally null-terminate. How many Strings are in the heap? How > many will the programmer access via GetStringChars? The null will be > an overhead on all Strings for the benefit of a minuscule percentage. Please observe: 1. The amount of memory needed to manage the allocation is greater than the number of bytes needed to store one additional character, so the relative impact on memory usage will not be dramatic. 2. The string usually has many more characters than one. That means, if strings take 10 characters on the average, the overhead is 10%, in the impossible worst case, as explained below. This is an overhead I (and most programmers) can live with. 3. Memory is allocated in chunks. The size and alignment of the chunk is subject to various limitations. If the characters of the string do not fill the chunk entirely, there is a good chance that there will be space for the terminating zero anyway.
Yours truly, Chris From roman at kennke.org Fri Jan 25 17:44:54 2008 From: roman at kennke.org (Roman Kennke) Date: Fri, 25 Jan 2008 18:44:54 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> <1201281591.6277.88.camel@mercury> <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201283094.9468.4.camel@mercury> Heyo, > > The specification is buggy > > > in that it does not take into account the operating system interface > > > and makes correct memory management inefficient > > > for the benefit of sparing one byte per buffer > > > where an OS call is not needed. > > > Ridiculous. > > > > Tom Tromey pointed out another possible problem on IRC: What if the > > string itself contains the 0? Unlikely, but possible in the Java world. > > I understand that parameters passed to the OS > are subject to the limitations of the OS. > Not containing a zero inside may be just one of them. > The Java specification claims nowhere > that every string can be used to name every object. Yeah, but GetStringChars() is a general-purpose JNI function and not at all tied to the OS. Passing the string on to the OS for I/O purposes is just one use case. Zero-terminating a Java string really doesn't seem right. If you need it zero-terminated, then you can always do this in your code by copying the string over into a static buffer or similar (as suggested somewhere else in this thread). This is by no means incorrect memory management; it only requires a little more thinking.
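A minimal sketch, in plain C, of the copy-and-terminate approach described in the message above. It assumes the caller already holds a counted buffer of UTF-16 code units; in real JNI code that buffer would come from GetStringChars together with GetStringLength and be handed back with ReleaseStringChars. The names utf16_unit, copy_zero_terminated and has_embedded_zero are hypothetical, uint16_t stands in for jchar so that the sketch compiles without jni.h, and a heap allocation is used in place of the static buffer mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-in for JNI's jchar (an unsigned 16-bit UTF-16 code
 * unit), so that the sketch does not depend on jni.h. */
typedef uint16_t utf16_unit;

/* Hypothetical helper: returns a freshly allocated, zero-terminated copy of
 * the first `len` code units of `chars`, or NULL if the size would overflow
 * or the allocation fails. The caller owns the result and must free() it. */
utf16_unit *copy_zero_terminated(const utf16_unit *chars, size_t len)
{
    utf16_unit *copy;

    assert(chars != NULL || len == 0);
    if (len > SIZE_MAX / sizeof(*copy) - 1)
        return NULL;                      /* (len + 1) units would overflow */
    copy = malloc((len + 1) * sizeof(*copy));
    if (copy == NULL)
        return NULL;
    if (len > 0)
        memcpy(copy, chars, len * sizeof(*copy));
    copy[len] = 0;                        /* terminator for the OS call */
    return copy;
}

/* Hypothetical helper: returns 1 if the counted string contains a zero code
 * unit, in which case a zero-terminated consumer would see it truncated. */
int has_embedded_zero(const utf16_unit *chars, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (chars[i] == 0)
            return 1;
    return 0;
}
```

The second helper covers the embedded-zero case raised earlier in the thread: a caller may prefer to reject such a string before passing the terminated copy to the OS.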
/Roman -- http://kennke.org/blog/ From roman at kennke.org Fri Jan 25 17:54:01 2008 From: roman at kennke.org (Roman Kennke) Date: Fri, 25 Jan 2008 18:54:01 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201283641.9468.11.camel@mercury> Hi, > Please observe: > > 1. > The amount of memory needed to manage the allocation > is greater than the number of bytes > needed to store one additional character, > so the relative impact on memory usage will not be dramatic. This is just ridiculous. An average Java app has tons of Strings in it, most of which are _not_ used in GetStringChars. Allocating one additional jchar for each String surely _does_ have an impact. Especially on embedded systems (this is what I work on). > 2. The string usually has many more characters than one. > That means, if strings take 10 characters on the average, > the overhead is 10%, in the impossible worst case, as explained below. > This is an overhead I (and most programmers) can live with. Yeah, but not in the embedded/mobile world. > 3. Memory is allocated in chunks. > The size and alignment of the chunk is subject to various limitations. > If the characters of the string do not fill the chunk entirely, > there is a good chance > that there will be space for the terminating zero anyway. Yeah, and if not? I can only speak for Jamaica, where memory is allocated in chunks of 32 bytes, or 16 chars. There's a 1 in 16 (actually, 2 in 16 because of some internal stuff) chance that there's no trailing space for the zero, so should we allocate another 32 bytes, only to get this zero termination for a JNI method that's only rarely used? So much for the no-impact statement above...
Cheers, Roman -- http://kennke.org/blog/ From rob.lougher at gmail.com Fri Jan 25 17:55:34 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 17:55:34 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> Message-ID: Hi Chris, On 1/25/08, Krzysztof Żelechowski wrote: > > On 25-01-2008, Fri, at 17:30 +0000, Robert Lougher wrote: > > No it doesn't. An implementation would have to be truly stupid to > > internally null-terminate. How many Strings are in the heap? How > > many will the programmer access via GetStringChars? The null will be > > an overhead on all Strings for the benefit of a minuscule percentage. > > Please observe: > > 1. > The amount of memory needed to manage the allocation > is greater than the number of bytes > needed to store one additional character, > so the relative impact on memory usage will not be dramatic. > > 2. The string usually has many more characters than one. > That means, if strings take 10 characters on the average, > the overhead is 10%, in the impossible worst case, as explained below. > This is an overhead I (and most programmers) can live with. > > 3. Memory is allocated in chunks. > The size and alignment of the chunk is subject to various limitations. > If the characters of the string do not fill the chunk entirely, > there is a good chance > that there will be space for the terminating zero anyway. > Yes, you're absolutely right. However, consider, for the sake of argument, a memory manager that aligns on 4-byte boundaries. Consider we have 4 strings. The first is 1 byte long, the second 2 bytes and so on. The first three strings will absorb the null due to alignment. The fourth, however, will require an extra 4 bytes because of the same alignment. So we have a 4-byte overhead for 4 strings, or 1 byte per string. Rob.
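The alignment arithmetic in the message above can be checked with a small C sketch. It assumes a deliberately simplified allocator that only rounds each payload up to a multiple of `align` bytes and keeps no per-object header; real VM allocators differ. The helper names round_up, can_terminate_in_place and total_terminator_overhead are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds `n` up to the next multiple of `align` (align must be > 0). */
size_t round_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Returns 1 if a payload of `payload` bytes, stored in an allocation that is
 * rounded up to `align` bytes, leaves at least `term` spare bytes for a
 * terminator; returns 0 if the terminator would need extra memory. */
int can_terminate_in_place(size_t payload, size_t align, size_t term)
{
    return round_up(payload, align) - payload >= term;
}

/* Extra bytes needed in total if one string of each length 1..max_len were
 * stored with a terminator of `term` bytes instead of without one. */
size_t total_terminator_overhead(size_t max_len, size_t align, size_t term)
{
    size_t extra = 0;
    size_t len;

    for (len = 1; len <= max_len; len++)
        extra += round_up(len + term, align) - round_up(len, align);
    return extra;
}
```

With 4-byte alignment and a 1-byte terminator, total_terminator_overhead(4, 4, 1) evaluates to 4, matching the 4 bytes for 4 strings worked out above. can_terminate_in_place(32, 32, 2) models 16 two-byte chars that exactly fill a 32-byte chunk, the chunk size cited for Jamaica earlier in the thread, and returns 0 because no slack is left.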
byte, 2 bytes, 3 bytes and 4 bytes. > Yours truly, > Chris > > From program.spe at home.pl Fri Jan 25 18:14:28 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Fri, 25 Jan 2008 19:14:28 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201283641.9468.11.camel@mercury> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> Message-ID: <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri, at 18:54 +0100, Roman Kennke wrote: > Hi, > > > Please observe: > > > > 1. > > The amount of memory needed to manage the allocation > > is greater than the number of bytes > > needed to store one additional character, > > so the relative impact on memory usage will not be dramatic. > > This is just ridiculous. An average Java app has tons of Strings in > it, most of which are _not_ used in GetStringChars. Allocating one > additional jchar for each String surely _does_ have an impact. Especially on > embedded systems (this is what I work on). I never said there will be no impact. Aside: wouldn't it be cheaper if the device worked without Java on it? (another ridiculous question, I am afraid) > > > 2. The string usually has many more characters than one. > > That means, if strings take 10 characters on the average, > > the overhead is 10%, in the impossible worst case, as explained below. > > This is an overhead I (and most programmers) can live with. > > Yeah, but not in the embedded/mobile world. Well, in that case it seems a fork is needed. The desktop code can assume that string buffers are z-term. The mobile code has to copy. > > > 3. Memory is allocated in chunks. > > The size and alignment of the chunk is subject to various limitations. > > If the characters of the string do not fill the chunk entirely, > > there is a good chance > > that there will be space for the terminating zero anyway.
> > Yeah, and if not? I can only speak for Jamaica, where memory is > allocated in chunks of 32 bytes, or 16 chars. There's a 1 in 16 > (actually, 2 in 16 because of some internal stuff) chance that > there's no trailing space for the zero, so should we allocate another > 32 bytes, only to get this zero termination for a JNI method that's only > rarely used? So much for the no-impact statement above... So the accumulated memory cost is linear in the number of strings? Good point, statement 3 is invalid. Chris From program.spe at home.pl Fri Jan 25 18:18:34 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Fri, 25 Jan 2008 19:18:34 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201283094.9468.4.camel@mercury> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> <1201281591.6277.88.camel@mercury> <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl> <1201283094.9468.4.camel@mercury> Message-ID: <1201285114.6482.46.camel@a1dmin.vola.spe.com.pl> On 25-01-2008, Fri, at 18:44 +0100, Roman Kennke wrote: > Heyo, > > > > The specification is buggy > > > > in that it does not take into account the operating system interface > > > > and makes correct memory management inefficient > > > > for the benefit of sparing one byte per buffer > > > > where an OS call is not needed. > > > > Ridiculous. > > > > > > Tom Tromey pointed out another possible problem on IRC: What if the > > > string itself contains the 0? Unlikely, but possible in the Java world. > > > > I understand that parameters passed to the OS > > are subject to the limitations of the OS. > > Not containing a zero inside may be just one of them.
> > The Java specification claims nowhere > > that every string can be used to name every object. > > Yeah, but GetStringChars() is a general-purpose JNI function and not at > all tied to the OS. Passing the string on to the OS for I/O purposes is > just one use case. Zero-terminating a Java string really doesn't seem right. > If you need it zero-terminated, then you can always do this in your code > by copying the string over into a static buffer or similar (as suggested > somewhere else in this thread). This is by no means incorrect memory > management; it only requires a little more thinking. > Static buffers are not reentrant and are unwieldy: they are either too large or too small. It has been argued that excessive copying is inefficient and can be easily avoided with proper setup. Chris From rob.lougher at gmail.com Fri Jan 25 19:01:22 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 19:01:22 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> Message-ID: Hi, This is getting a bit hostile for no reason.... Thinking about alignment gives an interesting solution. 1) Strings are not null-terminated 2) For most strings the alignment gives the VM room to terminate in place when GetStringChars is called 3) Copy strings that can't be terminated in place. On average, you'll need to copy 1/ strings. Rob.
From linuxhippy at gmail.com Fri Jan 25 19:29:44 2008 From: linuxhippy at gmail.com (Clemens Eisserer) Date: Fri, 25 Jan 2008 20:29:44 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> Message-ID: <194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com> Hi there, > This is getting a bit hostile for no reason.... Thinking about > alignment gives an interesting solution. > > 1) Strings are not null-terminated > 2) For most strings the alignment gives the VM room to terminate in > place when GetStringChars is called > 3) Copy strings that can't be terminated in place. However GetStringChars() as far as I know always returns a copy because hotspot does not support pinning (or at least I think so) - at least for the moving GCs. So if one byte more is allocated or not on the JNI side should not make much difference even if its never needed. lg Clemens From rob.lougher at gmail.com Fri Jan 25 16:45:51 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 08:45:51 -0800 (PST) Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com> <1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com> <1200956262.6264.65.camel@mercury> <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl> <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl> Message-ID: <15091812.post@talk.nabble.com> Hi, Krzysztof ?elechowski-2 wrote: > > > Dnia 25-01-2008, Pt o godzinie 13:14 +0000, Mark Wielaard pisze: >> Krzysztof ?elechowski writes: >> > If the specification gets fixed so that GSC result MUST be z-term, >> > your VM will cease being conformant >> > so it will be fixed and no additional buffers will be needed. 
>> >> Eh, that doesn't seem right at all. >> The specification currently doesn't guarantee that the result is a jchar >> array >> that is zero terminated. So you can expect current runtimes not to do >> this. As >> Roman said at least JamaicaVM doesn't do this. I just checked the >> implementations gcj and jamvm, they both also don't make any such >> guarantee >> (cacao does seem to add an extra 0 at the end of the result it returns >> though). >> So "clarifying the spec" would break a lot of code of currently >> conforming >> implementations. The code relying on this behavior seems to be just buggy >> and >> should be fixed imho. > > The specification is buggy > in that it does not take into account the operating system interface > and makes correct memory management inefficient > for the benefit of sparing one byte per buffer > where an OS call is not needed. > Ridiculous. > The developers at Sun > found the correct way to interpreting the specification; > the other ones followed it blindfolded. It is now time to repent. > Wrong! Requiring null termimation will make things more inefficient. This is because Strings within Java are not null-terminated. So to add the null the VM will have to copy the String chars into a new buffer. A more efficient approach is to simply return a pointer to the String chars themselves. However, this will not be null-terminated. The JNI specification allows a VM to either copy the chars or return a direct pointer. The extra isCopy parameter can be used to find out what it did. The point is, if the programmer doesn't need a null-terminated string, not copying is _much_ more efficient. The programmer can always copy and add the null if they need to. But forcing the VM to null-terminate will require a copy and slow it down it all cases. If I was updating the spec, I would change it so that if a copy is returned it is always null terminated. If it isn't a copy then it may or may not be. 
It's likely no VMs will need changing, as I suspect the ones that do not null-terminate are returning direct pointers (e.g. JamVM). And I doubt Sun makes a copy because of the null. Giving out direct heap pointers causes problems for VMs that move objects within the heap (e.g. a compacting GC). Either you've got to "pin" the object so it can't move or you always copy. Sun probably chose the latter. In JamVM, I decided to pin the String (it's unpinned in ReleaseStringChars). Rob. P.S. I hope your blindfold has been removed :) When implementing a VM few things are as straight-forward as they may seem. -- View this message in context: http://www.nabble.com/Null-terminated-Unicode-strings-in-java.io-on-Windows-tp15006673p15091812.html Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com. From matthias at mernst.org Fri Jan 25 19:52:42 2008 From: matthias at mernst.org (Matthias Ernst) Date: Fri, 25 Jan 2008 20:52:42 +0100 Subject: Selector cleanup In-Reply-To: <22ec15240801251151h1ddb1a44oe037096586596d79@mail.gmail.com> References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com> <4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com> <4799E21D.6050908@univ-mlv.fr> <4799EB3D.5070807@sun.com> <22ec15240801251151h1ddb1a44oe037096586596d79@mail.gmail.com> Message-ID: <22ec15240801251152g5b5142a9o4737d9990b8a9a3d@mail.gmail.com> On Jan 25, 2008 2:59 PM, Alan Bateman wrote: > > I not agree, reading the code, idle set is used when setInterestOps(0) > > is called. > > I'm not sure that case is not frequent. > I've only observed it on a few occasions. Hmm? 
I thought a boilerplate selection loop looks more or less like this: while(true) { select(); for k in selectedKeys: { k.setInterestOps(0); <=========== pool.execute({ newInterest = handle(k); k.interestOps(newInterest); }); } } Matthias From rob.lougher at gmail.com Fri Jan 25 19:54:33 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Fri, 25 Jan 2008 19:54:33 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: <194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com> References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> <194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com> Message-ID: On 1/25/08, Clemens Eisserer wrote: > Hi there, > > > This is getting a bit hostile for no reason.... Thinking about > > alignment gives an interesting solution. > > > > 1) Strings are not null-terminated > > 2) For most strings the alignment gives the VM room to terminate in > > place when GetStringChars is called > > 3) Copy strings that can't be terminated in place. > > However GetStringChars() as far as I know always returns a copy > because hotspot does not support pinning (or at least I think so) - at > least for the moving GCs. So if one byte more is allocated or not on > the JNI side should not make much difference even if its never needed. > Yes, I already mentioned Sun probably chose to copy to avoid pinning. I did the opposite in JamVM and pinned the String to avoid the copy. It appears that other VMs such as gcj and Jamaica also do not copy the string in GetStringChars (however, I do not know if they have a moving GC or not). The above was a solution to the problem of null-terminating the string chars without having to copy. Rob. 
> lg Clemens > From roman at kennke.org Fri Jan 25 20:37:41 2008 From: roman at kennke.org (Roman Kennke) Date: Fri, 25 Jan 2008 21:37:41 +0100 Subject: [PATCH] Move Solaris specific classes to solaris/ Message-ID: <1201293461.9468.21.camel@mercury> Hi, there are some classes in the jdk/share tree, that seem to be Solaris specific. I suggest moving them to the jdk/solaris tree instead. Or am I wrong here? /Roman -- http://kennke.org/blog/ -------------- next part -------------- # HG changeset patch # User Roman Kennke # Date 1201293270 -3600 # Node ID db9384d2f46857b26ae306b4a0e1d25a049c634e # Parent 2b6c2ce8cd88445d9e3ea709069bf26d53039223 Moved Solaris specific NIO Java classes to the solaris subdir diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/AbstractPollSelectorImpl.java --- a/src/share/classes/sun/nio/ch/AbstractPollSelectorImpl.java Tue Dec 18 15:30:58 2007 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,187 +0,0 @@ -/* - * Copyright 2001-2004 Sun Microsystems, Inc. All Rights Reserved. - * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. - * - * This code is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 only, as - * published by the Free Software Foundation. Sun designates this - * particular file as subject to the "Classpath" exception as provided - * by Sun in the LICENSE file that accompanied this code. - * - * This code is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * version 2 for more details (a copy is included in the LICENSE file that - * accompanied this code). 
- * - * You should have received a copy of the GNU General Public License version - * 2 along with this work; if not, write to the Free Software Foundation, - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. - * - * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, - * CA 95054 USA or visit www.sun.com if you need additional information or - * have any questions. - */ - -package sun.nio.ch; - -import java.io.IOException; -import java.nio.channels.*; -import java.nio.channels.spi.*; -import java.util.*; -import sun.misc.*; - - -/** - * An abstract selector impl. - */ - -abstract class AbstractPollSelectorImpl - extends SelectorImpl -{ - - // The poll fd array - PollArrayWrapper pollWrapper; - - // Initial capacity of the pollfd array - protected final int INIT_CAP = 10; - - // The list of SelectableChannels serviced by this Selector - protected SelectionKeyImpl[] channelArray; - - // In some impls the first entry of channelArray is bogus - protected int channelOffset = 0; - - // The number of valid channels in this Selector's poll array - protected int totalChannels; - - // True if this Selector has been closed - private boolean closed = false; - - AbstractPollSelectorImpl(SelectorProvider sp, int channels, int offset) { - super(sp); - this.totalChannels = channels; - this.channelOffset = offset; - } - - void putEventOps(SelectionKeyImpl sk, int ops) { - pollWrapper.putEventOps(sk.getIndex(), ops); - } - - public Selector wakeup() { - pollWrapper.interrupt(); - return this; - } - - protected abstract int doSelect(long timeout) throws IOException; - - protected void implClose() throws IOException { - if (!closed) { - closed = true; - // Deregister channels - for(int i=channelOffset; i= 0); - if (i != totalChannels - 1) { - // Copy end one over it - SelectionKeyImpl endChannel = channelArray[totalChannels-1]; - channelArray[i] = endChannel; - endChannel.setIndex(i); - pollWrapper.release(i); - 
PollArrayWrapper.replaceEntry(pollWrapper, totalChannels - 1, - pollWrapper, i); - } else { - pollWrapper.release(i); - } - // Destroy the last one - channelArray[totalChannels-1] = null; - totalChannels--; - pollWrapper.totalChannels--; - ski.setIndex(-1); - // Remove the key from keys and selectedKeys - keys.remove(ski); - selectedKeys.remove(ski); - deregister((AbstractSelectionKey)ski); - SelectableChannel selch = ski.channel(); - if (!selch.isOpen() && !selch.isRegistered()) - ((SelChImpl)selch).kill(); - } - - static { - Util.load(); - } - -} diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/DevPollSelectorProvider.java --- a/src/share/classes/sun/nio/ch/DevPollSelectorProvider.java Tue Dec 18 15:30:58 2007 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,42 +0,0 @@ -/* - * Copyright 2001-2003 Sun Microsystems, Inc. All Rights Reserved. - * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. - * - * This code is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 only, as - * published by the Free Software Foundation. Sun designates this - * particular file as subject to the "Classpath" exception as provided - * by Sun in the LICENSE file that accompanied this code. - * - * This code is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * version 2 for more details (a copy is included in the LICENSE file that - * accompanied this code). - * - * You should have received a copy of the GNU General Public License version - * 2 along with this work; if not, write to the Free Software Foundation, - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
- * - * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, - * CA 95054 USA or visit www.sun.com if you need additional information or - * have any questions. - */ - -package sun.nio.ch; - -import java.io.IOException; -import java.nio.channels.*; -import java.nio.channels.spi.*; - -public class DevPollSelectorProvider - extends SelectorProviderImpl -{ - public AbstractSelector openSelector() throws IOException { - return new DevPollSelectorImpl(this); - } - - public Channel inheritedChannel() throws IOException { - return InheritedChannel.getChannel(); - } -} diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/PollSelectorProvider.java --- a/src/share/classes/sun/nio/ch/PollSelectorProvider.java Tue Dec 18 15:30:58 2007 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,42 +0,0 @@ -/* - * Copyright 2001-2003 Sun Microsystems, Inc. All Rights Reserved. - * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. - * - * This code is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 only, as - * published by the Free Software Foundation. Sun designates this - * particular file as subject to the "Classpath" exception as provided - * by Sun in the LICENSE file that accompanied this code. - * - * This code is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * version 2 for more details (a copy is included in the LICENSE file that - * accompanied this code). - * - * You should have received a copy of the GNU General Public License version - * 2 along with this work; if not, write to the Free Software Foundation, - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
- * - * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, - * CA 95054 USA or visit www.sun.com if you need additional information or - * have any questions. - */ - -package sun.nio.ch; - -import java.io.IOException; -import java.nio.channels.*; -import java.nio.channels.spi.*; - -public class PollSelectorProvider - extends SelectorProviderImpl -{ - public AbstractSelector openSelector() throws IOException { - return new PollSelectorImpl(this); - } - - public Channel inheritedChannel() throws IOException { - return InheritedChannel.getChannel(); - } -} diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/AbstractPollSelectorImpl.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/solaris/classes/sun/nio/ch/AbstractPollSelectorImpl.java Fri Jan 25 21:34:30 2008 +0100 @@ -0,0 +1,187 @@ +/* + * Copyright 2001-2004 Sun Microsystems, Inc. All Rights Reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. Sun designates this + * particular file as subject to the "Classpath" exception as provided + * by Sun in the LICENSE file that accompanied this code. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
+ * + * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, + * CA 95054 USA or visit www.sun.com if you need additional information or + * have any questions. + */ + +package sun.nio.ch; + +import java.io.IOException; +import java.nio.channels.*; +import java.nio.channels.spi.*; +import java.util.*; +import sun.misc.*; + + +/** + * An abstract selector impl. + */ + +abstract class AbstractPollSelectorImpl + extends SelectorImpl +{ + + // The poll fd array + PollArrayWrapper pollWrapper; + + // Initial capacity of the pollfd array + protected final int INIT_CAP = 10; + + // The list of SelectableChannels serviced by this Selector + protected SelectionKeyImpl[] channelArray; + + // In some impls the first entry of channelArray is bogus + protected int channelOffset = 0; + + // The number of valid channels in this Selector's poll array + protected int totalChannels; + + // True if this Selector has been closed + private boolean closed = false; + + AbstractPollSelectorImpl(SelectorProvider sp, int channels, int offset) { + super(sp); + this.totalChannels = channels; + this.channelOffset = offset; + } + + void putEventOps(SelectionKeyImpl sk, int ops) { + pollWrapper.putEventOps(sk.getIndex(), ops); + } + + public Selector wakeup() { + pollWrapper.interrupt(); + return this; + } + + protected abstract int doSelect(long timeout) throws IOException; + + protected void implClose() throws IOException { + if (!closed) { + closed = true; + // Deregister channels + for(int i=channelOffset; i= 0); + if (i != totalChannels - 1) { + // Copy end one over it + SelectionKeyImpl endChannel = channelArray[totalChannels-1]; + channelArray[i] = endChannel; + endChannel.setIndex(i); + pollWrapper.release(i); + PollArrayWrapper.replaceEntry(pollWrapper, totalChannels - 1, + pollWrapper, i); + } else { + pollWrapper.release(i); + } + // Destroy the last one + channelArray[totalChannels-1] = null; + totalChannels--; + pollWrapper.totalChannels--; + ski.setIndex(-1); + 
// Remove the key from keys and selectedKeys + keys.remove(ski); + selectedKeys.remove(ski); + deregister((AbstractSelectionKey)ski); + SelectableChannel selch = ski.channel(); + if (!selch.isOpen() && !selch.isRegistered()) + ((SelChImpl)selch).kill(); + } + + static { + Util.load(); + } + +} diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/DevPollSelectorProvider.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/solaris/classes/sun/nio/ch/DevPollSelectorProvider.java Fri Jan 25 21:34:30 2008 +0100 @@ -0,0 +1,42 @@ +/* + * Copyright 2001-2003 Sun Microsystems, Inc. All Rights Reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. Sun designates this + * particular file as subject to the "Classpath" exception as provided + * by Sun in the LICENSE file that accompanied this code. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, + * CA 95054 USA or visit www.sun.com if you need additional information or + * have any questions. 
+ */ + +package sun.nio.ch; + +import java.io.IOException; +import java.nio.channels.*; +import java.nio.channels.spi.*; + +public class DevPollSelectorProvider + extends SelectorProviderImpl +{ + public AbstractSelector openSelector() throws IOException { + return new DevPollSelectorImpl(this); + } + + public Channel inheritedChannel() throws IOException { + return InheritedChannel.getChannel(); + } +} diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/PollSelectorProvider.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/solaris/classes/sun/nio/ch/PollSelectorProvider.java Fri Jan 25 21:34:30 2008 +0100 @@ -0,0 +1,42 @@ +/* + * Copyright 2001-2003 Sun Microsystems, Inc. All Rights Reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. Sun designates this + * particular file as subject to the "Classpath" exception as provided + * by Sun in the LICENSE file that accompanied this code. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, + * CA 95054 USA or visit www.sun.com if you need additional information or + * have any questions. 
+ */ + +package sun.nio.ch; + +import java.io.IOException; +import java.nio.channels.*; +import java.nio.channels.spi.*; + +public class PollSelectorProvider + extends SelectorProviderImpl +{ + public AbstractSelector openSelector() throws IOException { + return new PollSelectorImpl(this); + } + + public Channel inheritedChannel() throws IOException { + return InheritedChannel.getChannel(); + } +} From Alan.Bateman at Sun.COM Fri Jan 25 21:10:14 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Fri, 25 Jan 2008 21:10:14 +0000 Subject: [PATCH] Move Solaris specific classes to solaris/ In-Reply-To: <1201293461.9468.21.camel@mercury> References: <1201293461.9468.21.camel@mercury> Message-ID: <479A5036.9060701@sun.com> Roman Kennke wrote: > Hi, > > there are some classes in the jdk/share tree, that seem to be Solaris > specific. I suggest moving them to the jdk/solaris tree instead. Or am I > wrong here? > > /Roman > > Yes, they should be in the src/solaris tree (although only DevPollSelectorProvider is Solaris specific). -Alan. From mark at klomp.org Fri Jan 25 22:16:51 2008 From: mark at klomp.org (Mark Wielaard) Date: Fri, 25 Jan 2008 22:16:51 +0000 (UTC) Subject: Null-terminated Unicode strings in java.io on Windows References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> Message-ID: Hi Robert, Robert Lougher writes: > This is getting a bit hostile for no reason.... Thinking about > alignment gives an interesting solution. > > 1) Strings are not null-terminated > 2) For most strings the alignment gives the VM room to terminate in > place when GetStringChars is called > 3) Copy strings that can't be terminated in place. Note that Strings have a backing [j]char array which can be shared between different Strings, and often are when read in in one go and then split in different sub-String objects. 
All these Strings have a shared slice of this backing jchar array, so there isn't any place to terminate it because that place will overlap with another slice that can belong to another String. You should know, because I learned all I know about this and pinning of the backing storage of a String (not the String object itself) by reading your jamvm code! :) BTW. I would really recommend anybody wanting to know how the VM and JNI specs truly work/can be implemented in practice take a look at jamvm, it is a truly remarkable clear, concise and small implementation. Nothing bad about other runtimes, but jamvm is small enough that you can read the code, sit down with the spec and compare them almost directly to get a really nice insight in how things are/can be done. Cheers, Mark From rob.lougher at gmail.com Sat Jan 26 00:00:51 2008 From: rob.lougher at gmail.com (Robert Lougher) Date: Sat, 26 Jan 2008 00:00:51 +0000 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> Message-ID: Hi Mark, On Jan 25, 2008 10:16 PM, Mark Wielaard wrote: > Hi Robert, > > Robert Lougher writes: > > This is getting a bit hostile for no reason.... Thinking about > > alignment gives an interesting solution. > > > > 1) Strings are not null-terminated > > 2) For most strings the alignment gives the VM room to terminate in > > place when GetStringChars is called > > 3) Copy strings that can't be terminated in place. > > Note that Strings have a backing [j]char array which can be shared between > different Strings, and often are when read in in one go and then split in > different sub-String objects. 
All these Strings have a shared slice of this > backing jchar array, so there isn't any place to terminate it because that place > will overlap with another slice that can belong to another String. > Whoops, you're right :) > You should know, because I learned all I know about this and pinning of the > backing storage of a String (not the String object itself) by reading your jamvm > code! :) > > BTW. I would really recommend anybody wanting to know how the VM and JNI specs > truly work/can be implemented in practice take a look at jamvm, it is a truly > remarkable clear, concise and small implementation. Nothing bad about other > runtimes, but jamvm is small enough that you can read the code, sit down with > the spec and compare them almost directly to get a really nice insight in how > things are/can be done. > How many beers did we agree I'll buy you at FOSDEM? ;) Rob. > Cheers, > > Mark > > From program.spe at home.pl Mon Jan 28 08:24:26 2008 From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=) Date: Mon, 28 Jan 2008 09:24:26 +0100 Subject: Null-terminated Unicode strings in java.io on Windows In-Reply-To: References: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl> <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl> <1201283641.9468.11.camel@mercury> <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl> Message-ID: <1201508666.6550.9.camel@a1dmin.vola.spe.com.pl> Dnia 25-01-2008, Pt o godzinie 22:16 +0000, Mark Wielaard pisze: > Hi Robert, > > Robert Lougher writes: > > This is getting a bit hostile for no reason.... Thinking about > > alignment gives an interesting solution. > > > > 1) Strings are not null-terminated > > 2) For most strings the alignment gives the VM room to terminate in > > place when GetStringChars is called > > 3) Copy strings that can't be terminated in place. 
> > Note that Strings have a backing [j]char array which can be shared between > different Strings, and often are when read in in one go and then split in > different sub-String objects. All these Strings have a shared slice of this > backing jchar array, so there isn't any place to terminate it because that place > will overlap with another slice that can belong to another String. That changes the picture dramatically. Indeed, taking a prefix of an unmodifiable string requires copying data, as in the C language. I should have thought of that earlier, particularly because I have run into this problem when I tried to straighten out the source code for gmake. I suppose K&R (or whoever they inherited the concept from) chose to zero-terminate strings because they found out that writing a 0 at the end takes less memory than keeping a separate pointer to the end. (This is true for character data only; that is why strings are so special.) Sorry for wasting your time. Chris From msa at allman.ms Wed Jan 30 09:20:06 2008 From: msa at allman.ms (Michael Allman) Date: Wed, 30 Jan 2008 01:20:06 -0800 (PST) Subject: purpose of FileDispatcher.preClose() Message-ID: <20080130011129.U28274@yvyyl.pfbsg.arg> Hello, Can someone with knowledge of such matters explain what FileDispatcher.preClose() is supposed to do on Solaris/Linux. I mean, I see the code, but I don't understand why it exists or what problem it's supposed to avoid or something. I ask because I'm trying to fix a file-locking problem on soylatte and it seems the solution to that problem is to remove this code (on that platform). But before I charge ahead, I need a better understanding of why this code exists. In particular, I'm really interested in the stuff that happens in FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and Java_sun_nio_ch_FileDispatcher_preClose0. They're setting something up that looks important, but I just don't get it.
Cheers, Michael From Alan.Bateman at Sun.COM Wed Jan 30 11:35:07 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Wed, 30 Jan 2008 11:35:07 +0000 Subject: purpose of FileDispatcher.preClose() In-Reply-To: <20080130011129.U28274@yvyyl.pfbsg.arg> References: <20080130011129.U28274@yvyyl.pfbsg.arg> Message-ID: <47A060EB.8080200@sun.com> Michael Allman wrote: > Hello, > > Can someone with knowledge of such matters explain what > FileDispatcher.preClose() is supposed to do on Solaris/Linux. I mean, > I see the code, but I don't understand why it exists or what problem > it's supposed to avoid or something. > > I ask because I'm trying to fix a file-locking problem on soylatte and > it seems the solution to that problem is to remove this code (on that > platform). But before I charge ahead, I need a better understanding > of why this code exists. > > In particular, I'm really interested in the stuff that happens in > FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and > Java_sun_nio_ch_FileDispatcher_preClose0. They're setting something > up that looks important, but I just don't get it. In a multi-threaded application it is always difficult to know when you can safely close and release a file descriptor (or other resource). If one thread is using a file descriptor to read or write and another thread releases (closes) it, then it is possible for the first thread to read or write to the wrong file or socket in the event that the file descriptor is recycled quickly. The approach that we use in both classic networking and NIO is to use a two-step process. In the first step we use dup2 to make the file descriptor refer to one end of a half-shutdown socket pair. Other threads that are reading or writing but haven't called the read or write system calls yet will get an immediate EOF or pipe error when they do so. As the threads complete the read or write method, they examine their state.
If there is a close pending then the last one releases the file descriptor. Hopefully this brief overview gives you some idea what this code is about. The FileDispatcher#init method is where the socketpair is created, and the preClose0 method does the dup2. I haven't been following the Soylatte port very closely so I'm curious what problem you are seeing - when you say "file locking" do you mean FileChannel#lock? If so then the issue may be that the asynchronous close mechanism isn't completely extended to FileChannel yet. -Alan. From msa at allman.ms Wed Jan 30 06:30:14 2008 From: msa at allman.ms (Michael Allman) Date: Tue, 29 Jan 2008 22:30:14 -0800 (PST) Subject: [PATCH] FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0 Message-ID: <20080129220705.T49011@yvyyl.pfbsg.arg> This must have been on somebody's plate for a long time. Attached please find a patch to correct an apparently unreported bug. At least, I couldn't find one. The problem is that if a FileChannel is truncated and its position was previously set beyond the new length of the file, the position should be, but isn't, set to the new length of the file. Heads up. I have kinda sorta tested this patch. I run a Mac OS X Leopard system. I have tested this patch on that system, as applied to the soylatte source code repository. More info on soylatte here: http://landonf.bikemonkey.org/static/soylatte/. The gist of it is that soylatte is a port of Sun's JDK 6 to Mac OS X. My test procedure was as follows: 1. Get jdk7/jdk/test/java/nio/channels/FileChannel/Truncate.java from the OpenJDK repository. 2. Compile and run Truncate on soylatte 1.0.1 (which is based on Sun's JDK 6 something). Test reports failure as such: Exception in thread "main" java.lang.RuntimeException: Position greater than size at Truncate.main(Truncate.java:68) 3. Run Truncate on a patched version of soylatte (patch essentially identical to attached file). Test completes normally without output. I guess this means it passed.
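The position rule at issue in the report above can be checked with a few lines of Java. This is only a minimal sketch, not the Truncate.java regression test from the OpenJDK repository: the class and method names are made up for illustration, and it uses the java.nio.file API from later JDK releases for brevity. It relies on the documented contract of FileChannel.truncate, which says that a position greater than the given size is set to that size.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the JDK or of the attached patch.
class TruncatePositionDemo {

    /**
     * Writes initialSize zero bytes to a temporary file (leaving the channel
     * position at initialSize), truncates the channel to newSize, and returns
     * the position the channel reports afterwards.
     */
    static long positionAfterTruncate(int initialSize, long newSize) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("truncate-demo", ".bin");
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(initialSize);
                while (buf.hasRemaining()) {
                    ch.write(buf);          // advances the channel position
                }
                ch.truncate(newSize);       // should clamp the position to newSize
                return ch.position();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try { Files.deleteIfExists(tmp); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        long pos = positionAfterTruncate(100, 10);
        if (pos > 10) {
            throw new RuntimeException("Position greater than size");
        }
        System.out.println("position after truncate: " + pos);
    }
}
```

On a runtime with the bug described in this message, the check in main would be expected to fail in the same way as step 2 of the test procedure.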
I'm sending this in as a patch to OpenJDK and not soylatte because I know this is a problem on Solaris, too. That is, I ran Truncate on jdk6u4 on Solaris 11 and it failed. Obviously, this is not the only way to fix this problem. We could also do this with a patch to FileChannelImpl.java. I'll let whoever's in charge here make that call. So, I hope this is helpful. I am ready and willing to respond to feedback. I have tried to follow the guidelines in http://openjdk.java.net/contribute/. Cheers, Michael (CCing Landon Fuller because he runs the Soylatte project.) -------------- next part -------------- A non-text attachment was scrubbed... Name: truncate.patch Type: text/x-diff Size: 1406 bytes From Tim.Bell at Sun.COM Wed Jan 30 19:44:20 2008 From: Tim.Bell at Sun.COM (Tim Bell) Date: Wed, 30 Jan 2008 11:44:20 -0800 Subject: [PATCH] FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0 In-Reply-To: <20080129220705.T49011@yvyyl.pfbsg.arg> References: <20080129220705.T49011@yvyyl.pfbsg.arg> Message-ID: <47A0D394.8010603@sun.com> Hi Michael Allman wrote: > This must have been on somebody's plate for a long time. > > Attached please find a patch Thanks for sending your suggested fix our way. I will do some additional searching to see if I can locate an existing Bug-ID for this issue. I don't find your name on the SCA list: https://sca.dev.java.net/CA_signatories Please sign the Sun Contributor's Agreement. You will find the latest version of the SCA here: http://www.sun.com/software/opensource/sca.pdf The FAQ about the SCA and its ramifications is here: http://www.sun.com/software/opensource/contributor_agreement.jsp After reading and signing the agreement, fax it to +1-408-715-2540, or scan it and e-mail the result to sun_ca (at) sun.com. If you have already done this, please contact me offline. It may be sitting in our queue of incoming SCAs.
Thanks, and Best Regards- Tim Bell From msa at allman.ms Wed Jan 30 20:14:11 2008 From: msa at allman.ms (Michael Allman) Date: Wed, 30 Jan 2008 12:14:11 -0800 (PST) Subject: purpose of FileDispatcher.preClose() In-Reply-To: <47A060EB.8080200@sun.com> References: <20080130011129.U28274@yvyyl.pfbsg.arg> <47A060EB.8080200@sun.com> Message-ID: <20080130114658.S99697@yvyyl.pfbsg.arg> On Wed, 30 Jan 2008, Alan Bateman wrote: > Michael Allman wrote: >> Hello, >> >> Can someone with knowledge of such matters explain what >> FileDispatcher.preClose() is supposed to do on Solaris/Linux. I mean, I >> see the code, but I don't understand why it exists or what problem it's >> supposed to avoid or something. >> >> I ask because I'm trying to fix a file-locking problem on soylatte and it >> seems the solution to that problem is to remove this code (on that >> platform). But before I charge ahead, I need a better understanding of why >> this code exists. >> >> In particular, I'm really interested in the stuff that happens in >> FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and >> Java_sun_nio_ch_FileDispatcher_preClose0. They're setting something up >> that looks important, but I just don't get it. > In a multi-threaded application it is always difficult to know when you can > safely close and release a file descriptor (or other resource). If one thread > is using a file descriptor to read or write and another thread releases > (closes) it then it it possible for the first thread to read or write to the > wrong file or socket in the event that the file descriptor is recycled > quickly. The approach that we use in both classic networking and NIO is to > use a two-step process. In the first step we duplicate (dup2) the file > descriptor to another that is one end of a half shutdown socket pair. Other > threads that are reading or writing but haven't called the read or write > system calls yet will get an immediate EOF or pipe error when they do so. 
As > the threads complete the read or write method then they examine their state. > If there is a close pending then the last one releases the file descriptor. > Hopefully this brief overview gives you some idea what this code is about. > The FileDescriptor#init method is where the socketpair is created, and that > preClose0 method does the dup2. I haven't been following the Soylatte port > very closely so I'm curious what problem you are seeing - when you say "file > locking" do you mean FileChannel#lock? If so then the issue may be that the > asynchronous close mechanism isn't completely extended to FileChannel yet. I think I get it. So let me explain the problem I'm seeing here. If I close a file channel on which I have acquired (but not released) a file lock, I get an IOException: Bad file descriptor. For example, the Lock regression test does this and fails (on soylatte). I think the problem here is that FileChannelImpl.implCloseChannel() calls nd.preClose(fd) before the block that releases its file locks. On non-Windows, nd.preClose(fd) doesn't just "pre close" fd, it closes it. Then implCloseChannel() tries to release its file locks. fd now points to a socket descriptor and on Solaris/Linux, such an attempt seems to be harmless. On Mac OS X, it complains with the EBADF error code. It seems that the preClose semantics are not correctly handled by the FileChannelImpl.implCloseChannel() method. On non-Windows, it attempts to release file locks that no longer exist (because preClose() releases them). It seems that the file lock release block should be moved into NativeDispatcher.preClose(). It will be run on Windows, but will not be run on non-Windows. That seems correct to me, given that on non-Windows, preClose0 releases the file locks. Obviously, this kind of change is much more than a soylatte patch. It changes code that already works on Windows, Solaris, and Linux. But if my analysis is correct, it looks like it's just a silent bug. Thoughts?
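For readers who want to reproduce the scenario described above (closing a channel while a file lock is still held), a rough sketch follows. It is not the Lock regression test itself: the class and method names are hypothetical, and it uses the java.nio.file API from later JDK releases. It only relies on the documented FileLock contract that a lock stays valid until it is released or its channel is closed. On the port discussed here, the close() call is where the "Bad file descriptor" IOException would be expected to surface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; names are illustrative only.
class CloseWithHeldLockDemo {

    /**
     * Acquires an exclusive lock, closes the channel without releasing the
     * lock, and returns { lock valid before close, lock valid after close }.
     */
    static boolean[] lockValidityAroundClose() {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("lock-demo", ".bin");
            FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
            FileLock lock;
            boolean validBefore;
            try {
                lock = ch.lock();               // exclusive lock on the whole file
                validBefore = lock.isValid();
            } finally {
                // Deliberately no lock.release(): the channel is closed
                // while the lock is still held.
                ch.close();
            }
            return new boolean[] { validBefore, lock.isValid() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try { Files.deleteIfExists(tmp); } catch (IOException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        boolean[] v = lockValidityAroundClose();
        System.out.println("valid before close: " + v[0] + ", after close: " + v[1]);
    }
}
```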
Michael From Alan.Bateman at Sun.COM Wed Jan 30 20:59:13 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Wed, 30 Jan 2008 20:59:13 +0000 Subject: [PATCH] FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0 In-Reply-To: <20080129220705.T49011@yvyyl.pfbsg.arg> References: <20080129220705.T49011@yvyyl.pfbsg.arg> Message-ID: <47A0E521.6070007@sun.com> Michael Allman wrote: > This must have been on somebody's plate for a long time. > > Attached please find a patch to correct an apparently unreported bug. > At least, I couldn't find one. I think this is the bug you are looking for: http://bugs.sun.com/view_bug.do?bug_id=6191269 It is fixed in jdk7/OpenJDK. If I understand correctly you are running jdk7/OpenJDK's regression tests on a jdk6 port. In that case you will likely see other failures because there are many fixes and updated tests in jdk7/OpenJDK that aren't in jdk6. -Alan. From Alan.Bateman at Sun.COM Wed Jan 30 21:12:12 2008 From: Alan.Bateman at Sun.COM (Alan Bateman) Date: Wed, 30 Jan 2008 21:12:12 +0000 Subject: purpose of FileDispatcher.preClose() In-Reply-To: <20080130114658.S99697@yvyyl.pfbsg.arg> References: <20080130011129.U28274@yvyyl.pfbsg.arg> <47A060EB.8080200@sun.com> <20080130114658.S99697@yvyyl.pfbsg.arg> Message-ID: <47A0E82C.2020004@sun.com> Michael Allman wrote: > : > > I think the problem here is that FileChannelImpl.implCloseChannel() > calls nd.preClose(fd) before the block that releases its file locks. > On non-windows, nd.preClose(fd) doesn't just "pre close" fd, it closes > it. Then implCloseChannel() tries to release its file locks. fd now > points to a socket descriptor and on Solaris/Linux, such attempt seems > to be harmless. On Mac OS X, it complains with the EBADF error code. Yes, this is a known issue but hasn't been a problem to date. 
I don't know Mac OS X well but if closing a file causes all advisory
locks on the file to be removed then the simplest solution for your port
is probably to just comment out the call to release0 that is called from
the inner class in implCloseChannel. As you've found, this will otherwise
attempt the unlock on the dup'ed file descriptor and fail.

-Alan.

From eliasen at mindspring.com Thu Jan 31 04:22:57 2008
From: eliasen at mindspring.com (Alan Eliasen)
Date: Wed, 30 Jan 2008 21:22:57 -0700
Subject: BigInteger performance improvements
Message-ID: <47A14D21.8020807@mindspring.com>

I'm planning on tackling the performance issues in the BigInteger
class. In short, inefficient algorithms are used for multiplication,
exponentiation, conversion to strings, etc. I intend to improve this by
adding algorithms with better asymptotic behavior that will work better
for large numbers, while preserving the existing algorithms for use with
smaller numbers.

This encompasses a lot of different bug reports:

4228681: Some BigInteger operations are slow with very large numbers
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4228681

(This was closed but never fixed.)

4837946: Implement Karatsuba multiplication algorithm in BigInteger
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4837946

I've already done the work on this one. My implementation is intended
to be easy to read, understand, and check. It significantly improves
multiplication performance for large numbers.

4646474: BigInteger.pow() algorithm slow in 1.4.0
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4646474

This will be improved in a couple of ways:

* Rewrite pow() to use the above Karatsuba multiplication
* Implement Karatsuba squaring
* Find a good threshold for Karatsuba squaring
* Rewrite pow() to use Karatsuba squaring
* Add an optimization to use left-shifting for factors of 2 in the
  base. This improves speed by thousands of times for things like
  Mersenne numbers.

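The left-shift optimization in the last bullet could look roughly like
the sketch below. This is only an illustration, not the patch discussed
in this message: the class name PowShiftSketch and its pow() method are
hypothetical, and the odd part of the base is still raised with the
existing BigInteger.pow(). The idea is that base = odd * 2^k implies
base^n = odd^n * 2^(k*n), so the power-of-two part costs one shift.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of java.math.
final class PowShiftSketch {

    // Computes base^exponent by splitting off the factors of 2 in the base.
    static BigInteger pow(BigInteger base, int exponent) {
        if (exponent < 0) {
            throw new ArithmeticException("Negative exponent");
        }
        if (exponent == 0) {
            return BigInteger.ONE;          // same convention as BigInteger.pow
        }
        if (base.signum() == 0) {
            return BigInteger.ZERO;
        }
        // Number of trailing zero bits, i.e. k in base = odd * 2^k.
        int twos = base.getLowestSetBit();
        BigInteger odd = base.shiftRight(twos);   // exact, low bits are zero

        long shift = (long) twos * exponent;
        if (shift > Integer.MAX_VALUE) {
            throw new ArithmeticException("Result too large for this sketch");
        }
        // Raise only the odd part, then restore 2^(k * exponent) by shifting.
        return odd.pow(exponent).shiftLeft((int) shift);
    }
}
```

For a base of exactly 2, as in a Mersenne number 2^p - 1, the odd part
is 1 and the whole computation reduces to a single shift followed by
subtracting one.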
4641897: BigInteger.toString() algorithm slow for large numbers
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4641897

This method uses a very inefficient algorithm for large numbers. I plan
to replace it with a recursive divide-and-conquer algorithm devised by
Schoenhage and Strassen. I have developed and tested this in my own
software. This operates hundreds or thousands of times faster than the
current version for large numbers. It will also benefit from faster
multiplication and exponentiation.

In the future, we should also add multiplication routines that are even
more efficient for very large numbers, such as Toom-Cook multiplication,
which is more efficient than Karatsuba multiplication for even larger
numbers.

Has anyone else worked on these? Is this the right group?

I will probably submit the Karatsuba multiplication patch soon. Would
it be preferable to implement *all* of these parts first and submit one
large patch?

--
  Alan Eliasen              | "Furious activity is no substitute
  eliasen at mindspring.com | for understanding."
  http://futureboy.us/      |     --H.H. Williams
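
For readers unfamiliar with Karatsuba multiplication, a minimal sketch
follows. It is not the implementation from the patch mentioned in this
message: the class KaratsubaSketch, the constant THRESHOLD_BITS and its
value of 2048 are assumptions chosen for illustration, and operands
below the cutoff fall back to the existing BigInteger.multiply(). Each
operand is split into a high and a low half, and the product is built
from three recursive multiplications instead of four.

```java
import java.math.BigInteger;

// Hypothetical illustration class, not part of java.math.
final class KaratsubaSketch {

    // Assumed cutoff; a real implementation would tune this by measurement.
    private static final int THRESHOLD_BITS = 2048;

    static BigInteger multiply(BigInteger x, BigInteger y) {
        if (x.signum() == 0 || y.signum() == 0) {
            return BigInteger.ZERO;
        }
        BigInteger product = multiplyMagnitudes(x.abs(), y.abs());
        // The result is negative exactly when the operand signs differ.
        return x.signum() == y.signum() ? product : product.negate();
    }

    // Both arguments must be non-negative.
    private static BigInteger multiplyMagnitudes(BigInteger x, BigInteger y) {
        int n = Math.max(x.bitLength(), y.bitLength());
        if (n <= THRESHOLD_BITS) {
            return x.multiply(y);           // small case: existing algorithm
        }
        int half = n / 2;

        // x = xHigh * 2^half + xLow, and likewise for y.
        BigInteger xHigh = x.shiftRight(half);
        BigInteger xLow = x.subtract(xHigh.shiftLeft(half));
        BigInteger yHigh = y.shiftRight(half);
        BigInteger yLow = y.subtract(yHigh.shiftLeft(half));

        BigInteger high = multiplyMagnitudes(xHigh, yHigh);
        BigInteger low = multiplyMagnitudes(xLow, yLow);
        // (xHigh + xLow)(yHigh + yLow) - high - low = xHigh*yLow + xLow*yHigh
        BigInteger middle = multiplyMagnitudes(xHigh.add(xLow), yHigh.add(yLow))
                .subtract(high)
                .subtract(low);

        return high.shiftLeft(2 * half)
                .add(middle.shiftLeft(half))
                .add(low);
    }
}
```

Splitting on bit positions keeps the sketch short; an implementation
inside BigInteger would more likely split on the internal int array to
avoid the extra shifts and subtractions.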