From Jonathan.Gibbons at Sun.COM Mon Jan 4 13:56:46 2010 From: Jonathan.Gibbons at Sun.COM (Jonathan Gibbons) Date: Mon, 04 Jan 2010 13:56:46 -0800 Subject: Benefit from computing String Hash at compile time? In-Reply-To: References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> Message-ID: <4B42641E.1050804@sun.com> Paul Benedict wrote: > Reinier, > > Thank you for your reply. > > On Fri, Dec 18, 2009 at 5:04 PM, Reinier Zwitserloot > wrote: > >> String.hashCode() has _already_ been defined as unchanging and set in stone. >> We could do so again, if it assuages recently stated fears, though I'm not >> sure what this would accomplish. It's right here: >> http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode() >> > > I hope to make some things clear: > > My objection relies solely on the fact that it is not "set in stone". > If I remember correctly, Joe had to do research on whether the API ever > changed (not since at least 1.2). Neither Joe, Jonathan, nor Josh > (people well respected) has claimed what you are claiming. The > highest assurance given is that it's "highly unlikely" and only if > "hell freezes over". > > Now I grant the fact it's highly unlikely. I buy off on that. The odds > are hashCode() is not going to change. I also have no philosophical > problems with emitting the value from String.hashCode() into class > files. However, I believe the manufacturer of a JDK should have > *absolute certainty* when making this decision. It's pretty clear to > me this certainty is high, but not absolute. And since OpenJDK is made > by Sun, the bearer of Java, if it is good for them, it's good for > everyone. Follow the leader. Once this decision is made, I assert > String.hashCode() will have to be "set in stone" but only because of > Project Coin and Sun's influence, not the API. 
> > Paul > > Note that the specification of String.hashCode specifies how the value is to be determined, and that as a result, this is covered by the TCK (JCK) which checks for conformance with the specification. For an impl of Java to be called "Java" it must pass the TCK, and so must pass the tests that check for the correct functioning of String.hashCode. -- Jon From pbenedict at apache.org Mon Jan 4 14:13:39 2010 From: pbenedict at apache.org (Paul Benedict) Date: Mon, 4 Jan 2010 16:13:39 -0600 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B42641E.1050804@sun.com> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> Message-ID: Jon, Thanks. But the issue I raised is not about conforming to the current JDK, but whether the algorithm could possibly change in a future JDK. As I said before, these values are written into the class file -- so they have to (must, shall, will) conform to ALL future JDK versions. Paul On Mon, Jan 4, 2010 at 3:56 PM, Jonathan Gibbons wrote: > Paul Benedict wrote: > > Reinier, > > Thank you for your reply. > > On Fri, Dec 18, 2009 at 5:04 PM, Reinier Zwitserloot > wrote: > > > String.hashCode() has _already_ been defined as unchanging and set in stone. > We could do so again, if it assuages recently stated fears, though I'm not > sure what this would accomplish. It's right here: > http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode() > > > I hope to make some things clear: > > My objection relies solely on the fact that it is not "set in stone". > If I remember correctly, Joe had to do research if the API ever > changed (not since at least 1.2). Neither Joe, Jonathan, and Josh > (people well respected) have claimed what you are claiming. The > highest assurance given is that it's "highly unlikely" and only if > "hell freezes over". . > > Now I grant the fact it's highly unlikely. I buy off on that. The odds > are hashCode() is not going to change. 
I also have no philosophical > problems with emitting the value from String.hashCode() into class > files. However, I believe the manufacturer of a JDK should have > *absolute certainty* when making this decision. It's pretty clear to > me this certainty is high, but not absolute. And since OpenJDK is made > by Sun, the bearer of Java, if it is good for them, it's good for > everyone. Follow the leader. Once this decision is made, I assert > String.hashCode() will have to be "set in stone" but only because of > Project Coin and Sun's influence, not the API. > > Paul > > > > Note that specification of String.hashCode specifies how the value is to be > determined, and that as a result, this is covered by the TCK (JCK) which > checks for conformance with the specification. For an impl of Java to be > called "Java" it must the TCK, and so must pass the tests that check for the > correct functioning of String.hashCode. > > -- Jon > > From Jonathan.Gibbons at Sun.COM Mon Jan 4 14:29:14 2010 From: Jonathan.Gibbons at Sun.COM (Jonathan Gibbons) Date: Mon, 04 Jan 2010 14:29:14 -0800 Subject: Benefit from computing String Hash at compile time? In-Reply-To: References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> Message-ID: <4B426BBA.9020800@sun.com> Paul, I was reacting to your somewhat wishy-washy description of the effectiveness of the specification: > And since OpenJDK is made > by Sun, the bearer of Java, if it is good for them, it's good for > everyone. Follow the leader. Once this decision is made, I assert > String.hashCode() will have to be "set in stone" but only because of > Project Coin and Sun's influence, not the API. The fact is, the spec is set in stone, *and* covered by conformance tests. The chance of the spec changing is vanishingly small, and the conformance tests ensure that all implementations of Java must follow the spec. -- Jon Paul Benedict wrote: > Jon, > > Thanks. 
But the issue I raised is not about conforming to the current > JDK, but whether the algorithm can possibly changing in a future JDK. > As I said before, these values are written into the class file -- so > they have to (must, shall, will) conform to ALL future JDK versions. > > Paul > > On Mon, Jan 4, 2010 at 3:56 PM, Jonathan Gibbons > wrote: > >> Paul Benedict wrote: >> >> Reinier, >> >> Thank you for your reply. >> >> On Fri, Dec 18, 2009 at 5:04 PM, Reinier Zwitserloot >> wrote: >> >> >> String.hashCode() has _already_ been defined as unchanging and set in stone. >> We could do so again, if it assuages recently stated fears, though I'm not >> sure what this would accomplish. It's right here: >> http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode() >> >> >> I hope to make some things clear: >> >> My objection relies solely on the fact that it is not "set in stone". >> If I remember correctly, Joe had to do research if the API ever >> changed (not since at least 1.2). Neither Joe, Jonathan, and Josh >> (people well respected) have claimed what you are claiming. The >> highest assurance given is that it's "highly unlikely" and only if >> "hell freezes over". . >> >> Now I grant the fact it's highly unlikely. I buy off on that. The odds >> are hashCode() is not going to change. I also have no philosophical >> problems with emitting the value from String.hashCode() into class >> files. However, I believe the manufacturer of a JDK should have >> *absolute certainty* when making this decision. It's pretty clear to >> me this certainty is high, but not absolute. And since OpenJDK is made >> by Sun, the bearer of Java, if it is good for them, it's good for >> everyone. Follow the leader. Once this decision is made, I assert >> String.hashCode() will have to be "set in stone" but only because of >> Project Coin and Sun's influence, not the API. 
>> >> Paul >> >> >> >> Note that specification of String.hashCode specifies how the value is to be >> determined, and that as a result, this is covered by the TCK (JCK) which >> checks for conformance with the specification. For an impl of Java to be >> called "Java" it must the TCK, and so must pass the tests that check for the >> correct functioning of String.hashCode. >> >> -- Jon >> >> >> > > From alexander.veit at gmx.net Mon Jan 4 15:01:34 2010 From: alexander.veit at gmx.net (Alexander Veit) Date: Tue, 5 Jan 2010 00:01:34 +0100 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B426BBA.9020800@sun.com> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com><4B42641E.1050804@sun.com> <4B426BBA.9020800@sun.com> Message-ID: <4B79AAE0B52943F892B1FD81203CE6EA@helium> Hi Jonathan, > The fact is, the spec is set in stone, *and* covered by conformance > tests. The chance of the spec changing is vanishingly small, and the > conformance tests ensure that all implementations of Java must follow > the spec. Calculating String#hashCode() is quite costly in terms of CPU cycles. If a better performing method with comparable quality would come to our knowledge, the chance of changing the spec would probably be greater than e > 0. -- Cheers, Alex From Jonathan.Gibbons at Sun.COM Mon Jan 4 15:28:50 2010 From: Jonathan.Gibbons at Sun.COM (Jonathan Gibbons) Date: Mon, 04 Jan 2010 15:28:50 -0800 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B79AAE0B52943F892B1FD81203CE6EA@helium> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> <4B426BBA.9020800@sun.com> <4B79AAE0B52943F892B1FD81203CE6EA@helium> Message-ID: <4B4279B2.1090206@sun.com> Alexander Veit wrote: > Hi Jonathan, > > >> The fact is, the spec is set in stone, *and* covered by conformance >> tests. 
The chance of the spec changing is vanishingly small, and the >> conformance tests ensure that all implementations of Java must follow >> the spec. >> > > Calculating String#hashCode() is quite costly in terms of CPU cycles. If a > better performing method with comparable quality would come to our > knowledge, the chance of changing the spec would probably be greater than e > >> 0. >> Alex, You are missing the point. The spec doesn't mandate "a jolly good hash function" with enough wiggle room to allow "a jolly better one" if and when we think of one. The spec mandates a specific formula. You can argue the merits either way of whether it was appropriate to put such a specific formula into the spec, but whatever the reason and merits, it has happened. It now has to be assumed that there is Very Important Software Out There that is relying on this behavior, and that the Very Important Owners of Said Software would be right royally pissed off if it were changed. The chance of the spec changing brings death and taxes to mind. -- Jon From Joe.Darcy at Sun.COM Mon Jan 4 15:52:20 2010 From: Joe.Darcy at Sun.COM (Joseph D. Darcy) Date: Mon, 04 Jan 2010 15:52:20 -0800 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B4279B2.1090206@sun.com> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> <4B426BBA.9020800@sun.com> <4B79AAE0B52943F892B1FD81203CE6EA@helium> <4B4279B2.1090206@sun.com> Message-ID: <4B427F34.8010704@sun.com> Jonathan Gibbons wrote: > Alexander Veit wrote: > >> Hi Jonathan, >> >> >> >>> The fact is, the spec is set in stone, *and* covered by conformance >>> tests. The chance of the spec changing is vanishingly small, and the >>> conformance tests ensure that all implementations of Java must follow >>> the spec. >>> >>> >> Calculating String#hashCode() is quite costly in terms of CPU cycles. 
If a >> better performing method with comparable quality would come to our >> knowledge, the chance of changing the spec would probably be greater than e >> >> >>> 0. >>> >>> > > Alex, > > You are missing the point. The spec doesn't mandate "a jolly good hash > function" with enough wiggle room to allow "a jolly better one" if and > when we think of one. The spec mandates a specific formula. And implicit in mandating that formula is mandating that formula going forward for all subsequent releases. Otherwise, there is often not much point in specifying a particular hash function. -Joe From fredrik.ohrstrom at oracle.com Wed Jan 6 01:19:12 2010 From: fredrik.ohrstrom at oracle.com (Fredrik Öhrström) Date: Wed, 06 Jan 2010 10:19:12 +0100 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B427F34.8010704@sun.com> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> <4B426BBA.9020800@sun.com> <4B79AAE0B52943F892B1FD81203CE6EA@helium> <4B4279B2.1090206@sun.com> <4B427F34.8010704@sun.com> Message-ID: <4B445590.3000300@oracle.com> Joseph D. Darcy wrote: > And implicit in mandating that formula is mandating that formula going > forward for all subsequent releases. Otherwise, there is often not > much point in specifying a particular hash function. > > The best would be to defer the choice of how to implement the string switch entirely to the JVM, akin to how the plain switch does it. Obviously this is not an option since string switch belongs to Project Coin, which cannot change the bytecodes. Thus Joe's design is good since it gives the best worst-case behavior for interpreting and non-optimizing JVMs. Our JVM will probably detect the string switch pattern and rip out all the hashcode and everything and replace it with something more optimized when it will give us a performance advantage. Thus the generated string switch bytecode should be as easy to detect and rip out as possible. 
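[Editorial aside: the formula this thread keeps returning to is fixed by the String.hashCode javadoc as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]. The sketch below, with invented class and method names, recomputes that formula independently, checks it against the runtime value, and shows the hash-then-equals shape of the string-switch lowering being discussed. It illustrates the pattern only, not javac's exact output.]

```java
// Sketch only: recompute the specified hash formula and mimic the
// hash-then-equals string-switch lowering. All names are illustrative.
public class SpecHash {

    // h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], written in
    // Horner form; this is the formula the javadoc mandates.
    public static int specHashCode(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Roughly what switch (s) { case "foo": return 1; case "bar": return 2; }
    // lowers to: switch on the compile-time-known hash, then confirm with
    // equals() to rule out collisions.
    public static int classify(String s) {
        switch (s.hashCode()) {
            case 101574: // specHashCode("foo"), baked in at compile time
                if (s.equals("foo")) return 1;
                break;
            case 97299:  // specHashCode("bar")
                if (s.equals("bar")) return 2;
                break;
        }
        return 0;
    }

    public static void main(String[] args) {
        for (String s : new String[] { "", "foo", "bar", "Hello" }) {
            if (specHashCode(s) != s.hashCode()) {
                throw new AssertionError(s); // would break any baked-in hashes
            }
        }
        System.out.println(classify("foo") + " " + classify("bar") + " " + classify("baz"));
    }
}
```

Running `java SpecHash` prints `1 2 0`; the point of the thread is that the `case` constants above are only safe because the spec pins the formula down.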
(Which was the reason for my previous gripe where I wanted a simpler pattern, but I forgot about the low end JVMs.) Computing the hash at runtime would make the pattern worse and gain nothing since the hash is indeed specified in the specification. //Fredrik From Ulf.Zibis at gmx.de Wed Jan 6 11:05:15 2010 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Wed, 06 Jan 2010 20:05:15 +0100 Subject: Benefit from computing String Hash at compile time? In-Reply-To: <4B445590.3000300@oracle.com> References: <560fb5ed0912181504h5033d229uf77827beb519460c@mail.gmail.com> <4B42641E.1050804@sun.com> <4B426BBA.9020800@sun.com> <4B79AAE0B52943F892B1FD81203CE6EA@helium> <4B4279B2.1090206@sun.com> <4B427F34.8010704@sun.com> <4B445590.3000300@oracle.com> Message-ID: <4B44DEEB.4040207@gmx.de> On 06.01.2010 10:19, Fredrik Öhrström wrote: > > Thus the generated string switch bytecode should > be as easy to detect and rip out as possible. (Which was the reason for > my previous gripe where I wanted a simpler pattern, but I forgot about > the low end JVMs.) Computing the hash at runtime would make the pattern > worse and gain nothing since the hash is indeed specified in the > specification. > The computation of the hashes should only be done once on the first run through a string switch statement, even on low end JVMs. After that, each String object caches its own hash, so it can be reused on subsequent runs. Additionally all String constants provide natively computed hashes automatically, as they are always interned, and the equals method could benefit from Bug Id: 6912520, not only for the string switch construct. -Ulf From reinier at zwitserloot.com Thu Jan 7 03:48:09 2010 From: reinier at zwitserloot.com (Reinier Zwitserloot) Date: Thu, 7 Jan 2010 12:48:09 +0100 Subject: JLS bug (unicode escapes)? Message-ID: <560fb5ed1001070348q59017da8wa310fd6829e6659a@mail.gmail.com> Am I reading this: http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.3 correctly? 
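[Editorial aside, not part of the original message: the behavior Reinier goes on to describe — JLS §3.3 defines UnicodeMarker as one or more `u` characters — can be checked directly:]

```java
// Unicode escapes are translated before tokenization (JLS 3.3), and the
// grammar allows any number of u's between the backslash and the digits.
public class UnicodeMarkerDemo {
    public static void main(String[] args) {
        String one = "\u0041";         // the usual escape for 'A'
        String many = "\uuuuuuuu0041"; // extra u's are legal and equivalent
        System.out.println(one.equals(many)); // prints "true"
    }
}
```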
A UnicodeMarker seems to be defined as, in regexp terms: "u+" instead of the expected "u". So, that would mean: \uuuuuuuuuuuuuuuuuuuuuuu0041 will still turn into "A" just like \u0041 would. What on earth is the thinking behind this? Amazingly, I tested this in javac and it actually works: System.out.println("\uuuuuuuu0041"); will print 'A' to stdout. At the very least the descriptive text in chapter 3.3 should highlight this oddity. Even ECJ gets this right. NB: what's the appropriate venue for discussing oddities in the JLS? --Reinier Zwitserloot From pcj at roundroom.net Thu Jan 7 07:15:09 2010 From: pcj at roundroom.net (Peter Jones) Date: Thu, 7 Jan 2010 10:15:09 -0500 Subject: JLS bug (unicode escapes)? In-Reply-To: <560fb5ed1001070348q59017da8wa310fd6829e6659a@mail.gmail.com> References: <560fb5ed1001070348q59017da8wa310fd6829e6659a@mail.gmail.com> Message-ID: On Thu, Jan 7, 2010 at 6:48 AM, Reinier Zwitserloot wrote: > Am I reading this: > > http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.3 > > correctly? > > A UnicodeMarker seems to be defined as, in regexp terms: "u+" instead of the > expected "u". So, that would mean: > > \uuuuuuuuuuuuuuuuuuuuuuu0041 will still turn into "A" just like \u0041 > would. What on earth is the thinking behind this? > > Amazingly, I tested this in javac and it actually works: > System.out.println("\uuuuuuuu0041"); will print 'A' to stdout. At the very > least the descriptive text in chapter 3.3 should highlight this oddity. The descriptive text in section 3.3 does describe the thinking for this, in the two paragraphs starting with, "The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII...". -- Peter
In-Reply-To: References: <560fb5ed1001070348q59017da8wa310fd6829e6659a@mail.gmail.com> Message-ID: <560fb5ed1001071024p79af75b2u62e06ca38b5e2b95@mail.gmail.com> For some reason I didn't scroll down far enough and see the explanation for this. Apologies for bothering all of you with this. --Reinier Zwitserloot On Thu, Jan 7, 2010 at 4:15 PM, Peter Jones wrote: > On Thu, Jan 7, 2010 at 6:48 AM, Reinier Zwitserloot > wrote: > > Am I reading this: > > > > http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.3 > > > > correctly? > > > > A UnicodeMarker seems to be defined as, in regexp terms: "u+" instead of > the > > expected "u". So, that would mean: > > > > \uuuuuuuuuuuuuuuuuuuuuuu0041 will still turn into "A" just like \u0041 > > would. What on earth is the thinking behind this? > > > > Amazingly, I tested this in javac and it actually works: > > System.out.println("\uuuuuuuu0041"); will print 'A' to stdout. At the > very > > least the descriptive text in chapter 3.3 should highlight this oddity. > > The descriptive text in section 3.3 does describe the thinking for > this, in the two paragraphs starting with, "The Java programming > language specifies a standard way of transforming a program written in > Unicode into ASCII...". > > -- Peter > From matthew at matthewadams.me Fri Jan 8 10:57:06 2010 From: matthew at matthewadams.me (Matthew Adams) Date: Fri, 8 Jan 2010 10:57:06 -0800 Subject: FYI: (Incident Review ID: 1684743) Compiler type- & existence-checked reflection syntax sugar via new ".." operator Message-ID: <1ba389ce1001081057j19124be7pc6b7f707a370432e@mail.gmail.com> FYI, I submitted this suggestion to Sun and it just got reviewed and assigned an RFE id. See below. ---------- Forwarded message ---------- From: Sun Microsystems Date: Fri, Jan 8, 2010 at 12:54 AM Subject: Re: (Incident Review ID: 1684743) Compiler type- & existence-checked reflection syntax sugar via new ".." 
operator To: matthew at matthewadams.me --- Note: you can send us updates about your Incident --- --- by replying to this mail. Place new information --- --- above these lines. Do not include attachments. --- --- Our system ignores attachments and anything below --- --- these lines. --- Hi Matthew Adams, Thank you for taking the time to suggest this enhancement to the Java Standard Edition. We have determined that this report is an RFE and has been entered into our internal RFE tracking system under Bug Id: 6915224 You can monitor this RFE on the Java Bug Database at: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6915224 It may take a day or two before the RFE shows up in this external database. If you are a member of the Sun Developer Network (SDN), there are two additional options once the bug is visible. 1. Voting for the RFE Click http://bugs.sun.com/bugdatabase/addVote.do?bug_id=6915224 2. Adding the report to your Bug Watch list. You will receive an email notification when this RFE is updated. Click http://bugs.sun.com/bugdatabase/addBugWatch.do?bug_id=6915224 The Sun Developer Network (http://developers.sun.com) is a free service that Sun offers. To join, visit https://softwarereg.sun.com/registration/developer/en_US/new_user. SDN members can obtain fully licensed Java IDEs for web and enterprise development. More information is at http://developers.sun.com/prodtech/javatools/free/. We greatly appreciate your efforts in identifying areas in the Java Standard Edition where we can improve upon and I would request you to continue doing so. Regards, Nelson ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This message, including any attachments, is for the intended recipient(s) only. If you are not the intended recipient(s), please reply to the sender, delete this message, and refrain from disclosing, copying, or distributing this message. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ --------------- Previous Messages ---------------- --------------------- Report --------------------- category : java subcategory : classes_lang release : 7 type : rfe synopsis : Compiler type- & existence-checked reflection syntax sugar via new ".." operator customer name : Matthew Adams customer mail : matthew at matthewadams.me sdn id : language : en company : Matthew Adams Consulting, Inc. hardware : x86 os : win_xp bug id : 6915224 date created : Sun Dec 27 09:30:10 MST 2009 date evaluated : Fri Jan 08 01:53:22 MST 2010 description : A DESCRIPTION OF THE REQUEST : Add a new operator ".." to allow compile-time type-checked and existence-checked reflection access. JUSTIFICATION : 1. There is no compile-time typesafe way to gain access via reflection to fields, methods, constructors, annotations, etc. 2. The code to obtain aforementioned artifacts can become verbose. EXPECTED VERSUS ACTUAL BEHAVIOR : EXPECTED - See project-coin dev list email: http://mail.openjdk.java.net/pipermail/coin-dev/2009-December/002638.html Copied here for convenience: Proposal: Compile-time type-checked and existence-checked reflection syntax Description: Introduce a new, double-dot operator ".." to act as syntax sugar for accessing reflection information with type & existence checking at compile time. Concept: The double-dot operator, meaning "get metamodel artifact", allows for much more concise reflective access to things you know about at build time but must use reflection for some reason. Trust me, it happens plenty. The choice of ".." for the operator was that first, ".." doesn't introduce a new keyword, and second, in filesystems, ".." usually means "go up a level", which is essentially what we're doing: going up a level from model to metamodel. 
Looking at the examples, you can see how much less code it is compared to the reflection-based equivalent, plus if it's typesafe, you get fewer errors when you're depending on type safety -- that is, at least you knew at compile time that things were all good. It still doesn't mean anything at runtime, and you could get NoSuchMethodException, etc. Examples: 1. Get the Field object for the field named "bar" in class Foo: Field bar = Foo..bar; // current way Field bar = Foo.class.getDeclaredField("bar"); 2. Get the Method object for the method with signature "myMethod(int a, String y)" defined on class Goo: Method m = Goo..myMethod(int,String); // note scope & return type don't matter // current way Method m = Goo.class.getDeclaredMethod("myMethod", new Class[] { int.class, String.class }); 3. Get the Class object for the class Snafu. This is an interesting case that offers backward compatibility: Class c = Snafu..class; // exactly the same as Snafu.class, the ".." operator's inspiration!! 4. Get the @Foo annotation on the Bar class: Annotation foo = Bar..@Foo; // current way Annotation foo = Bar.class.getAnnotation(Foo.class); 5. Get the @Foo annotation on the field named "blah" in the class Gorp: Annotation foo = Gorp..blah..@Foo; // current way Annotation foo = Gorp.class.getDeclaredField("blah").getAnnotation(Foo.class); 6. Get the @Foo annotation on the second parameter of the method "start(int x, @Foo int y, int z)" defined in class Startable: Annotation foo = Startable..start(int,int..@Foo,int); // current way -- no error checking Annotation[] anns = Startable.class.getMethod("start", new Class[] { int.class, int.class, int.class }).getParameterAnnotations()[1]; Annotation foo = null; for (Annotation ann : anns) { if (ann.annotationType().equals(Foo.class)) { foo = ann; break; // got it } } // foo is either null or a reference to the @Foo annotation instance on the second parameter of the method 7. 
Get all of the @Foo annotations on all of the parameters of the methods "start(@Foo int x, int y, @Foo int z)" defined in class Startable: Annotation[] foo = Startable..start(int..@Foo,int..@Foo,int..@Foo); // returns an array with the first @Foo, null, then the last @Foo // current way left as an exercise to the reader :) 8. Get the @Foo annotation on the "@Foo start(int x, int y, int z)" method defined in class Startable: Annotation foo = Startable..start(int,int,int)..@Foo; // current way Annotation foo = Startable.class.getDeclaredMethod("start", new Class[] { int.class, int.class, int.class }).getAnnotation(Foo.class); Motivation: The double-dot operator would allow for compile-time type-checked reflective operations, like those in the persistence APIs. For example, in JPA: @Entity public class Department { @OneToMany(mappedBy = "department") // note string Set<Employee> employees; //... } becomes @Entity public class Department { @OneToMany(mappedBy = Employee..department) // checked at compile time Set<Employee> employees; //... } It also is beneficial in many other areas. Use your imagination! I can't think of many more (it's late), but Criteria queries come to mind... ACTUAL - See project-coin dev list email: http://mail.openjdk.java.net/pipermail/coin-dev/2009-December/002638.html (the report repeats the EXPECTED text above verbatim) ---------- BEGIN SOURCE ---------- No test cases yet. ---------- END SOURCE ---------- CUSTOMER SUBMITTED WORKAROUND : Workaround is conventional reflection-based code. -- mailto:matthew at matthewadams.me skype:matthewadams12 yahoo:matthewadams aol:matthewadams12 google-talk:matthewadams12 at gmail.com msn:matthew at matthewadams.me http://matthewadams.me http://www.linkedin.com/in/matthewadams From Ulf.Zibis at gmx.de Fri Jan 8 11:43:38 2010 From: Ulf.Zibis at gmx.de (Ulf Zibis) Date: Fri, 08 Jan 2010 20:43:38 +0100 Subject: FYI: (Incident Review ID: 1684743) Compiler type- & existence-checked reflection syntax sugar via new ".." 
operator In-Reply-To: <1ba389ce1001081057j19124be7pc6b7f707a370432e@mail.gmail.com> References: <1ba389ce1001081057j19124be7pc6b7f707a370432e@mail.gmail.com> Message-ID: <4B478AEA.1080006@gmx.de> +1 -Ulf On 08.01.2010 19:57, Matthew Adams wrote: > FYI, I submitted this suggestion to Sun and it just got reviewed and > assigned an RFE id. See below. > > > ---------- Forwarded message ---------- > From: Sun Microsystems > Date: Fri, Jan 8, 2010 at 12:54 AM > Subject: Re: (Incident Review ID: 1684743) Compiler type- & > existence-checked reflection syntax sugar via new ".." operator > To: matthew at matthewadams.me > From opinali at gmail.com Sun Jan 24 12:10:33 2010 From: opinali at gmail.com (Osvaldo Pinali Doederlein) Date: Sun, 24 Jan 2010 18:10:33 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5C952F.1000903@optrak.co.uk> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> Message-ID: <4B5CA939.9080909@gmail.com> On 24/01/2010 16:45, Mark Thornton wrote: > Peter Levart wrote: > >>> #()( {1,2,3} ) // Proposed collection literal expression from Coin >>> >>> >> Well, without parentheses the above example shows why the proposed collection literal expression >> syntax is inappropriate. That syntax is reserved for statements - expressions should not mess >> with it. Without mandatory parentheses, this is ambiguous: >> >> #() {} >> >> ...is this an expression lambda returning empty collection or a statement lambda returning void? >> >> > Unfortunately Java already has array initialisation using {}, so the > syntax clearly isn't reserved just for statements. I think that existing > use for array initialisation was one of the reasons for using {} in > collection literals instead of []. > The collection literals proposal is already underwhelming IMHO, because it boils down to sugar over the Collections.unmodifiableXxx() APIs. We 
We don't get to choose the concrete types of constructed collections, we don't get support for modifiable collections (and as much as I like the OO/Functional paradigm, Java ain't there yet - we'll deal with explicit mutable stuff for eons to come). If I want to use the new syntax to initialize, say, a HashMap, I suppose I can write "new HashMap({a: b, c: d})" and cross my fingers that javac will be smart enough to optimize this into a straight initialization of the HashMap, without building some temporary unmodifiable Map and then passing that to the HashMap(Map) constructor. I expect javac to have such optimizations, but even with that, the resulting syntax would be a bit less elegant than ideal. This proposal should consider that, unlike many scripting languages that have always had maps and lists as first-class language entities, Java has the distinct advantage of offering a very rich (and extensible: Apache, Google, user-provided etc.) choice of concrete implementations for these collections. While higher-level langs may contain multiple implementations of some collection and automatically pick or change the optimal impl for each situation (even JavaFX Script does this for its sequences), this results in some tradeoffs, and the Java style is avoiding these and just letting the user make these choices explicitly and manually. It's also very frequent that I want a specific impl because it offers extended APIs or behaviors, e.g. 99% of the massive features from java.util.concurrent collections can't be used without resorting to extended APIs. (Collections are not even completely adherent to the Liskov Substitution Principle.) So, I'd like the collection-literals to allow me to write something like: HashMap{a:b, c:d}, explicitly specifying the concrete type (if I want that); and also: SortedMap{a:b, c:d}, where I don't provide a class but I provide an interface (or N interfaces? 
Serializable would come to mind...), and javac picks some concrete type that implements that interface and that javac believes to be a good choice for whatever reason. A+ Osvaldo From mthornton at optrak.co.uk Sun Jan 24 12:59:03 2010 From: mthornton at optrak.co.uk (Mark Thornton) Date: Sun, 24 Jan 2010 20:59:03 +0000 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5CA939.9080909@gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> Message-ID: <4B5CB497.9010304@optrak.co.uk> Osvaldo Pinali Doederlein wrote: > On 24/01/2010 16:45, Mark Thornton wrote: > >> Peter Levart wrote: >> >> >>>> #()( {1,2,3} ) // Proposed collection literal expression from Coin >>>> >>>> >>>> >>> Well, without parentheses the above example shows why the proposed collection literal expression >>> syntax is inappropriate. That syntax is reserved for statements - expressions should not mess >>> with it. Without mandatory parentheses, this is ambiguous: >>> >>> #() {} >>> >>> ...is this an expression lambda returning empty collection or a statement lambda returning void? >>> >>> >>> >> Unfortunately Java already has array initialisation using {}, so the >> syntax clearly isn't reserved just for statements. I think that existing >> use for array initialisation was one of the reasons for using {} in >> collection literals instead of []. >> >> > > The collection literals proposal is already underwhelming IMHO, because > it boils down to sugar over the Collections.unmodifiableXxx() APIs. We > I partly agree. For lists and sets something like this public class CollectionLiterals { public static <T> List<T> list(T... e) {...} } used with static import is almost as brief. Maps though are more annoying to initialise. 
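[Editor's note: a compilable sketch of the list() helper Mark describes above, with the generics the archive stripped restored; class and method names are taken from his email, the body is an assumed implementation.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CollectionLiterals {
    // Varargs "list literal": with 'import static CollectionLiterals.list;'
    // the call site reads list(1, 2, 3), close to the proposed {1, 2, 3}.
    @SafeVarargs
    public static <T> List<T> list(T... elements) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(elements)));
    }

    public static void main(String[] args) {
        List<Integer> xs = list(1, 2, 3);
        System.out.println(xs); // [1, 2, 3]
    }
}
```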
The best I can manage with existing Java public class CollectionLiterals { public static <K, V> MapBuilder<K, V> mapOf(K key, V value) {...} } public interface MapBuilder<K, V> { MapBuilder<K, V> and(K key, V value); Map<K, V> create(); } Giving mapOf(a,b).and(c,d) ... .create(); However I have managed a StackOverflow in javac with code like this :-(. Now if there was a short way of creating tuples so that public static <K, V> Map<K, V> mapOf(Map.Entry<K, V>... entries) {} could be used like mapOf((a,b), (c,d), ...) preferably without so many parentheses. > concrete implementations for these collections. While higher-level langs > may contain multiple implementations of some collection and > automatically pick or change the optimal impl for each situation (even > Easier to do with functional style where changing the implementation, when a value or mapping is added or removed, presents no problem. Mark From opinali at gmail.com Sun Jan 24 13:53:24 2010 From: opinali at gmail.com (Osvaldo Pinali Doederlein) Date: Sun, 24 Jan 2010 19:53:24 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5CB497.9010304@optrak.co.uk> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> Message-ID: <4B5CC154.20109@gmail.com> On 24/01/2010 18:59, Mark Thornton wrote: > Osvaldo Pinali Doederlein wrote: >> >> The collection literals proposal is already underwhelming IMHO, >> because it boils down to sugar over the Collections.unmodifiableXxx() >> APIs. We > I partly agree. For lists and sets something like this > > public CollectionLiterals { > public static List list(T... e) {...} > } > > used with static import is almost as brief. Maps though are more > annoying to initialise. The best I can manage with existing Java These idioms (like Builder / fluent interfaces) are stuff I won't touch with a 10-foot pole. 
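[Editor's note: the mapOf/MapBuilder idiom Mark sketched, made compilable (generics restored; the builder body is an assumed implementation). It also makes concrete the extra object Osvaldo objects to next: one MapBuilder instance is allocated per literal.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapLiterals {
    public interface MapBuilder<K, V> {
        MapBuilder<K, V> and(K key, V value);
        Map<K, V> create();
    }

    // Fluent map "literal": mapOf(a,b).and(c,d).create().
    // One extra MapBuilder object is allocated per map built.
    public static <K, V> MapBuilder<K, V> mapOf(K key, V value) {
        Map<K, V> m = new LinkedHashMap<>(); // keeps insertion order for printing
        m.put(key, value);
        return new MapBuilder<K, V>() {
            public MapBuilder<K, V> and(K k, V v) { m.put(k, v); return this; }
            public Map<K, V> create() { return m; }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> m = mapOf("a", 1).and("b", 2).and("c", 3).create();
        System.out.println(m); // {a=1, b=2, c=3}
    }
}
```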
You are forced to allocate at least one extra object (a Builder), so there's extra overhead unless you rely on optimizations like escape analysis + scalar replacement. I don't like the language/library design of "let's ignore performance, make a mess, and pray our advanced JIT compiler will clean it up". This often fails, remarkably for client-side code that must run on the less featured VMs like HotSpot Client; startup/warmup time is another issue even for the top JITs. On top of that, even a good fluent interface is pathetic wrt readability if compared to proper language-level syntax. I prefer to completely ignore this technique/trend and just write a full page of add() or put() calls. Tuples could be even worse, because now we're allocating one temp object per entry; now your runtime performance will certainly suffer miserably if these tuples are not optimized out (granted, that's more likely if tuples are added as "lightweight" headerless objects, which is something MLVM people are researching). It can be argued that the performance of literal collections is not very important because such collections are typically very small - people don't populate a million-element Map with literal code, right? This is a good general assumption, but there are important exceptions like machine-generated code (e.g. parsers) which often contains enormous datasets encoded as initialized variables. This reminds me of another RFE that I never lose an opportunity to mention: the classfile format's poor constant pool - you cannot encode a populated array, or an object that allows compile-time construction (e.g. "new Point(0,0)" - constructor only assigns to fields and is called with compile-time literals). Such initializations are supported by VERY bulky bytecode, that's initialized at class-init time or construction time, when ideally they could just be memcpy'd from the constant pool (or even mmapped, with something like the CDS). 
A+ Osvaldo > > public CollectionLiterals { > public static MapBuilder mapOf(K key, V value) {...} > } > > public static interface MapBuilder { > MapBuilder and(K key, V value); > Map create(); > } > > Giving > > mapOf(a,b).and(c,d) ... .create(); > > However I have managed a StackOverflow in javac with code like this :-(. > > Now if there was a short way of creating tuples so that > > public static Map mapOf(Map.Entry ... entries) {} > > could be used like > > mapOf((a,b), (c,d), ...) > > preferably without so many parentheses. > > >> concrete implementations for these collections. While higher-level >> langs may contain multiple implementations of some collection and >> automatically pick or change the optimal impl for each situation (even > Easier to do with functional style where changing the implementation, > when a value or mapping is added or removed, presents no problem. > > > Mark > From reinier at zwitserloot.com Sun Jan 24 14:03:33 2010 From: reinier at zwitserloot.com (Reinier Zwitserloot) Date: Sun, 24 Jan 2010 23:03:33 +0100 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5CC154.20109@gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> Message-ID: <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> You're worried about the performance impact of 1 object allocation that is almost guaranteed to be short-lived and may even be entirely eliminated by the hotspot compiler? Basing language decisions on that kind of backward thinking is a very very bad idea. Your argument also makes literally no sense at all, you should read your own tripe before you post it. You're complaining about the performance of *GENERATED* code such as parsers. WTF? 
If for some reason the current paradigm of creating a new empty array list and adding elements to it one element at a time is measurably faster than using a hypothetical list literal, then generate that code instead of the list literal. Duh. It's a code generator. It's the one place where verbiage and spectacular lack of brevity are utterly irrelevant. I thought basing language decisions on fear of performance impact for a short-lived singular object instantiation was as bad as it was going to get, but you've outdone yourself in the span of a single post: You're now basing decisions of language design on making life easy for code generators. You've missed April 1st by a few months, mate. --Reinier Zwitserloot On Sun, Jan 24, 2010 at 10:53 PM, Osvaldo Pinali Doederlein < opinali at gmail.com> wrote: > Em 24/01/2010 18:59, Mark Thornton escreveu: > Osvaldo Pinali Doederlein wrote: > >> > >> The collection literals proposal is already underwhelming IMHO, > >> because it boils down to sugar over the Collections.unmodifiableXxx() > >> APIs. We > > I partly agree. For lists and sets something like this > > > > public CollectionLiterals { > > public static List list(T... e) {...} > > } > > > > used with static import is almost as brief. Maps though are more > > annoying to initialise. The best I can manage with existing Java > > These idioms (like Builder / fluent interfaces) are stuff I won't touch > with a 10-foot pole. You are forced to allocate at least one extra > object (a Builder), so there's extra overhead unless you rely on > optimizations like escape analysis + scalar replacement. I don't like > the language/library design of "let's ignore performance, make a mess, > and pray our advanced JIT compiler will clean it up". This often fails, > remarkably for client-side code that must run on the less featured VMs > like HotSpot Client; startup/warmup time is another issue even for the > top JITs. 
On top of that, even a good fluent interface is pathetic wrt > readability if compared to proper language-level syntax. I prefer to > completely ignore this technique/trend and just write a full page of > add() or put() calls. > > Tuples could be even worse, because now we're allocating one temp object > per entry; now your runtime performance will certainly suffer miserably > if these tuples are not optimized out (granted, that's more likely if > tuples are added as "lightweight" headerless objects, which is something > MLVM people are researching). > > It can be argued that the performance of literal collections is not very > important because such collections are typically very small - people > don't populate a million-element Map with literal code, right? This is a > good general assumption, but there are important exceptions like > machine-generated code (e.g. parsers) which often contains enormous > datasets encoded as initialized variables. This remembers me of another > RFE that I never lose an opportunity to remember: the classfile format's > poor constant pool - you cannot encode a populated array, or an object > that allows compile-time construction (e.g. "new Point(0,0)" - > constructor only assigns to fields and is called with compile-time > literals). Such initializations are supported by VERY bulky bytecode, > that's initialized at class-init time or construction time, when ideally > they could just be memcpy'd from the constant pool (or even mmapped, > with something like the CDS). > > A+ > Osvaldo > > > > > public CollectionLiterals { > > public static MapBuilder mapOf(K key, V value) {...} > > } > > > > public static interface MapBuilder { > > MapBuilder and(K key, V value); > > Map create(); > > } > > > > Giving > > > > mapOf(a,b).and(c,d) ... .create(); > > > > However I have managed a StackOverflow in javac with code like this :-(. > > > > Now if there was a short way of creating tuples so that > > > > public static Map mapOf(Map.Entry ... 
entries) {} > > > > could be used like > > > > mapOf((a,b), (c,d), ...) > > > > preferably without so many parentheses. > > > > > >> concrete implementations for these collections. While higher-level > >> langs may contain multiple implementations of some collection and > >> automatically pick or change the optimal impl for each situation (even > > Easier to do with functional style where changing the implementation, > > when a value or mapping is added or removed, presents no problem. > > > > > > Mark > > > > > From per at bothner.com Sun Jan 24 14:27:29 2010 From: per at bothner.com (Per Bothner) Date: Sun, 24 Jan 2010 14:27:29 -0800 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5CC154.20109@gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> Message-ID: <4B5CC951.1050801@bothner.com> On 01/24/2010 01:53 PM, Osvaldo Pinali Doederlein wrote: > It can be argued that the performance of literal collections is not very > important because such collections are typically very small - people > don't populate a million-element Map with literal code, right? This is a > good general assumption, but there are important exceptions like > machine-generated code (e.g. parsers) which often contains enormous > datasets encoded as initialized variables. Luckily, the performance of large literals isn't a problem on the Java platform, because you can't write/generate large literals, thanks to the limitations of the class file format. 
:-( -- --Per Bothner per at bothner.com http://per.bothner.com/ From opinali at gmail.com Mon Jan 25 05:13:26 2010 From: opinali at gmail.com (Osvaldo Doederlein) Date: Mon, 25 Jan 2010 11:13:26 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> Message-ID: Hi, 2010/1/24 Reinier Zwitserloot > You're worried about the performance impact of 1 object allocation that is > almost guaranteed to be short-lived and may even be entirely eliminated by > the hotspot compiler? > It's not just "1 object allocation". It may be a dozen allocations if I'm building a tree of objects, each node requiring its own builder. It may be a few thousand, if I'm doing that inside some method that happens to be called from a loop. These *gratuitous* inefficiencies always find a way to bite you in the butt. The result is often a balkanization of the language, as performance-critical code avoids some/all higher-level features. (I'm not talking extreme/niche things, like game engines that preallocate all objects at startup. I'm talking much more mundane stuff, like XML parsers that partially avoid/duplicate APIs like java.lang.String.) > Basing language decisions on that kind of backward thinking is a very very > bad idea. > Your argument also makes literally no sense at all, you should read your > own tripe before you post it. You're complaining about the performance of > *GENERATED* code such as parsers. WTF? > That was just one easy example, granted not a good one - right now, good parser builders avoid even the overhead of array initialization, with hideous tricks like encoding numbers into strings (which can be stuffed in the constant pool). 
> I thought basing language decisions on fear of performance impact for a > short-lived singular object instantiation was as worse as it was going to > get, but you've outdone yourself in the span of a single post: You're now > basing decisions of language design on making life easy for code generators. > You've missed April 1st by a few months, mate. > And you are missing much more, if your POV is just ignoring such low-level efficiency issues if they make things at all hard. See, I'm NOT proposing to twist the language design around such things - if a given, very useful feature really demands extra allocation, or more complex method dispatch or whatever, so be it. But, when the design space allows some choices that impact performance, we should obviously consider this factor as something important. Case in point: Java5's enhanced-for, which ALWAYS uses an Iterator object, even for collections like ArrayList (@see RandomAccess interface; it was created WITH THAT PURPOSE btw). So I'd expect javac to generate code that uses size() and get(int) for Iterable objects whose static type includes List&RandomAccess; but it doesn't. I have complained about this on a couple of occasions and I'm still waiting for a justification. Unless somebody provides me such justification (perhaps there's one - I just don't know it), this is plain wrong. Performance-critical code continues to be written in a lower-level style, dodging facilities like enhanced-for because programmers learn that javac generates unnecessarily inefficient code. And don't come at me with cheap talk of "may be eliminated by hotspot" (EA / scalar replacement). This optimization was not even on the horizon many years ago, when JDK 5 was released. It's not available yet - the first Sun HotSpot release to do this will be the one shipping with JDK 7, and even then, this will be a Server-only optimization. We won't have such goodness in Client VMs for some years more. 
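[Editor's note: the enhanced-for point can be made concrete; a sketch contrasting what javac emits with the indexed form Osvaldo proposes. Class and method names are hypothetical; note the indexed variant requires the static type to include RandomAccess, expressed here with an intersection bound.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.RandomAccess;

public class ForEachDesugar {
    // What the enhanced-for loop always desugars to: an Iterator is
    // obtained from list.iterator() regardless of the list's type.
    static long sumViaIterator(List<Integer> list) {
        long sum = 0;
        for (int x : list) {
            sum += x;
        }
        return sum;
    }

    // The form Osvaldo argues javac could emit when the static type
    // includes List & RandomAccess: no Iterator, just size()/get(int).
    static <L extends List<Integer> & RandomAccess> long sumIndexed(L list) {
        long sum = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(sumViaIterator(list)); // 10
        System.out.println(sumIndexed(list));     // 10
    }
}
```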
We won't have it in the more constrained JavaME runtimes for many years more. We don't want bogus allocations in RTSJ VMs either. And I'm not interested in how fast my code may run circa 2015. And btw, another technology that's yet far from the silver-bullet level is GC. Allocating objects as if it costs nothing often costs a lot due to cache pollution alone - even with perfect GC behavior. This trend is only getting worse as computing platforms evolve to have increasingly deeper memory hierarchies. More sophisticated GCs usually have some tradeoff - read barriers, larger heap sizes to get the same job done, etc. You accuse me of "backward thinking", but these lists are discussing a (pretty minor and conservative - even w/ current lambdas proposal) update of the Java programming language. You see, Java is not Groovy or Scala or Ruby or Clojure or even JavaFX Script. Java is a language that was designed back in 1995, with some important performance tradeoffs - like primitive types, simple vtable-based dispatch etc. - and like it or not, these tradeoffs did have a BIG share in the success of the Java language and the Java platform - the competition at the time was C/C++ and MFC, not Ruby on Rails. Today, the ability of Java to double as a systems programming language is still critical, because people are building EVERYTHING in Java, not just application-level code. We hack protocol stacks, media codecs, web servers, crypto libraries, imaging, every kind of middleware, and tons of similar stuff written in Pure Java code. Just look at the sources of your typical JavaEE server, it contains all these things - and it works like a charm (at least in the server space, where the startup and memory overheads of all this "pureness" are not a big issue). High-level features are, indeed, often a good opportunity to get more performance, and not less. 
This is possible when the source compiler is able to exploit extra semantic knowledge, and perform sophisticated transformations in the code. For example, a JavaFX Script "for" loop (whose body has a non-Void result) is conceptually a generator, which always returns a sequence with all values produced in each iteration. But in practice, the compiler only does this when necessary. If the for's value is not assigned or used anywhere, that sequence is not generated at all. If the for's value is inserted in another sequence that was being built in the outer scope, the compiler generates code that just adds each generated element directly to the outer sequence. There are also other optimizations to use special sequences for unboxed primitive types, or temporary mutable sequences for bursts of updates in conceptually-immutable sequences. All these optimizations are possible and transparent, because the language makes their effects opaque to the programmer. This is nothing new, many languages (notably functional langs) have been using such tricks for decades. I would like to see the Java language, and the javac compiler, evolving to enable such optimizations in our collections - that would blend wonderfully with frameworks that make intense use of lambdas, because the app code is often a big stream of temporary collections and function objects (i.e. get this List, filter it with some lambda X to produce another temp List, then apply another lambda Y that picks/produces a key for each object to get a Map, etc.); now, first-class collection support may be necessary to produce optimal code (e.g., perform a single loop over the initial List, applying the inlined/combined code from X and Y and populating the final Map, etc.). If we don't do that eventually, Java will soon be losing benchmarks to the likes of Clojure - which doesn't make any sense, as we pay - and must continue to pay forever - the significant cost of Java's lower-level designs. 
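[Editor's note: the loop fusion described above - collapsing a filter-then-index pipeline into one pass - can be sketched by hand. All names here are hypothetical illustrations, not an API from the thread.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

public class PipelineFusion {
    // Naive pipeline: the filter stage materializes a temporary List
    // before the indexing stage builds the Map.
    static <T, K> Map<K, T> indexFiltered(List<T> src, Predicate<T> x, Function<T, K> y) {
        List<T> filtered = new ArrayList<>();          // temp collection
        for (T t : src) if (x.test(t)) filtered.add(t);
        Map<K, T> byKey = new HashMap<>();
        for (T t : filtered) byKey.put(y.apply(t), t);
        return byKey;
    }

    // Fused form: one loop, no temporary List - the transformation a
    // compiler could apply because X and Y are opaque to the caller.
    static <T, K> Map<K, T> indexFilteredFused(List<T> src, Predicate<T> x, Function<T, K> y) {
        Map<K, T> byKey = new HashMap<>();
        for (T t : src) if (x.test(t)) byKey.put(y.apply(t), t);
        return byKey;
    }

    public static void main(String[] args) {
        List<String> words = List.of("ant", "bee", "cat", "dog");
        Predicate<String> x = w -> w.charAt(0) <= 'c';      // lambda X: filter
        Function<String, Character> y = w -> w.charAt(0);   // lambda Y: key
        // Both forms produce the same Map; only the allocations differ.
        System.out.println(indexFiltered(words, x, y).equals(indexFilteredFused(words, x, y)));
        System.out.println(indexFilteredFused(words, x, y).get('b'));
    }
}
```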
The lambdas proposal is another fine example of this kind of opportunity. Lambdas have a great potential to be FASTER than the existing, lower-level feature they replace (inner classes), basically for two reasons: 1) The compiler will have the choice of not allocating any object, and producing less code bloat than inner classes, remarkably because the lambda syntax doesn't carry the dreaded "new" keyword. If a lambda doesn't capture any enclosing state, it can be allocated statically (singleton). If a lambda is simple enough and invoked locally (in the same method where it's defined) or invoked from a method that's trivial to inline (like a static helper method / control abstraction), the lambda itself can be inlined. 2) Lambdas can be implemented with MethodHandles (see Rémi's "Christmas Gift"), which potentially produces faster and less bulky code, and makes transformations like currying "accelerated" by the JIT support for MethodHandle. Now, these optimizations are only possible if the spec is written with sufficient care not to make them dangerous, e.g. by exposing to the application code some implementation detail that would change if certain optimizations are applied (and I agree that optimizations in general should be non-normative; their existence should not cause portability or linking/RRBC compatibility issues). I realize that these ideas go against the traditional javac stance of not doing even the easiest optimizations, but maybe it's time to revisit this decision. It was OK in Java 1.0 when the language was close to a 1-to-1 mapping to the bytecode spec, so there wasn't much opportunity for optimization (other than trivial "bytecode quickening" opts). 
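[Editor's note: point 1, the static-singleton treatment of non-capturing lambdas, can be observed directly in the Java that eventually shipped. This is a sketch of observed OpenJDK/HotSpot behavior, not a spec guarantee - the JLS deliberately leaves lambda object identity open.]

```java
import java.util.function.IntUnaryOperator;

public class LambdaAllocation {
    static IntUnaryOperator nonCapturing() { return x -> x + 1; }   // captures nothing
    static IntUnaryOperator capturing(int d) { return x -> x + d; } // captures d

    public static void main(String[] args) {
        // HotSpot's translation hands back one cached instance for a
        // non-capturing lambda - the "singleton" Osvaldo describes.
        System.out.println(nonCapturing() == nonCapturing());
        // A capturing lambda needs a fresh object per evaluation.
        System.out.println(capturing(1) == capturing(1));
        System.out.println(nonCapturing().applyAsInt(41));
    }
}
```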
But the language is changing; we are adding such features as lambdas and special syntax for collections, features that enable important optimizations (sometimes producing MASSIVE performance gains) which are possible and even relatively easy to do at source compilation time - but very difficult and expensive to do at JIT compilation time (remarkably if the JIT must first pattern-match a very convoluted, non-standard, javac-generated bytecode). Even if the Sun javac team has no resources/time to perform said optimizations in time for JDK7-fcs, it's good enough to have a spec that makes them possible; then we can wait for 7uXX updates, or competing compilers like ECJ, or special "bytecode optimization" programs like ProGuard. A+ Osvaldo > > --Reinier Zwitserloot > > > > On Sun, Jan 24, 2010 at 10:53 PM, Osvaldo Pinali Doederlein < > opinali at gmail.com> wrote: > >> Em 24/01/2010 18:59, Mark Thornton escreveu: >> > Osvaldo Pinali Doederlein wrote: >> >> >> >> The collection literals proposal is already underwhelming IMHO, >> >> because it boils down to sugar over the Collections.unmodifiableXxx() >> >> APIs. We >> > I partly agree. For lists and sets something like this >> > >> > public CollectionLiterals { >> > public static List list(T... e) {...} >> > } >> > >> > used with static import is almost as brief. Maps though are more >> > annoying to initialise. The best I can manage with existing Java >> >> These idioms (like Builder / fluent interfaces) are stuff I won't touch >> with a 10-foot pole. You are forced to allocate at least one extra >> object (a Builder), so there's extra overhead unless you rely on >> optimizations like escape analysis + scalar replacement. I don't like >> the language/library design of "let's ignore performance, make a mess, >> and pray our advanced JIT compiler will clean it up". 
This often fails, >> remarkably for client-side code that must run on the less featured VMs >> like HotSpot Client; startup/warmup time is another issue even for the >> top JITs. On top of that, even a good fluent interface is pathetic wrt >> readability if compared to proper language-level syntax. I prefer to >> completely ignore this technique/trend and just write a full page of >> add() or put() calls. >> >> Tuples could be even worse, because now we're allocating one temp object >> per entry; now your runtime performance will certainly suffer miserably >> if these tuples are not optimized out (granted, that's more likely if >> tuples are added as "lightweight" headerless objects, which is something >> MLVM people are researching). >> >> It can be argued that the performance of literal collections is not very >> important because such collections are typically very small - people >> don't populate a million-element Map with literal code, right? This is a >> good general assumption, but there are important exceptions like >> machine-generated code (e.g. parsers) which often contains enormous >> datasets encoded as initialized variables. This remembers me of another >> RFE that I never lose an opportunity to remember: the classfile format's >> poor constant pool - you cannot encode a populated array, or an object >> that allows compile-time construction (e.g. "new Point(0,0)" - >> constructor only assigns to fields and is called with compile-time >> literals). Such initializations are supported by VERY bulky bytecode, >> that's initialized at class-init time or construction time, when ideally >> they could just be memcpy'd from the constant pool (or even mmapped, >> with something like the CDS). 
>> >> A+ >> Osvaldo >> >> > >> > public CollectionLiterals { >> > public static MapBuilder mapOf(K key, V value) {...} >> > } >> > >> > public static interface MapBuilder { >> > MapBuilder and(K key, V value); >> > Map create(); >> > } >> > >> > Giving >> > >> > mapOf(a,b).and(c,d) ... .create(); >> > >> > However I have managed a StackOverflow in javac with code like this :-(. >> > >> > Now if there was a short way of creating tuples so that >> > >> > public static Map mapOf(Map.Entry ... entries) {} >> > >> > could be used like >> > >> > mapOf((a,b), (c,d), ...) >> > >> > preferably without so many parentheses. >> > >> > >> >> concrete implementations for these collections. While higher-level >> >> langs may contain multiple implementations of some collection and >> >> automatically pick or change the optimal impl for each situation (even >> > Easier to do with functional style where changing the implementation, >> > when a value or mapping is added or removed, presents no problem. >> > >> > >> > Mark >> > >> >> >> > From opinali at gmail.com Mon Jan 25 06:19:08 2010 From: opinali at gmail.com (Osvaldo Doederlein) Date: Mon, 25 Jan 2010 12:19:08 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <4B5CC951.1050801@bothner.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <4B5CC951.1050801@bothner.com> Message-ID: 2010/1/24 Per Bothner > On 01/24/2010 01:53 PM, Osvaldo Pinali Doederlein wrote: > > It can be argued that the performance of literal collections is not very > > important because such collections are typically very small - people > > don't populate a million-element Map with literal code, right? This is a > > good general assumption, but there are important exceptions like > > machine-generated code (e.g. 
parsers) which often contains enormous > > datasets encoded as initialized variables. > > Luckily, the performance of large literals isn't a problem on the Java > platform, because you can't write/generate large literals, thanks to the > limitations of the class file format. > > :-( > > It's a problem at least for those literals that are as large as the current format allows... I remember noticing some JDK7 commits that optimize the loading time of certain APIs (Unicode encoders?), by refactoring initialization of large static datasets into resource files or strings. Even with some extra parsing/decoding effort, the result was faster loading. Unfortunately Java always suffered from binary formats that were designed without any concern for loading time or sharing among several processes. The ZIP envelope, even without compression, is as bad as you can get to organize a bunch of related classes. The classfile format is justified by portability, verification etc., and it's generally OK but it could be better; besides a better constant pool it should support multiple classes in the same file, this would buy us big reduction in JAR files (remarkably much less redundancy in constant pool entries), and allow optimized linkage (no symbol resolution) between classes of that same file. JAR should be replaced by a good binary format that's optimally designed for quick location of all objects (without extra cruft in manifest files), with standard unified data/code/linking/debug-info sections like native formats, etc. The Pack200 format mostly removes the massive redundancy of JAR files; with a better classfile format, we could approach Pack200's efficiency with just tgz compression for downloadable JARs - and, for installed JARs, have significantly smaller files without any compression or other tricks. 
If you look at the I/O patterns of Java cold-startup, with utilities like Windows Sysinternals' Process Monitor or Solaris's DTrace, it's just sad: the VM performs a huge number of tiny reads - 30 bytes here, 40 bytes there, thousands upon thousands of times. The CDS covers roughly half of the core libraries, but not app code, frameworks, containers, etc. Java applets (with or without JavaFX) are not yet sufficiently competitive in loading time; Flash, and even the more similar Silverlight, are still noticeably better, even after all the improvements from 6u10-6u18. Sun is working very hard to fix this problem, which is critical to their plans with JavaFX and JavaStore. JDK7 with Jigsaw will make another (hopefully big) leap forward, with a much better format for deployed modules, perhaps even ahead-of-time compilation (JIT caching). But in other words, they are paying a heavy price -- years of engineering effort since the initial 6uN project, and the risk of missing narrow time-to-market windows -- to undo the mistake made years back, when they didn't consider it important to design a robust, optimized deployment format (.NET has included this in its Assemblies design since the first version). So far the only public info is for the (deployable) module files, no info yet about installed formats (this doesn't really need a public spec because it's implementation-specific just like CDS - but it will suck if important platforms, say MacOSX, don't get it ported, or develop something similar). And no sign of enhancements to the classfiles that still live inside modules (at least in the deployable form), so that seems like yet another missed opportunity (I know, I know, too many RFEs too little time/resources for 7fcs...). 
A+ Osvaldo From reinier at zwitserloot.com Mon Jan 25 06:40:41 2010 From: reinier at zwitserloot.com (Reinier Zwitserloot) Date: Mon, 25 Jan 2010 15:40:41 +0100 Subject: Project Lambda: Java Language Specification draft In-Reply-To: References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> Message-ID: <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> inline. On Mon, Jan 25, 2010 at 2:13 PM, Osvaldo Doederlein wrote: These *gratuitous* inefficiencies always find a way to bite you in the butt. > Ridiculous hyperbole. The only performance issues I've ever run into were solved by making much higher level optimizations, such as improving the performance of a tight loop someplace. In fact, I have never run into a situation where performance got nickel-and-dimed to death, nor have I met anybody where this is the case. I'm not denying that it could _ever_ happen, but you are literally saying that nickel-and-dime performance issues *ALWAYS* occur, where in fact it's more likely to be a 1-in-50,000-programs occurrence. > The result is often a balkanization of the language, as > performance-critical code avoids some/all higher-level features. > The fact that some stupid tools write crappy micro-optimized code is not proof that micro-optimization is a good idea or has any measurable effect. Case in point: The code for ecj is a complete and utter dog - an unmaintainable trainwreck. It's littered with use of char arrays instead of strings, and they even mix generated code with handwritten code in a single source file just to serve the micro-optimization god. javac, on the other hand, is almost laughably non-micro-optimized. They even use a conslist (immutable lists defined as having a head element along with a tail list containing all other elements. 
Appending something to the end of such a thing costs a hefty O(n), needing to make a new object for each element in the entire list) - which is not something hotspot optimizes well. And yet, javac is doing about as well as ecj, speed-wise. Netbeans even uses javac, with no speedups in the parser code, as-you-type. So, I pass the onus of proof back to you, Osvaldo. I hereby claim that micro-optimizations aren't worth it until proven otherwise. See, I'm NOT proposing to twist the language design around such things > and yet in the next paragraph you propose considerably complicating the code generated by the foreach loop depending on the compile-time type of the iterable expression. This would introduce a bunch of new puzzlers and potential for stuff to break when folks update to new libraries. In other words, you *ARE* proposing to twist the language design around such things. > I'm still waiting for a justification. > Well, now you know why. There's also the issue of ConcurrentModificationException, which is much, much more difficult to track when there isn't an Iterator object involved, if you needed another reason. Trying to accommodate your micro-optimizations here would have complicated everything. > And don't come with cheap talking of "may be eliminated by hotspot" (EA / > scalar replacement). > generational garbage collection and hotspot isn't cheap talk. Just compare java 1.0 with java 1.6. > You see, Java is not Groovy or Scala or Ruby or Clojure or even JavaFX > Script. > How are these pacifisms helping the discussion forward? No, of course not. What's your point? 
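The cons-list shape described above can be sketched like this (names are invented for illustration; this is only similar in spirit to javac's internal list class, not its actual code). Prepending is O(1) because it just wraps the existing list; appending is O(n) because every node before the tail must be rebuilt:

```java
// Immutable cons list: a head element plus a tail list holding the rest.
final class ConsList<T> {
    static final ConsList<?> NIL = new ConsList<>(null, null);
    final T head;
    final ConsList<T> tail;

    private ConsList(T head, ConsList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    @SuppressWarnings("unchecked")
    static <T> ConsList<T> nil() { return (ConsList<T>) NIL; }

    // O(1): sharing the existing list as the new tail.
    ConsList<T> prepend(T x) { return new ConsList<>(x, this); }

    // O(n): must allocate a fresh node for every existing element.
    ConsList<T> append(T x) {
        return this == NIL ? ConsList.<T>nil().prepend(x)
                           : tail.append(x).prepend(head);
    }

    int size() { return this == NIL ? 0 : 1 + tail.size(); }
}
```

This is exactly the structure where hotspot gets no help: each append is a fresh allocation chain rather than an in-place array write.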
--Reinier Zwitserloot From mthornton at optrak.co.uk Mon Jan 25 06:47:20 2010 From: mthornton at optrak.co.uk (Mark Thornton) Date: Mon, 25 Jan 2010 14:47:20 +0000 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> Message-ID: <4B5DAEF8.10806@optrak.co.uk> Reinier Zwitserloot wrote: > inline. > > On Mon, Jan 25, 2010 at 2:13 PM, Osvaldo Doederlein wrote: > > These *gratuitous* inefficiencies always find a way to bite you in the butt. > > > Ridiculous hyperbole. The only performance issues I've ever run into were > solved by making much higher level optimizations, such as improving the > performance of a tight loop someplace. In fact, I have never run into a > situation where performance got nickel-and-dimed to death, nor have I met > anybody where this is the case. I'm not denying that it could _ever_ happen, > The one place where lots of little inefficiencies really hurt is startup time. Admittedly many Java users don't care much about startup time (if they did they probably wouldn't be using Java ...). 
Mark Thornton From opinali at gmail.com Mon Jan 25 10:29:56 2010 From: opinali at gmail.com (Osvaldo Doederlein) Date: Mon, 25 Jan 2010 16:29:56 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> Message-ID: 2010/1/25 Reinier Zwitserloot > On Mon, Jan 25, 2010 at 2:13 PM, Osvaldo Doederlein wrote: > >> These *gratuitous* inefficiencies always find a way to bite you in the >> butt. >> > > Ridiculous hyperbole. The only performance issues I've ever run into were > solved by making much higher level optimizations, such as improving the > performance of a tight loop someplace. In fact, I have never run into a > situation where performance got nickel-and-dimed to death, nor have I met > anybody where this is the case. I'm not denying that it could _ever_ happen, > but you are literally saying that nickel-and-dime performance issues > *ALWAYS* occur, where in fact it's more likely to be a 1-in-50,000-programs > occurrence. > I agree with you -- on the surface. Yes, I don't often see a program that suffers a big cost because it's allocating a couple of extra objects, or invoking a virtual method, in a specific site. But this is part of the problem: most decently-written code shows a flat profile (no hotspots) that's apparently a dead-end for optimization efforts (barring architectural changes). But reality may be slightly different: the program may contain a huge number of small inefficiencies, and these add up. 
One easy, non-Java example is a dynamically typed language: its advocates may claim that each dispatch costs only a few nanoseconds and it's ridiculous to complain about that. But a real-world program will pay this overhead a million times per second, plus those dynamic calls block important optimizations like inlining - and the result is the often pathetic performance of languages like Ruby. For Java, I don't need to go much further than the Swing toolkit, widely recognized as a feat of OO-overengineering. They designed it apparently with the assumption that polymorphism, stack frames, code size, extra indirection of some design patterns, etc. are all "nickel-and-dime issues" that wouldn't matter. But they did matter, because the library is huge and it accumulates hundreds of such tiny overheads in a single operation. (There were of course other issues like insufficient Java2D acceleration before 6u1x, but it's not perfect yet; SWT still beats it easily - that's partially apples-and-oranges, but that's just a random, easy example.) > > >> The result is often a balkanization of the language, as >> performance-critical code avoids some/all higher-level features. >> > > The fact that some stupid tools write crappy micro-optimized code is not > proof that micro-optimization is a good idea or has any measurable effect. > Case in point: The code for ecj is a complete and utter dog - an > unmaintainable trainwreck. It's littered with use of char arrays instead of > strings, and they even mix generated code with handwritten code in a single > source file just to serve the micro-optimization god. javac on the other > hand, is almost laughably non-micro-optimized. They even use a conslist > (immutable lists defined as having a head element along with a tail list > containing all other elements. 
Appending something to the end of such a > thing costs a hefty O(n), needing to make a new object for each element in > the entire list) - which is not something hotspot optimizes well. > I am aware of this reputation of ECJ's impl - Eclipse is my main IDE, I have reported a few ECJ compilation bugs myself, notably when they were catching up to Java5. But, god, it's fast... AND it always has been, even with JVMs from 2001. I know javac is now significantly improved (and "embeddable" / IDE-friendly); my NetBeans compiles pretty fast too, indeed both compilers are now I/O-constrained on any recent machine/JVM, so this is no longer a good case study. > And yet, javac is doing about as well as ecj, speed-wise. Netbeans even > uses javac, with no speedups in the parser code, as-you-type. So, I pass the > onus of proof back to you, Osvaldo. I hereby claim that micro-optimizations > aren't worth it until proven otherwise. > You invite me to a losing "proof" game - I'm not deeply informed about the implementation of specific/open/well-known software, as I devote no time to FOSS projects, except occasional bug reports and discussions. The many items I know as a user or by hearsay would require checking sources, an effort I won't invest my time in just to win a pissing contest. But I can mention MANY cases from my real projects - just all proprietary, so you must take my word for it. For example, I have a large J2EE1.4 app that scales up to ~150 complex transactions per second (significant business code plus database, JMS, etc.) on a 2-node appserver cluster, with full clustering and XA consistency. The app is distributed via EJB, passing each transaction's state (up to a few dozen KB) through multiple EARs. I measured the overhead of these dispatches to be very significant. So I've had to optimize the serialization of many classes from the transaction state, with some read/writeObject methods. 
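The kind of hand-rolled read/writeObject optimization described above can be sketched as follows (the class and its fields are invented for illustration; this is not the actual application's code). Marking the fields transient and writing them compactly avoids the per-field metadata of default reflective serialization:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TxState implements Serializable {
    private static final long serialVersionUID = 1L;
    // transient: serialized by hand below, not by the default mechanism
    transient String customerId;
    transient int[] amounts;

    TxState(String customerId, int[] amounts) {
        this.customerId = customerId;
        this.amounts = amounts;
    }

    // Compact custom encoding: no field names, no per-field type info.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.writeUTF(customerId);
        out.writeInt(amounts.length);
        for (int a : amounts) out.writeInt(a);
    }

    private void readObject(ObjectInputStream in) throws IOException {
        customerId = in.readUTF();
        int n = in.readInt();
        amounts = new int[n];
        for (int i = 0; i < n; i++) amounts[i] = in.readInt();
    }
}
```

For state that crosses an EJB dispatch thousands of times per second, shaving the default-serialization overhead off each hop is exactly the kind of "nickel" that adds up.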
I also reimplemented the ByteArrayInputStream and ByteArrayOutputStream, because these were used heavily and such dirty tricks as eliminating synchronization and adding a method to return the internal byte[] buffer (no defensive copying) proved to deliver significant gains, in either CPU time or global GC overhead. (Granted, I had to support the old IBM JDK 1.4.2 from WebSphere 6.0 - its GC for one thing is miserable by today's standards; relying on a more modern JVM could have avoided the need for *some* low-level optimizations.) > > See, I'm NOT proposing to twist the language design around such things >> > > and yet in the next paragraph you propose considerably complicating the > code generated by the foreach loop depending on the compile-time type of the > iterable expression. This would introduce a bunch of new puzzlers and > potential for stuff to break when folks update to new libraries. In other > words, you *ARE* proposing to twist the language design around such things. > I'm not doing that. I am proposing an easy optimization that generates code not any more complex than the code with Iterator. There aren't any new puzzlers or compatibility issues (see below). There are refactoring tools, like the Eclipse IDE, that can automatically convert standard for loops into enhanced-for (when possible, e.g. no funny tricks with the index variable) and these are safe refactorings (no risk of changing program behavior); indeed I have applied these refactorings on large codebases as part of the effort to migrate to Java5, without any issues. Here the onus of proof is yours. > I'm still waiting for a justification. >> > > Well, now you know why. There's also the issue of > ConcurrentModificationException which is much, much more difficult to track > when there isn't an Iterator object involved, if you needed another reason. > Trying to accommodate your micro-optimizations here would have complicated > everything. 
> No, you didn't provide any argument here, you just cried that the proposed javac optimization could be difficult (obviously wrong) or introduce some problem. ConcurrentModificationException *should* not be a problem; unfortunately the JLS specifies enhanced-for explicitly in terms of a straight desugaring to Iterator for any Iterable (14.14.2), so you can build twisted testcases, including code that relies on CME being thrown, or code that relies on user-provided collections whose hasNext()/next() or size()/get(int) methods have side effects. These would be just backwards-compatibility issues, because the spec wasn't good in the first place. For another thing, the same syntax applies to primitive arrays but using plain indexing (it's just not very bad because arrays cannot change structurally, so the code is exception-free). The JLS could just have a third case for Iterables that implement RandomAccess; then these would be implicitly free from CME (but could throw IOOBE, or iterate the same objects repeatedly etc., instead, if the collection is structurally changed after the loop starts). Yeah, that would add yet another special case to learn (if you care for precise behavior), but the current spec is already counter-intuitive because Java developers quickly learn that indexed iteration of ArrayList and friends is better than using an Iterator (and won't ever throw CME!); then comes enhanced-for and breaks this intuition. > And don't come with cheap talking of "may be eliminated by hotspot" (EA / >> scalar replacement). >> > > generational garbage collection and hotspot isn't cheap talk. Just compare > java 1.0 with java 1.6. > Done that; see for example a couple of bug reports that I filed recently for the upcoming G1 collector. I deal with and study this stuff all the time, both for fun and for direct professional need. So, I'm aware of the many issues that we still have with GC. This doesn't mean that GC didn't get incredibly better in the last 15 years. 
I'm on top of the latest enhancements from JikesRVM, Azul Systems, IBM, Sun, everybody who publishes their research. But the challenges are "upgraded" every year, too. > You see, Java is not Groovy or Scala or Ruby or Clojure or even JavaFX >> Script. >> > > How are these pacifisms helping the discussion forward? No, of course not. > What's your point? > In this case the point should be obvious - don't treat Java as a higher-level language like Ruby, for which something like an extra useless object allocation is arguably much less significant. (A very gross Brazilian joke goes like this: "What is a small fart, if you've already crapped your pants".) Java is an important application development language but it's also a relatively low-level one for that role (by today's standards), and it's an increasingly important systems programming language. We cannot design the next release of Java ignoring all its important usages and focusing only on people who write next year's JavaEE6/SOA applications. (And I say that as a developer who spends >90% of his time with these kinds of apps.) A+ Osvaldo From reinier at zwitserloot.com Thu Jan 28 10:55:39 2010 From: reinier at zwitserloot.com (Reinier Zwitserloot) Date: Thu, 28 Jan 2010 19:55:39 +0100 Subject: Project Lambda: Java Language Specification draft In-Reply-To: References: <4B5A2CD5.3000107@sun.com> <201001241924.35104.peter.levart@gmail.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> Message-ID: <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> We're veering rather wildly off the path here, with comments that have very little to do with performance concerns we should keep in mind whilst designing language features. I'll keep it short, and inline. 
On Mon, Jan 25, 2010 at 7:29 PM, Osvaldo Doederlein wrote: > > They designed [swing], apparently with the assumption that polymorphism, > stack frames, code size, extra indirection of some design patterns, etc. are > all "nickel-and-dime issues" that wouldn't matter. > You fantastically misunderstand the problems with the swing API. The complaints are about the _API_ - the way you build apps with it. Not about performance. Netbeans performs just fine and the API didn't change one whit. Yes, swing apps used to be slow. They fixed it, without removing any of the nickel and dime stuff. It would be annoying and complicated, from an API point of view, to state that list literals come in lots of varieties and have all these options. It would be far simpler if a list literal is always the same immutable structure, and if you need more/different features, you create a new list initialized with the list literal. Swing is actually an example of why your performance crusade is in fact wrong: It proves that high-level optimizations are the only ones that matter, and it also proves that making APIs simple and eliminating as many options and complications as possible is a good thing. > my NetBeans compiles pretty fast too, indeed both compilers are now > I/O-constrained on any recent machine/JVM, so this is no longer a good > case study. > So, javac vs. ecj is no longer a good case study? Huh? No, it _IS_ a good case study, it is serious anecdotal evidence that micro-optimization, which is what ecj has done, does not help performance any, and instead makes a dog of a code base that is such a drag on maintainability and flexibility that eclipse is legendary for lagging behind new java features so much. I'm still waiting on even one case study on your end that this micro optimization bullpuckey is worth screwing up language features for. > > You invite me to a losing "proof" game > I wouldn't invite you to it if I wasn't fairly certain you'd lose the game. 
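The "immutable literal, wrap it if you need mutability" design described above can be made concrete. The proposed literal syntax does not exist, so `Collections.unmodifiableList` stands in for an immutable list literal in this sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LiteralDemo {
    // Stand-in for the proposed list literal: always the same
    // immutable structure, no varieties or options.
    static List<String> literal() {
        return Collections.unmodifiableList(Arrays.asList("a", "b", "c"));
    }

    public static void main(String[] args) {
        List<String> immutable = literal();
        try {
            immutable.add("d");   // mutating the "literal" fails
            throw new IllegalStateException("should have thrown");
        } catch (UnsupportedOperationException expected) { }

        // The escape hatch: copy the literal into a mutable list.
        List<String> mutable = new ArrayList<>(immutable);
        mutable.add("d");
        System.out.println(mutable); // [a, b, c, d]
    }
}
```

The design choice is one extra allocation (the copy) in exchange for a single, predictable behavior for every literal.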
The point is - your argument should not be used to complicate the list literals proposal, and I've given several cases which prove that micro optimization is not a major concern to back up my sentiments. You, on the other hand, keep saying, in elaborate overtures, "No, no, they are important! Trust me, even though what I say goes against all common knowledge and I have no proof or even a use case to back up what I say!". I cannot speak for those implementing the list literals or for the rest of coin-dev, so speaking only for myself: This "I've been in this business for years" hand waving isn't convincing me at all. Prove that micro-optimizations are worth making this new feature complex for. Yeah, that would add yet another special case to learn (if you care for precise > behavior), but the current spec is already counter-intuitive because Java > developers quickly learn that indexed iteration of ArrayList and friends is > better than using an Iterator (and won't ever throw CME!), then comes > enhanced-for and breaks this intuition. > indexed iteration across an arraylist is better than using an iterator? I beg your pardon? Iterators don't HAVE to be fail-fast. There's extra logic in ArrayList and friends to make them fail-fast. Someone back then (correctly, but a full discussion is beyond the scope of this thread) decided that fail-fast is worth it. If you want to turn this around and say that fail-fast is actively harmful compared to the standard 'who knows what's going to happen' behaviour of indexed access through an AL, that's your right, but you certainly can't do it by just saying that "It's better!" without backing this up! 
From neal at gafter.com Thu Jan 28 11:11:46 2010 From: neal at gafter.com (Neal Gafter) Date: Thu, 28 Jan 2010 11:11:46 -0800 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> Message-ID: <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> On Thu, Jan 28, 2010 at 10:55 AM, Reinier Zwitserloot wrote: >> my NetBeans compiles pretty fast too, indeed both compilers are now >> I/O-constrained on any recent machine/JVM, so this is no longer a good >> case study. >> > > So, javac vs. ecj is no longer a good case study? Huh? No, it _IS_ a good > case study, it is serious anecdotal evidence that micro-optimization, > which is what ecj has done, does not help performance any, and instead makes > a dog of a code base that is such a drag on maintainability and > flexibility > that eclipse is legendary for lagging behind new java features so much. Reinier: have you actually looked at the javac code base? It's got lots of micro-optimizations. For the most part, they were done because the changes made a significant difference in compile-time. I think the more likely explanation of the lag in language features is that Sun and others use javac (not ecj) to prototype new language features. By the way, looping through an ArrayList using indexing happens to be faster than looping through using an iterator because the latter requires two method calls per element, while the former requires only one. It's not hard to verify this experimentally. 
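A crude version of that experiment looks like the sketch below (plain nanoTime timing; a serious measurement needs a proper harness and warmup, since once the loop is hot the JIT may inline the iterator calls away, as discussed further down in this thread):

```java
import java.util.ArrayList;

class LoopBench {
    // Indexed form: one call (get) per element, plus a hoisted size().
    static long sumIndexed(ArrayList<Long> list) {
        long sum = 0;
        for (int i = 0, n = list.size(); i < n; i++) sum += list.get(i);
        return sum;
    }

    // Enhanced-for form: desugars to hasNext()/next(), two calls per element.
    static long sumIterator(ArrayList<Long> list) {
        long sum = 0;
        for (Long v : list) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Long> list = new ArrayList<>();
        for (long i = 0; i < 1000000; i++) list.add(i);
        for (int round = 0; round < 5; round++) { // crude warmup rounds
            long t0 = System.nanoTime();
            long a = sumIndexed(list);
            long t1 = System.nanoTime();
            long b = sumIterator(list);
            long t2 = System.nanoTime();
            System.out.printf("indexed %d ns, iterator %d ns (sums %d/%d)%n",
                              t1 - t0, t2 - t1, a, b);
        }
    }
}
```

Whether the two-calls-per-element cost survives depends entirely on how hot the loop gets and what the JIT inlines, which is exactly why the two sides of this thread can both point at measurements.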
Had the for-each loop been library-defined, it could have been changed to a more efficient implementation in later releases. From reinier at zwitserloot.com Thu Jan 28 11:57:34 2010 From: reinier at zwitserloot.com (Reinier Zwitserloot) Date: Thu, 28 Jan 2010 20:57:34 +0100 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> Message-ID: <560fb5ed1001281157t1c56f616sb9b978451d6163ef@mail.gmail.com> If there's some pragmatic use case that is considerably slower 'where it counts' with the existing list literal proposal (list literals are immutable; to create mutable ones, or ones with non-standard behaviours, while still using list literals, you have to wrap the literal in e.g. "new ArrayList(_LITERAL_)") - that'd be interesting. However, ecj remains an example that going by your gut does not lead to good results. The foreach implementation could have been built differently; however, the number of foreach loops that get rewritten just to avoid the second method call is rather small, so I don't see how this is proof that we should _complicate_ new language proposals just to accommodate such micro optimization concerns. Now, if an alternate proposal for list literals exists that does cater to these concerns and is also more elegant, or at least not more complicated, that'd be worth considering, of course. Performance (at least at this level) is not nearly as important as a clean API and a simple, easy-to-understand feature; that does not of course mean that performance is completely irrelevant, either. 
It just plays second fiddle. --Reinier Zwitserloot On Thu, Jan 28, 2010 at 8:11 PM, Neal Gafter wrote: > On Thu, Jan 28, 2010 at 10:55 AM, Reinier Zwitserloot > wrote: > >> my NetBeans compiles pretty fast too, indeed both compilers are now > >> I/O-constrained on any recent machine/JVM, so this is no longer a good > >> case study. > >> > > > > So, javac vs. ecj is no longer a good case study? Huh? No, it _IS_ a good > > case study, it is serious anecdotal evidence that > micro-optimization, > > which is what ecj has done, does not help performance any, and instead > makes > > a dog of a code base that is such a drag on maintainability and > flexibility > > that eclipse is legendary for lagging behind new java features so much. > > Reinier: have you actually looked at the javac code base? It's got > lots of micro-optimizations. For the most part, they were done > because the changes made a significant difference in compile-time. I > think the more likely explanation of the lag in language features is > that Sun and others use javac (not ecj) to prototype new language > features. > > By the way, looping through an ArrayList using indexing happens to be > faster than looping through using an iterator because the latter > requires two method calls per element, while the former requires only > one. It's not hard to verify this experimentally. Had the for-each > loop been library-defined, it could have been changed to a more > efficient implementation in later releases. 
> From opinali at gmail.com Thu Jan 28 14:14:49 2010 From: opinali at gmail.com (Osvaldo Doederlein) Date: Thu, 28 Jan 2010 20:14:49 -0200 Subject: Project Lambda: Java Language Specification draft In-Reply-To: <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> References: <4B5A2CD5.3000107@sun.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> Message-ID: 2010/1/28 Reinier Zwitserloot > They designed [swing], apparently with the assumption that polymorphism, >> stack frames, code size, extra indirection of some design patterns, etc. are >> all "nickel-and-dime issues" that wouldn't matter. >> > > You fantastically misunderstand the problems with the swing API. The > complaints are about the _API_ - the way you build apps with it. Not about > performance. Netbeans performs just fine and the API didn't change one whit. > Yes, swing apps used to be slow. They fixed it, without removing any of the > nickel and dime stuff. It would be annoying and complicated, from an API > point of view, to state that list literals come in lots of varieties and > have all these options. It would be far simpler if a list literal is always > the same immutable structure, and if you need more/different features, you > create a new list initialized with the list literal. > Swing performs well _now_, after a full decade of JVM&HW improvements. SWT had near-native performance in 2001. And the current speed still comes with a cost in size, loading time, JIT overheads. (I know this is not due only to the factors I mention, but also to the lightweight-component architecture. But those are important factors still.) 
> Swing is actually an example of why your performance crusade is in fact > wrong: It proves that high-level optimizations are the only ones that > matter, and it also proves that making APIs simple and eliminating as many > options and complications as possible is a good thing. > Unfortunately, most high-level optimizations are not possible in Swing - these typically require structural changes, not possible in a white-box framework that exposes too much of its internals as public APIs. So I bet there was a ton of low-level opts instead - I'm not intimate with Swing's sources, but I'm always reading sources of other JDK pieces and it is CHOCK-FULL of "nickel-and-dime" opts, e.g. copying fields to local variables to avoid repeated getfields seems to be standard practice, as well as manual hoisting of .length/.size() outside loops, etc. (I'm not claiming that most of Swing's improvements came from such optimizations; the major boost probably came from increased Java2D GPU acceleration. Once again, just not a great example.) > my NetBeans compiles pretty fast too, indeed both compilers are now >> I/O-constrained on any recent machine/JVM, so this is no longer a good >> case study. >> > > So, javac vs. ecj is no longer a good case study? Huh? No, it _IS_ a good > case study, it is serious anecdotal evidence that micro-optimization, > which is what ecj has done, does not help performance any, and instead makes > a dog of a code base that is such a drag on maintainability and flexibility > that eclipse is legendary for lagging behind new java features so much. > Here I won't duplicate Neal's reply. > I'm still waiting on even one case study on your end that this micro > optimization bullpuckey is worth screwing up language features for. > I never proposed that; I did explain my position, but that's one of the parts you chose not to reply to. > You, on the other hand, keep saying, in elaborate overtures, "No, no, >> they are important! 
Trust me, even though what I say goes against all common >> knowledge and I have no proof or even a use case to back up what I say!". >> > I can provide microbenchmarks that show the impact of low-level opts, but you'd just shout back the cliché "microbenchmarks are worthless". So please let me explain (again). I agree that low-level opts may only provide extremely small gains (a few nanoseconds or bytes). But they DO make a difference when your program does that thing A LOT. For example, pick any Java program that is maths-bound and uses float values (e.g. a renderer), and replace float->double. You'll notice a significant drop in performance. A more concrete example: the compressed-oops optimization in Sun's and IBM's recent JVMs, which basically saves 4 bytes per reference field. At best you can argue that we should leave low-level opts to the compiler, not pollute our app code with that... but this is EXACTLY WHAT I WANT TO DO. I want to use high-level language features like lambdas and collections, without paying a price any higher than necessary. And I'm not proposing major tradeoffs of functionality or syntax. (Very often, the major tradeoff is extra effort in the language design, specification, and compiler implementation.) > Yeah, that would add yet another special case to learn (if you care for precise >> behavior), but the current spec is already counter-intuitive because Java >> developers quickly learn that indexed iteration of ArrayList and friends is >> better than using an Iterator (and won't ever throw CME!), then comes >> enhanced-for and breaks this intuition. >> > > indexed iteration across an arraylist is better than using an iterator? I > beg your pardon? > > Iterators don't HAVE to be fail-fast. There's extra logic in ArrayList and > friends to make them fail-fast. Someone back then (correctly, but a full > discussion is beyond the scope of this thread) decided that fail-fast is > worth it. 
If you want to turn this around and say that fail-fast is actively > harmful compared to the standard 'who knows what's going to happen' > behaviour of indexed access through an AL, that's your right, but you > certainly can't do it by just saying that "It's better!" without backing > this up! > If you desire fail-fast behavior, you can just use a standard for loop with an explicit Iterator. Now we can argue which priority should weigh more - performance or the protection of CME. The enhanced-for supports primitive arrays, and you can make a mess with a primitive array (cannot change its structure, but can move elements around, including huge arraycopy operations). Also, the fail-fast behavior has a very flexible/relaxed spec (check CME's javadocs - there's a TON of caveats), and an implementation that would simply ignore it everywhere would be just fine. In fact I would gladly vote to make the fail-fast checks optional, guarded by JDK1.4 assertions so I could enable/disable them with -ea/-da; javac could also avoid the enhanced-for optimization of RandomAccess collections with -g. IT'S A FRIGGIN' DEBUGGING FEATURE. If treated as such, we could even make it better with more extensive checking. (IBM once did such changes in their JDK 1.4.2 - and yeah, it was great because it caught a bug in my app; OTOH it sucked because the race was harmless and the CME was screwing an app in production inside a large bank. This impl change was noncompliant because the CME was being thrown by a method that doesn't document it as possible behavior, so I reported this as a bug to the customer; IBM later removed these changes.) 
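The behavioral difference being argued about can be shown in a few lines (illustrative sketch; note that ArrayList's iterator only detects the modification if an element remains after the removal, because hasNext() does not run the comodification check):

```java
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {
    // Iterator-based loop (what enhanced-for desugars to): fail-fast,
    // throws ConcurrentModificationException on structural change.
    static boolean iteratorDetectsModification(List<String> list) {
        try {
            for (String s : list) {
                if (s.equals("b")) list.remove(s); // structural change mid-loop
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Indexed loop: no fail-fast check at all; the structural change
    // silently shifts elements and the loop just keeps going.
    static boolean indexedDetectsModification(List<String> list) {
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).equals("b")) list.remove(i);
        }
        return false; // nothing is ever thrown
    }
}
```

The debate above is precisely whether that detection is worth its per-iteration bookkeeping, given how partial the detection is anyway.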
A+
Osvaldo

From forax at univ-mlv.fr Thu Jan 28 16:05:31 2010
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Fri, 29 Jan 2010 01:05:31 +0100
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com>
References: <4B5A2CD5.3000107@sun.com> <4B5C952F.1000903@optrak.co.uk> <4B5CA939.9080909@gmail.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com>
Message-ID: <4B62264B.6080200@univ-mlv.fr>

Le 28/01/2010 20:11, Neal Gafter a écrit :
...
> By the way, looping through an ArrayList using indexing happens to be
> faster than looping through using an iterator because the latter
> requires two method calls per element, while the former requires only
> one. It's not hard to verify this experimentally. Had the for-each
> loop been library-defined, it could have been changed to a more
> efficient implementation in later releases.
>

This is not true if the code is hot. C2 is able to track the real iterator class, devirtualize hasNext and next, and inline them. In fact, it inlines ArrayList$Itr.hasNext, ArrayList.access$100, ArrayList$Itr.next, ArrayList$Itr.checkForComodification and ArrayList.access$200. It also seems to be able to collapse the range checks done in hasNext and next with the range check done on the array (I'm not totally sure on that point). But it is not able to remove the check for concurrent modification (even if it detects that it never fails), and it is not able to inline a method containing the iterator loop (the bytecode blob is too big).
Rémi

From neal at gafter.com Thu Jan 28 21:16:37 2010
From: neal at gafter.com (Neal Gafter)
Date: Thu, 28 Jan 2010 21:16:37 -0800
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <4B62264B.6080200@univ-mlv.fr>
References: <4B5A2CD5.3000107@sun.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr>
Message-ID: <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com>

On Thu, Jan 28, 2010 at 4:05 PM, Rémi Forax wrote:
> Le 28/01/2010 20:11, Neal Gafter a écrit :
>> By the way, looping through an ArrayList using indexing happens to be
>> faster than looping through using an iterator because the latter
>> requires two method calls per element, while the former requires only
>> one. It's not hard to verify this experimentally.
>
> This is not true if the code is hot.

Have you run experiments to back up that assertion?
From forax at univ-mlv.fr Fri Jan 29 02:03:10 2010
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Fri, 29 Jan 2010 11:03:10 +0100
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com>
References: <4B5A2CD5.3000107@sun.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com>
Message-ID: <4B62B25E.4000900@univ-mlv.fr>

Le 29/01/2010 06:16, Neal Gafter a écrit :
> On Thu, Jan 28, 2010 at 4:05 PM, Rémi Forax wrote:
>> Le 28/01/2010 20:11, Neal Gafter a écrit :
>>> By the way, looping through an ArrayList using indexing happens to be
>>> faster than looping through using an iterator because the latter
>>> requires two method calls per element, while the former requires only
>>> one. It's not hard to verify this experimentally.
>> This is not true if the code is hot.
> Have you run experiments to back up that assertion?

Yes, I had done a similar experiment one week ago when testing method handles. I've updated it this morning to remove the method handle things. You can find the two source files and the generated assembly code here:
http://cr.openjdk.java.net/~forax/assembly-loop/loop-assembly.zip

Look for a method named __test__ and take the second one; the first one is generated before HotSpot decides to inline the iterator's methods.

The method increment does parsing/toString to avoid being inlined; __test__ is not inlined either. I've also tried to use the Iterator of a LinkedList during the warm-up, to avoid code that is too easy for CHA analysis, but it doesn't change the generated code.
Rémi

From tronicek at fit.cvut.cz Fri Jan 29 04:46:16 2010
From: tronicek at fit.cvut.cz (Zdenek Tronicek)
Date: Fri, 29 Jan 2010 13:46:16 +0100
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <4B62B25E.4000900@univ-mlv.fr>
References: <4B5A2CD5.3000107@sun.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com> <4B62B25E.4000900@univ-mlv.fr>
Message-ID:

Hi Remi,

System.nanoTime() is not very suitable for performance tests because neither Unix nor Windows has a resolution of 1ns. The resolution is typically 100-1000 times lower. So, I changed System.nanoTime() to System.currentTimeMillis(), increased the size of the list to 1000000 (one million), and looping through get() is faster than looping through an Iterator. The difference is approx. 7%. (On Windows XP, Intel.)

Z.
--
Zdenek Tronicek
FIT CTU in Prague

Rémi Forax wrote:
> Le 29/01/2010 06:16, Neal Gafter a écrit :
>> On Thu, Jan 28, 2010 at 4:05 PM, Rémi Forax wrote:
>>> Le 28/01/2010 20:11, Neal Gafter a écrit :
>>>> By the way, looping through an ArrayList using indexing happens to be
>>>> faster than looping through using an iterator because the latter
>>>> requires two method calls per element, while the former requires only
>>>> one. It's not hard to verify this experimentally.
>>> This is not true if the code is hot.
>> Have you run experiments to back up that assertion?
>
> Yes,
> I had done a similar experiment one week ago when testing method handles.
> I've updated it this morning to remove the method handle things.
> You can find the two source files and the generated assembly code here:
> http://cr.openjdk.java.net/~forax/assembly-loop/loop-assembly.zip
>
> Look for a method named __test__ and take the second one;
> the first one is generated before HotSpot decides to inline the iterator's
> methods.
>
> The method increment does parsing/toString to avoid being inlined;
> __test__ is not inlined either.
> I've also tried to use the Iterator of a LinkedList during the warm-up
> to avoid code that is too easy for CHA analysis, but it doesn't change the
> generated code.
>
> Rémi
>
>

From opinali at gmail.com Fri Jan 29 05:15:56 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Fri, 29 Jan 2010 11:15:56 -0200
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <4B62B25E.4000900@univ-mlv.fr>
References: <4B5A2CD5.3000107@sun.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com> <4B62B25E.4000900@univ-mlv.fr>
Message-ID:

Even a raw analysis of this asm code is interesting: the iterator version of __test__ is ~48Kb, versus 40Kb for the indexed version. (Although sometimes bigger compiled code is better - superior inlining wins, all else being equal - it doesn't seem to be the case here.)

I think the creation of strings produces too much noise, so I removed 'count' and changed increment() to just update a public static long field (accum *= value); in my experience this is always sufficient to prevent dead-code elimination. I also created an outer 1000X loop in __test__() to have a bigger number of iterations for more precise timing, without needing a much bigger ArrayList (which would make this a FSB benchmark), and as a bonus it makes loop optimizations harder.
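Osvaldo's modified harness is not attached to the thread; a rough reconstruction of the shape he describes (static sink field, trivial increment(), outer 1000X loop, explicit warm-up) could look like the sketch below. The class name, list size and warm-up count are guesses, and the sink uses += where he used *=; this is an illustration of the pattern, not his actual code:

```java
import java.util.ArrayList;
import java.util.List;

public class IterBench {
    // Static sink field: storing into it keeps the loop body from
    // being eliminated as dead code (Osvaldo used accum *= value).
    public static long accum;

    static void increment(long value) {
        accum += value;
    }

    static void indexed(List<Integer> list) {
        for (int r = 0; r < 1000; r++) {            // outer loop: longer, steadier timings
            for (int i = 0, n = list.size(); i < n; i++) {
                increment(list.get(i));
            }
        }
    }

    static void iterated(List<Integer> list) {
        for (int r = 0; r < 1000; r++) {
            for (int v : list) {                    // enhanced-for compiles to an Iterator
                increment(v);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            list.add(i);
        }
        for (int w = 0; w < 10; w++) {              // warm-up so HotSpot compiles both loops
            indexed(list);
            iterated(list);
        }
        long iters = 1000L * list.size();
        long t0 = System.nanoTime();
        indexed(list);
        long t1 = System.nanoTime();
        iterated(list);
        long t2 = System.nanoTime();
        System.out.printf("indexed:  %.2f ns/element%n", (t1 - t0) / (double) iters);
        System.out.printf("iterator: %.2f ns/element%n", (t2 - t1) / (double) iters);
    }
}
```

The outer repeat loop and the side-effecting sink address the two classic microbenchmark traps mentioned in this thread: timer resolution and dead-code elimination.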
Results in nanoseconds-per-iteration:

JDK 5_22:
  Indexed / Server:  4.39ns
  Iterator / Server: 9.45ns (2.15 X Indexed / Server)
  Indexed / Client:  13.40ns (3.05 X Indexed / Server)
  Iterator / Client: 31.30ns (2.33 X Indexed / Client, 3.31 X Iterator / Server)

JDK 6u18:
  Indexed / Server:  4.49ns
  Iterator / Server: 4.62ns (1.02 X Indexed / Server)
  Indexed / Client:  11.60ns (2.58 X Indexed / Server)
  Iterator / Client: 42ns (3.62 X Indexed / Client, 9.09 X Iterator / Server)

JDK 7-b81:
  Indexed / Server:  4.30ns
  Iterator / Server: 5.27ns (1.22 X Indexed / Server)
  Indexed / Client:  6.66ns (1.54 X Indexed / Server)
  Iterator / Client: 17.50ns (2.62 X Indexed / Client, 3.32 X Iterator / Server)

JDK 7-b81 with -XX:+DoEscapeAnalysis:
  Indexed / Server:  4.30ns
  Iterator / Server: 5.14ns (1.19 X Indexed / Server)

The numbers speak for themselves. Iterating an ArrayList without iterators can provide massive gains in the weaker HotSpot Client VMs; even in Server, we can observe significant gains. The 2% advantage for 6u18 seems tiny, but because the iteration itself goes along with a lot of benchmark overhead, ANY statistically significant advantage is important - this is usually the case for microbenchmarks of extremely simple operations. The Escape Analysis / scalar replacement optimization of bleeding-edge JDK 7 gives another nice speedup, but clearly not enough to remove all Iterator overhead (probably due to the factors in Rémi's analysis), although this is not definitive as the Iterator / Server test case appears to suffer from a regression (scores are worse than 6u18).

A+
Osvaldo

2010/1/29 Rémi Forax
> Le 29/01/2010 06:16, Neal Gafter a écrit :
> > On Thu, Jan 28, 2010 at 4:05 PM, Rémi Forax wrote:
> >> Le 28/01/2010 20:11, Neal Gafter a écrit :
> >>> By the way, looping through an ArrayList using indexing happens to be
> >>> faster than looping through using an iterator because the latter
> >>> requires two method calls per element, while the former requires only
> >>> one.
It's not hard to verify this experimentally.
> >> This is not true if the code is hot.
> > Have you run experiments to back up that assertion?
>
> Yes,
> I had done a similar experiment one week ago when testing method handles.
> I've updated it this morning to remove the method handle things.
> You can find the two source files and the generated assembly code here:
> http://cr.openjdk.java.net/~forax/assembly-loop/loop-assembly.zip
>
> Look for a method named __test__ and take the second one;
> the first one is generated before HotSpot decides to inline the iterator's
> methods.
>
> The method increment does parsing/toString to avoid being inlined;
> __test__ is not inlined either.
> I've also tried to use the Iterator of a LinkedList during the warm-up
> to avoid code that is too easy for CHA analysis, but it doesn't change the
> generated code.
>
> Rémi
>
>

From forax at univ-mlv.fr Fri Jan 29 05:18:30 2010
From: forax at univ-mlv.fr (=?UTF-8?B?UsOpbWkgRm9yYXg=?=)
Date: Fri, 29 Jan 2010 14:18:30 +0100
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: References: <4B5A2CD5.3000107@sun.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com> <4B62B25E.4000900@univ-mlv.fr>
Message-ID: <4B62E026.2010903@univ-mlv.fr>

Le 29/01/2010 13:46, Zdenek Tronicek a écrit :
> Hi Remi,
>
> System.nanoTime() is not very suitable for performance tests because
> neither Unix nor Windows has a resolution of 1ns. The resolution is
> typically 100-1000 times lower.
> So, I changed System.nanoTime() to System.currentTimeMillis(), increased
> the size of the list to 1000000 (one million), and looping through get() is
> faster than looping through an Iterator. The difference is approx. 7%.
> (On Windows XP, Intel.)
>
> Z.
>

This is not what David Holmes says:
http://blogs.sun.com/dholmes/entry/inside_the_hotspot_vm_clocks

Rémi

From opinali at gmail.com Fri Jan 29 05:23:27 2010
From: opinali at gmail.com (Osvaldo Doederlein)
Date: Fri, 29 Jan 2010 11:23:27 -0200
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: References: <4B5A2CD5.3000107@sun.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com> <4B62B25E.4000900@univ-mlv.fr>
Message-ID:

2010/1/29 Zdenek Tronicek
> Hi Remi,
>
> System.nanoTime() is not very suitable for performance tests because
> neither Unix nor Windows has a resolution of 1ns. The resolution is
> typically 100-1000 times lower.
> So, I changed System.nanoTime() to System.currentTimeMillis(), increased
> the size of the list to 1000000 (one million), and looping through get() is
> faster than looping through an Iterator. The difference is approx. 7%.
> (On Windows XP, Intel.)
>

(I only read this reply now.) ~200ns accuracy is certainly better than 1ms; in my experience, nanoTime() definitely beats currentTimeMillis() when you are measuring operations on the nanosecond scale. Of course it's still critical to keep the total execution time (of each measurement) much above 1ms, preferably close to 1s. But the extra accuracy of nanoTime() will never hurt... especially on OSes with a crappy timer, like Windows Vista (and all older Windows), which has a ~15ms timer (Windows 7 moves to 1ms, so currentTimeMillis() will honor 1ms precision). And if you run on an OS capable of real-time scheduling (e.g.
Solaris 10's soft-RT sched class), I expect the nanoTime() API to provide even better real-world accuracy.

A+
Osvaldo

> Z.
> --
> Zdenek Tronicek
> FIT CTU in Prague
>
> Rémi Forax wrote:
> > Le 29/01/2010 06:16, Neal Gafter a écrit :
> >> On Thu, Jan 28, 2010 at 4:05 PM, Rémi Forax wrote:
> >>> Le 28/01/2010 20:11, Neal Gafter a écrit :
> >>>> By the way, looping through an ArrayList using indexing happens to be
> >>>> faster than looping through using an iterator because the latter
> >>>> requires two method calls per element, while the former requires only
> >>>> one. It's not hard to verify this experimentally.
> >>> This is not true if the code is hot.
> >> Have you run experiments to back up that assertion?
> >
> > Yes,
> > I had done a similar experiment one week ago when testing method handles.
> > I've updated it this morning to remove the method handle things.
> > You can find the two source files and the generated assembly code here:
> > http://cr.openjdk.java.net/~forax/assembly-loop/loop-assembly.zip
> >
> > Look for a method named __test__ and take the second one;
> > the first one is generated before HotSpot decides to inline the iterator's
> > methods.
> >
> > The method increment does parsing/toString to avoid being inlined;
> > __test__ is not inlined either.
> > I've also tried to use the Iterator of a LinkedList during the warm-up
> > to avoid code that is too easy for CHA analysis, but it doesn't change the
> > generated code.
> >
> > Rémi
>

From jborgers at xebia.com Fri Jan 29 07:02:51 2010
From: jborgers at xebia.com (Jeroen Borgers)
Date: Fri, 29 Jan 2010 16:02:51 +0100
Subject: Project Lambda: Java Language Specification draft
In-Reply-To: <4B62E026.2010903@univ-mlv.fr>
References: <4B5A2CD5.3000107@sun.com> <4B5CB497.9010304@optrak.co.uk> <4B5CC154.20109@gmail.com> <560fb5ed1001241403i643888and758c68a48b47b25@mail.gmail.com> <560fb5ed1001250640s33eb4885qd62f895db0ccfb24@mail.gmail.com> <560fb5ed1001281055w5466fa31w7147340238131b19@mail.gmail.com> <15e8b9d21001281111g700ee074v38b671ba580ecffb@mail.gmail.com> <4B62264B.6080200@univ-mlv.fr> <15e8b9d21001282116u61e94780l2125921fd5a6e6b@mail.gmail.com> <4B62B25E.4000900@univ-mlv.fr> <4B62E026.2010903@univ-mlv.fr>
Message-ID:

Hi Remi,

Personally, I'm also very cautious about using System.nanoTime(). Like David Holmes points out: "Be aware though, this call [nanoTime()] can also take microseconds to execute on some platforms." While currentTimeMillis() only takes a few clock cycles, nanoTime() takes microseconds on Windows. So, nanoTime() can cause considerable overhead if you do a lot of measurements inside applications. It can also distort measurements in micro benchmarks.

Regards,
Jeroen Borgers.

-----Original Message-----
From: coin-dev-bounces at openjdk.java.net [mailto:coin-dev-bounces at openjdk.java.net] On Behalf Of Rémi Forax
Sent: Friday, 29 January 2010 14:19
To: Zdenek Tronicek
Cc: coin-dev at openjdk.java.net
Subject: Re: Project Lambda: Java Language Specification draft

Le 29/01/2010 13:46, Zdenek Tronicek a écrit :
> Hi Remi,
>
> System.nanoTime() is not very suitable for performance tests because
> neither Unix nor Windows has a resolution of 1ns. The resolution is
> typically 100-1000 times lower.
> So, I changed System.nanoTime() to System.currentTimeMillis(), increased
> the size of the list to 1000000 (one million), and looping through get() is
> faster than looping through an Iterator. The difference is approx. 7%.
> (On Windows XP, Intel.)
>
> Z.
>

This is not what David Holmes says:
http://blogs.sun.com/dholmes/entry/inside_the_hotspot_vm_clocks

Rémi
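The clock claims traded in these last few messages are easy to probe on any given machine. The following sketch is an editorial illustration (the class name is invented, not from the thread); it estimates the observable step of currentTimeMillis() and the amortized per-call cost of nanoTime():

```java
public class ClockProbe {
    public static void main(String[] args) {
        // Observable granularity of currentTimeMillis(): spin until the
        // returned value changes and report the step (the thread above
        // reports ~15ms on pre-Windows-7 systems, ~1ms elsewhere).
        long m0 = System.currentTimeMillis();
        long m1;
        while ((m1 = System.currentTimeMillis()) == m0) {
            // busy-wait until the clock ticks
        }
        System.out.println("currentTimeMillis() step: " + (m1 - m0) + " ms");

        // Amortized per-call cost of nanoTime(): time a large batch of calls.
        final int calls = 1_000_000;
        long sink = 0;                 // consume results so the loop isn't dead code
        long t0 = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.nanoTime();
        }
        long elapsed = System.nanoTime() - t0;
        System.out.printf("nanoTime() avg call cost: %.1f ns (sink=%d)%n",
                elapsed / (double) calls, sink % 10);
    }
}
```

The first number bounds how fine a measurement currentTimeMillis() can support; the second shows the call overhead Jeroen warns about, which matters when timing calls are made inside the measured loop rather than around it.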