Project Lambda: Java Language Specification draft

Osvaldo Doederlein opinali at gmail.com
Mon Jan 25 10:29:56 PST 2010


2010/1/25 Reinier Zwitserloot <reinier at zwitserloot.com>

> On Mon, Jan 25, 2010 at 2:13 PM, Osvaldo Doederlein <opinali at gmail.com> wrote:
>
>> These *gratuitous* inefficiencies always find a way to bite you in the
>> butt.
>>
>
> Ridiculous hyperbole. The only performance issues I've ever run into were
> solved by making much higher-level optimizations, such as improving the
> performance of a tight loop someplace. In fact, I have never run into a
> situation where performance got nickel-and-dimed to death, nor have I met
> anybody for whom this is the case. I'm not denying that it could _ever_
> happen, but you are literally saying that nickel-and-dime performance
> issues *ALWAYS* occur, where in fact it's more likely to be a 1 in 50,000
> programs occurrence.
>

I agree with you -- on the surface. Yes, I don't often see a program that
suffers a big cost because it's allocating a couple of extra objects, or
invoking a virtual method, at one specific site. But this is part of the
problem: most decently written code shows a flat profile (no hotspots),
which looks like a dead end for optimization efforts (short of
architectural changes). But reality may be slightly different: the program
may contain a huge number of small inefficiencies, and these add up. One
easy, non-Java example is a dynamically typed language: its advocates may
claim that each dispatch costs only a few nanoseconds and that it's
ridiculous to complain about it. But a real-world program will pay this
overhead a million times per second, and those dynamic calls also block
important optimizations like inlining - hence the often pathetic
performance of languages like Ruby.
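
To make the inlining point concrete in Java terms, here is a minimal,
hypothetical sketch (the class names are made up): a call site that only
ever sees one receiver type can usually be inlined by HotSpot, while a
"megamorphic" site generally stays a full virtual dispatch - similar in
spirit to the tax every call pays in a dynamically typed language.

    // Monomorphic vs. megamorphic call sites (illustrative only).
    interface Op { int apply(int x); }

    class Inc implements Op { public int apply(int x) { return x + 1; } }
    class Dec implements Op { public int apply(int x) { return x - 1; } }
    class Neg implements Op { public int apply(int x) { return -x; } }

    class DispatchDemo {
        static int run(Op[] ops, int n) {
            int acc = 0;
            for (int i = 0; i < n; i++) {
                // If ops holds one concrete type for the whole run, this
                // site stays monomorphic and inlinable; with three or
                // more types it goes megamorphic and inlining is blocked.
                acc += ops[i % ops.length].apply(acc);
            }
            return acc;
        }

        public static void main(String[] args) {
            System.out.println(run(new Op[] { new Inc() }, 1000000));
            System.out.println(run(new Op[] { new Inc(), new Dec(),
                                              new Neg() }, 1000000));
        }
    }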

For Java, I don't need to go much farther than the Swing toolkit, widely
recognized as a feat of OO over-engineering. It was apparently designed on
the assumption that polymorphism, stack frames, code size, the extra
indirection of some design patterns, etc. are all "nickel-and-dime issues"
that wouldn't matter. But they did matter, because the library is huge and
a single operation accumulates hundreds of such tiny overheads. (There
were of course other issues, like insufficient Java2D acceleration before
6u1x, and it's still not perfect: SWT beats it easily. That's partially an
apples-and-oranges comparison, but it's just a convenient example.)


>
>
>> The result is often a balkanization of the language, as
>> performance-critical code avoids some/all higher-level features.
>>
>
> The fact that some stupid tools write crappy micro-optimized code is not
> proof that micro-optimization is a good idea or has any measurable effect.
> Case in point: The code for ecj is a complete and utter dog - an
> unmaintainable trainwreck. It's littered with use of char arrays instead of
> strings, and they even mix generated code with handwritten code in a single
> source file just to serve the micro-optimization god. javac on the other
> hand, is almost laughably non-micro-optimized. They even use a cons list
> (an immutable list defined as a head element plus a tail list containing
> all the other elements; appending something to the end of such a thing
> costs a hefty O(n), making a new object for each element in the entire
> list) - which is not something HotSpot optimizes well.
>
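
(For reference, the cons list shape Reinier describes is roughly the
following - an illustrative sketch, not javac's actual class:)

    // Immutable cons list: a head element plus a tail list.
    final class Cons<T> {
        final T head;
        final Cons<T> tail;   // null represents the empty tail

        Cons(T head, Cons<T> tail) { this.head = head; this.tail = tail; }

        // Prepending is O(1): one new cell, the rest is shared.
        Cons<T> prepend(T x) { return new Cons<T>(x, this); }

        // Appending is O(n): every cell is copied, allocating a new
        // object per element - the cost described above.
        Cons<T> append(T x) {
            return tail == null
                ? new Cons<T>(head, new Cons<T>(x, null))
                : new Cons<T>(head, tail.append(x));
        }
    }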

I am aware of this reputation of ECJ's implementation - Eclipse is my main
IDE, and I have reported a few ECJ compilation bugs myself, notably when
they were catching up to Java 5. But, god, it's fast... and it always has
been, even with JVMs from 2001. I know javac is now significantly improved
(and "embeddable" / IDE-friendly); my NetBeans compiles pretty fast too.
Indeed, both compilers are now I/O-bound on any recent machine/JVM, so
this is no longer a good case study.


> And yet, javac is doing about as well as ecj, speed-wise. NetBeans even
> uses javac, with no speedups in the parser code, as-you-type. So, I pass the
> onus of proof back to you, Osvaldo. I hereby claim that micro-optimizations
> aren't worth it until proven otherwise.
>

You invite me to a losing "proof" game - I'm not deeply informed about the
implementation of specific/open/well-known software, as I devote no time
to FOSS projects except for occasional bug reports and discussions. The
many cases I know as a user or by hearsay would require checking sources,
an effort I won't invest just to win a pissing contest. But I can mention
MANY cases from my real projects - all proprietary, unfortunately, so you
must take my word for it. For example, I have a large J2EE 1.4 app that
scales up to ~150 complex transactions per second (significant business
code plus database, JMS, etc.) on a 2-node appserver cluster, with full
clustering and XA consistency. The app is distributed via EJB, passing
each transaction's state (up to a few dozen KB) through multiple EARs. I
measured the overhead of these dispatches to be very significant, so I had
to optimize the serialization of many classes in the transaction state
with custom readObject/writeObject methods. I also reimplemented
ByteArrayInputStream and ByteArrayOutputStream, because these were used
heavily, and dirty tricks such as eliminating synchronization and adding a
method that returns the internal byte[] buffer (no defensive copy)
delivered significant gains in both CPU time and global GC overhead.
(Granted, I had to support the old IBM JDK 1.4.2 from WebSphere 6.0 - its
GC, for one thing, is miserable by today's standards; relying on a more
modern JVM could have avoided the need for *some* of these low-level
optimizations.)
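
The stream trick was roughly of the following shape (a sketch from memory
with made-up names, not the actual proprietary code):

    import java.io.OutputStream;
    import java.util.Arrays;

    // Unsynchronized ByteArrayOutputStream variant that exposes its
    // internal buffer instead of copying it.
    final class FastByteArrayOutputStream extends OutputStream {
        private byte[] buf = new byte[256];
        private int count;

        public void write(int b) {   // not synchronized, unlike the JDK's
            ensureCapacity(count + 1);
            buf[count++] = (byte) b;
        }

        public void write(byte[] b, int off, int len) {
            ensureCapacity(count + len);
            System.arraycopy(b, off, buf, count, len);
            count += len;
        }

        private void ensureCapacity(int min) {
            if (min > buf.length) {
                buf = Arrays.copyOf(buf, Math.max(min, buf.length * 2));
            }
        }

        // Returns the live internal array, no defensive copy; the caller
        // only reads bytes [0, size()) and must not write to it.
        public byte[] internalBuffer() { return buf; }
        public int size() { return count; }
    }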


>
>> See, I'm NOT proposing to twist the language design around such things.
>>
>
> and yet in the next paragraph you propose considerably complicating the
> code generated by the foreach loop depending on the compile-time type of
> the iterable expression. This would introduce a bunch of new puzzlers and
> potential for stuff to break when folks update to new libraries. In other
> words, you *ARE* proposing to twist the language design around such
> things.
>

I'm not doing that. I am proposing an easy optimization that generates
code no more complex than the Iterator-based code. There aren't any new
puzzlers or compatibility issues (see below). There are refactoring tools,
like the Eclipse IDE's, that can automatically convert standard for loops
into enhanced-for (when possible, e.g. no funny tricks with the index
variable), and these are safe refactorings (no risk of changing program
behavior); indeed, I have applied them to large codebases as part of
Java 5 migration efforts, without any issues. Here the onus of proof is
yours.
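
Concretely, the whole proposal amounts to choosing between two
desugarings. A hand-written sketch of both (the indexed form is the
hypothetical proposed one, not what javac emits):

    import java.util.Iterator;
    import java.util.List;

    class Desugarings {
        // What javac emits today for "for (int x : list)", per the spec:
        static int sumToday(List<Integer> list) {
            int sum = 0;
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                sum += it.next();   // Iterator allocation + virtual calls
            }
            return sum;
        }

        // Proposed form when the static type is a RandomAccess list:
        static int sumProposed(List<Integer> list) {
            int sum = 0;
            for (int i = 0, n = list.size(); i < n; i++) {
                sum += list.get(i); // no Iterator allocation
            }
            return sum;
        }
    }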


>> I'm still waiting for a justification.
>>
>
> Well, now you know why. There's also the issue of
> ConcurrentModificationException, which is much, much more difficult to
> track when there isn't an Iterator object involved, if you needed another
> reason. Trying to accommodate your micro-optimizations here would have
> complicated everything.
>

No, you didn't provide any argument here; you just claimed that the
proposed javac optimization would be difficult (obviously wrong) or would
introduce some problem. ConcurrentModificationException *should* not be a
problem; unfortunately the JLS specifies enhanced-for explicitly as a
straight desugaring to Iterator for any Iterable (14.14.2), so you can
build twisted testcases, including code that relies on CME being thrown,
or code that relies on user-provided collections whose hasNext()/next() or
size()/get(int) methods have side effects. These would be just
backwards-compatibility issues, because the spec wasn't good in the first
place. For another thing, the same syntax already applies to primitive
arrays but using plain indexing (it's just not very visible because arrays
cannot change structurally, so that code is exception-free). The JLS could
simply add a third case for Iterables that implement RandomAccess; these
would be implicitly free from CME (but could instead throw
IndexOutOfBoundsException, or iterate the same elements repeatedly, etc.,
if the collection is structurally changed after the loop starts). Yes,
that would add yet another special case to learn (if you care about
precise behavior), but the current spec is already counter-intuitive:
Java developers quickly learn that indexed iteration of ArrayList and
friends is faster than using an Iterator (and will never throw CME!), and
then enhanced-for comes along and breaks that intuition.
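
A contrived testcase showing where the two desugarings would diverge (the
behavior of the indexed form is the hypothetical one sketched above):

    import java.util.ArrayList;
    import java.util.List;

    class CmeDemo {
        public static void main(String[] args) {
            List<String> xs = new ArrayList<String>();
            xs.add("a"); xs.add("b"); xs.add("c");
            for (String s : xs) {
                // Structural modification during iteration:
                if (s.equals("a")) {
                    xs.remove("c");
                }
                // Today's Iterator desugaring: the next call to next()
                // throws ConcurrentModificationException.
                // The indexed desugaring (size cached up front) would
                // instead run off the shrunken list and throw
                // IndexOutOfBoundsException.
            }
        }
    }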


>> And don't come with cheap talk of "may be eliminated by HotSpot" (EA /
>> scalar replacement).
>>
>
> Generational garbage collection and HotSpot aren't cheap talk. Just
> compare Java 1.0 with Java 1.6.
>

I've done that; see for example a couple of bug reports I filed recently
for the upcoming G1 collector. I deal with and study this stuff all the
time, both for fun and out of direct professional need, so I'm aware of
the many issues we still have with GC. This doesn't mean GC hasn't gotten
incredibly better over the last 15 years; I keep on top of the latest
enhancements from JikesRVM, Azul Systems, IBM, Sun - everybody who
publishes their research. But the challenges are "upgraded" every year,
too.


>> You see, Java is not Groovy or Scala or Ruby or Clojure or even JavaFX
>> Script.
>>
>
> Are these platitudes helping the discussion forward? No, of course not.
> What's your point?
>

In this case the point should be obvious - don't treat Java as a
higher-level language like Ruby, for which something like an extra useless
object allocation is arguably much less significant. (A very gross
Brazilian joke goes like this: "What is a small fart, if you've already
crapped your pants?") Java is an important application development
language, but it's also a relatively low-level one for that role (by
today's standards), and it's an increasingly important systems programming
language. We cannot design the next release of Java ignoring all its
important uses and focusing only on people who will write next year's
Java EE 6 / SOA applications. (And I say that as a developer who spends
>90% of his time on these kinds of apps.)

A+
Osvaldo


