Performance of locally copied members ?
David Holmes
David.Holmes at Oracle.Com
Mon May 3 22:27:03 UTC 2010
I've forwarded this to hotspot-compiler-dev.
I know Doug introduced this for final fields because at the time the
compiler was not optimizing their use, but I had thought that issue was
long since resolved at least in C2. If C1 is lagging then we need to see
that it catches up.
There should not be a need to code this way at the Java-level. (Note, as
Martin says sometimes you must copy a field to a local for correctness -
the field might change value but the current code must not see that -
but that's not the case we're concerned with.)
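
As an illustration of the correctness case mentioned in the note above, here is a minimal sketch. The class, field and method names are hypothetical and are not taken from the JDK or from this thread. It assumes a field that another thread may replace at any time, so the method reads it exactly once and then works only with the local copy.

```java
import java.util.Arrays;

public class SnapshotExample {
    // Hypothetical field that another thread may replace at any time.
    private volatile int[] table = {1, 2, 3, 4};

    // Publishes a new array of a different length.
    public void resize(int newLength) {
        table = Arrays.copyOf(table, newLength);
    }

    // Reads the field once, so the bounds check and the element access
    // are guaranteed to refer to the same array instance.
    public int getOrDefault(int index, int fallback) {
        int[] t = table; // single read of the field
        return (index >= 0 && index < t.length) ? t[index] : fallback;
    }

    public static void main(String[] args) {
        SnapshotExample s = new SnapshotExample();
        System.out.println(s.getOrDefault(2, -1)); // prints 3
        s.resize(2);
        System.out.println(s.getOrDefault(2, -1)); // prints -1
    }
}
```

If the method read `table` twice instead (once for the length check and once for the access), a concurrent `resize` between the two reads could make the check and the access disagree.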
David Holmes
Osvaldo Doederlein said the following on 05/04/10 06:13:
> 2010/5/3 Martin Buchholz <martinrb at>
> It's a coding style made popular by Doug Lea.
> It's an extreme optimization that probably isn't necessary;
> you can expect the JIT to make the same optimizations.
> It certainly is necessary, unfortunately. I tested my
> particle/octree-based 3D renderer without this manual optimization
> (printing the FPS every 100 frames, starting at the 10th reading after
> startup):
> JDK 6u21-b03, Hotspot Client:
> 159.4896331738437fps
> 161.29032258064515fps
> 158.73015873015873fps
> 160.0fps
> 159.23566878980893fps
> JDK 6u21-b03, Hotspot Server:
> 197.23865877712032fps
> 204.91803278688525fps
> 196.07843137254903fps
> 200.40080160320642fps
> 198.01980198019803fps
> Now let's cache 8 instance variables into local variables (most final, a
> couple non-final ones too):
> JDK 6u21-b03, Hotspot Client:
> 169.4915254237288fps
> 172.1170395869191fps
> 168.63406408094434fps
> 168.0672268907563fps
> 170.64846416382252fps
> JDK 6u21-b03, Hotspot Server:
> 197.62845849802372fps
> 200.40080160320642fps
> 196.8503937007874fps
> 199.6007984031936fps
> 203.2520325203252fps
> So, the manual optimization makes no difference for HotSpot Server; but
> hell, it does for Client: about 6% better performance in this test. And
> the test is not only the complex, deeply nested rendering loops that use
> those cacheable variables to read the input data and update the output
> pixel and Z buffers; there's also other code that burns significant CPU
> and doesn't use these variables, notably the buffer filling and copying
> steps. This means the speedup in the optimized code itself should be much
> higher than 6%; I only measured and reported the application's
> overall performance.
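
The manual optimization discussed above can be sketched roughly as follows. This is not the renderer's code: the class, its fields and both method names are hypothetical, and the sketch makes no performance claim of its own. It only shows the shape of the idiom, where fields are copied into locals once before a nested loop.

```java
public class LocalCopyExample {
    // Hypothetical fields standing in for a renderer's output buffer.
    private final int[] pixels;
    private final int width;
    private final int height;

    public LocalCopyExample(int width, int height) {
        this.width = width;
        this.height = height;
        this.pixels = new int[width * height];
    }

    // Straightforward version: every access goes through a field of `this`.
    public void fillDirect(int color) {
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                pixels[y * width + x] = color;
            }
        }
    }

    // Manually optimized version: fields are copied to locals once,
    // and the loops only touch the locals.
    public void fillWithLocals(int color) {
        final int[] p = pixels;
        final int w = width;
        final int h = height;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                p[y * w + x] = color;
            }
        }
    }

    // Sum of all pixel values, used to check that both versions agree.
    public long checksum() {
        long sum = 0;
        for (int v : pixels) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        LocalCopyExample img = new LocalCopyExample(4, 3);
        img.fillDirect(2);
        System.out.println(img.checksum()); // prints 24
        img.fillWithLocals(7);
        System.out.println(img.checksum()); // prints 84
    }
}
```

Both methods produce the same result; whether the second one is faster depends on the JIT compiler, which is the point under discussion in this thread.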
> We'll need to deal with HotSpot Client for years to come, not to mention
> smaller platforms (Java ME, JavaFX Mobile & TV) whose JIT compilers are
> even weaker than Java SE's C1. Tuned bytecode is also faster to
> interpret, which benefits warm-up time too. Please keep your dirty
> purist hands off the API code that Doug and others micro-optimized; it
> is necessary. :)
> And my +1 to adding the same optimizations to other perf-critical APIs.
> This is even more important for java.nio, since under C1 it doesn't
> currently benefit from intrinsic compilation of critical DirectBuffer
> methods.
> A+
> Osvaldo
> (you can try to check the machine code yourself!)
> Nevertheless, copying to locals produces the smallest
> bytecode, and for low-level code it's nice to write code
> that's a little closer to the machine.
> Also, optimizations of finals (can cache even across volatile
> reads) could be better. John Rose is working on that.
> For some algorithms in j.u.c,
> copying to a local is necessary for correctness.
> Martin
> On Mon, May 3, 2010 at 04:40, Ulf Zibis <Ulf.Zibis at> wrote:
> > Hi,
> >
> > In class String I often see member variables copied to local variables.
> > In java.nio.Buffer I don't see that (e.g. for "position" in
> > nextPutIndex(int nb)).
> > Now I'm wondering.
> >
> > From the JMM (Java Memory Model) I learned that the JVM can hold
> > non-volatile variables in a per-thread cache, so a few of them may
> > even be kept in CPU registers.
> > Knowing this, I don't understand why the local caching is done manually
> > in String (and many other classes) instead of trusting the JVM.
> >
> > Can anybody help me understand this?
> >
> > -Ulf