java2d performance java7 / java8
DRC
dcommander at users.sourceforge.net
Thu Feb 12 05:28:38 UTC 2015
On 2/10/15 7:52 AM, Sergey Bylokhov wrote:
> You can run this test on jdk 8u31 and 8u40 to see a difference:
> http://cr.openjdk.java.net/~serb/8029253/webrev.04/test/java/awt/image/DrawImage/UnmanagedDrawImagePerformance.java.html
>
> And the test from this bug report:
> https://bugs.openjdk.java.net/browse/JDK-8017247
After looking at those tests, I can say that they are definitely not
related to the issue I'm seeing here. Although the TurboVNC Viewer (my
application) does use bilinear interpolation if desktop scaling is
enabled, that is not the "common" use case. Normally, it's just going
to be drawing a BufferedImage with no interpolation, so that at least
clarifies that I shouldn't expect any different behavior with Java 9.
The question now becomes how best to take advantage of the OpenGL
pipeline.
As you pointed out (and I agree, based on my research), reducing the
software-to-surface blits is key, although I don't have a firm grasp on
how to do that. My code is basically just doing the following:

  public void paintComponent(Graphics g) {
    Graphics2D g2 = (Graphics2D) g;
    if (scaling enabled) {
      g2.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                          RenderingHints.VALUE_INTERPOLATION_BILINEAR);
      g2.drawImage(im.getImage(), 0, 0, scaledWidth, scaledHeight, null);
    } else {
      g2.drawImage(im.getImage(), 0, 0, null);
    }
    g2.dispose();
  }

  public void updateWindow() {
    Rect r = damage;
    if (!r.isEmpty()) {
      if (scaling enabled) {
        // blah blah blah (adjust coordinates, mainly)
        paintImmediately(x, y, width, height);
      } else {
        paintImmediately(x, y, width, height);
      }
      damage.clear();
    }
  }

As VNC rectangles from the server are decoded, the "damage" rectangle
gets updated to reflect the extent of the "damaged" pixels, and that
extent is passed into paintImmediately(). In examining the OpenJDK
source, however, it appears that glDrawPixels() is always called with
the full extent of the BufferedImage, regardless of whether only a small
portion of that image has actually changed. If there is something else
I can do to help debug this, please let me know. I have a working JDK
build. I fully admit that I may be doing something wrong or
suboptimally, but bear in mind that I've spent probably over 100 hours
on this, so it's not as if I'm a naive n00b here. If there's something
I'm missing, then trust me that it isn't obvious!
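
For illustration, the damage tracking described above might be sketched as follows. This is a minimal sketch, not the TurboVNC code: the class DamageSketch and its methods addDamage(), takeDamage(), and drawRegion() are hypothetical names, and it uses java.awt.Rectangle rather than the viewer's own Rect class. drawRegion() uses the ten-argument Graphics.drawImage() overload to name the damaged source sub-rectangle explicitly. Whether the OpenGL pipeline then uploads only that sub-rectangle is an assumption that has not been verified here; the observation about glDrawPixels() above suggests that it may not.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Hypothetical helper: accumulates damage and draws only that region. */
final class DamageSketch {
    // Empty (0 x 0) until the first rectangle is added.
    private final Rectangle damage = new Rectangle();

    /** Grow the damage extent to cover a newly decoded rectangle. */
    synchronized void addDamage(int x, int y, int w, int h) {
        Rectangle r = new Rectangle(x, y, w, h);
        if (r.isEmpty()) return;
        // Rectangle.union() treats a 0 x 0 rectangle as a point at its
        // origin, so the empty case is handled explicitly.
        if (damage.isEmpty()) damage.setBounds(r);
        else damage.setBounds(damage.union(r));
    }

    /** Return the accumulated extent and reset it to empty. */
    synchronized Rectangle takeDamage() {
        Rectangle r = new Rectangle(damage);
        damage.setBounds(0, 0, 0, 0);
        return r;
    }

    /** Draw only the damaged part of src, unscaled, at the same position. */
    static void drawRegion(Graphics2D g2, BufferedImage src, Rectangle r) {
        Rectangle c = r.intersection(
            new Rectangle(0, 0, src.getWidth(), src.getHeight()));
        if (c.isEmpty()) return;
        int x2 = c.x + c.width, y2 = c.y + c.height;
        // Destination corners first, then source corners.
        g2.drawImage(src, c.x, c.y, x2, y2, c.x, c.y, x2, y2, null);
    }
}
```

In a component, the rectangle returned by takeDamage() would be the one handed to paintImmediately(), and paintComponent() could pass the Graphics clip bounds to drawRegion() instead of drawing the whole image.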
> Can you share standalone jar file of this workload?
Here is everything you need to reproduce the issue:
http://www.turbovnc.org/turbovnc_mac_performance_stuff.tar.gz
Untar, then do

    cd turbovnc_mac_performance_stuff
    java -server -d64 -Dsun.java2d.trace=count -cp VncViewer.jar com.turbovnc.vncviewer.ImageDrawTest

(let it run for 20 seconds or so, then CTRL-C it.)

    java -server -d64 -jar VncViewer.jar -bench compilation-16.rfb -benchiter 3 -benchwarmup 2

(let it run to completion.)

Results from Java 6u51 on my Mac Mini (2009 vintage, 2 GHz Intel Core
Duo, nVidia GeForce 9400):

  ImageDrawTest: ~100 Mpixels/sec
    (all calls are to
    sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, IntArgbPre))
  compilation-16: Average 1.392763 s
    (Decode = 0.198173 s, Blit = 1.005974 s)

Results from Java 8u31 on my Mac Mini:

  ImageDrawTest: ~70 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface
    (render-to-texture)", AnyAlpha, "OpenGL Surface") and
    sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha,
    "OpenGL Surface"))
  compilation-16: Average 6.216550 s
    (Decode = 0.194989 s, Blit = 5.534781 s)

Results from Java 8u31 on my Mac Mini without alpha-enabled image
(-Dturbovnc.forcealpha=false):

  ImageDrawTest: ~18 Mpixels/sec
    (Calls are split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface
    (render-to-texture)", AnyAlpha, "OpenGL Surface") and
    sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha,
    "OpenGL Surface"))
  compilation-16: Average 27.153480 s
    (Decode = 0.200333 s, Blit = 26.523137 s)

So, as you can see, using an alpha-enabled image improved the
performance under Java 7/8 by about 4x, both when drawing large images
(ImageDrawTest) and when doing smaller image updates (compilation-16.)
However, the blitting performance under Java 7/8 for small image
workloads is still about 5x slower than it was under Java 6. Results
from a different machine:

Results from Java 6u51 on my Macbook Pro (2011 vintage, 2.4 GHz Intel
Core i5, Intel HD Graphics 3000):

  ImageDrawTest: ~100 Mpixels/sec
    (all calls to
    sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, IntArgbPre))
  compilation-16: Average 0.592772 s
    (Decode = 0.113879 s, Blit = 0.351596 s)

Results from Java 8u31 on my Macbook Pro:

  ImageDrawTest: ~66 Mpixels/sec
    (Calls split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface
    (render-to-texture)", AnyAlpha, "OpenGL Surface") and
    sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha,
    "OpenGL Surface"))
  compilation-16: Average 6.806324 s
    (Decode = 0.188252 s, Blit = 6.457852 s)

Results from Java 8u31 on my Macbook Pro without alpha-enabled image
(-Dturbovnc.forcealpha=false):

  ImageDrawTest: ~50 Mpixels/sec
    (Calls split between
    sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL Surface
    (render-to-texture)", AnyAlpha, "OpenGL Surface") and
    sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha,
    "OpenGL Surface"))
  compilation-16: Average 10.272508 s
    (Decode = 0.147805 s, Blit = 9.955666 s)

Using an ARGB_PRE BufferedImage didn't help out nearly as much on this
machine, and whereas the large image performance looks similar to that
of the Mac Mini, the small image blitting performance still suffers by
nearly a factor of 20 (although it is improved-- before the use of
ARGB_PRE images, it was about a factor of 30 slower.)
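
To make "alpha-enabled image" concrete, here is a minimal sketch of a decode target of type BufferedImage.TYPE_INT_ARGB_PRE. It is not the TurboVNC code: the class ArgbPreSketch and its methods create() and putOpaque() are hypothetical names. The sketch assumes that decoded pixels arrive as packed 0xRRGGBB ints, that every pixel is fully opaque (so premultiplied and non-premultiplied values are identical), and that the image was freshly constructed, so its scanline stride equals its width. Writing through the DataBufferInt array is an illustration choice; such direct access is generally understood to leave the image "unmanaged", which is the case that the UnmanagedDrawImagePerformance test quoted above appears to target.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

/** Hypothetical helper: a premultiplied-ARGB decode target. */
final class ArgbPreSketch {
    /** Create a premultiplied-alpha image of the given size. */
    static BufferedImage create(int w, int h) {
        return new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB_PRE);
    }

    /**
     * Store a w x h block of decoded 0xRRGGBB pixels at (x, y) as fully
     * opaque ARGB. Assumes stride == image width (true for an image
     * built with the constructor used in create()).
     */
    static void putOpaque(BufferedImage im, int x, int y, int w, int h,
                          int[] rgb) {
        int[] dst = ((DataBufferInt) im.getRaster().getDataBuffer()).getData();
        int stride = im.getWidth();
        for (int row = 0; row < h; row++) {
            for (int col = 0; col < w; col++) {
                // Alpha 0xFF means no premultiplication is needed.
                dst[(y + row) * stride + x + col] =
                    0xFF000000 | rgb[row * w + col];
            }
        }
    }
}
```

With this layout, the image type reported in the trace output would be IntArgbPre rather than IntRgb, which corresponds to the two Java 8u31 configurations compared above.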
The architecture of this solution makes the use of VolatileImages
impractical-- basically, I have to decode the VNC rectangles in real
time as they arrive, so if the VolatileImage were to go away, I would
have no way of rebuilding it.