java2d performance java7 / java8

Sergey Bylokhov Sergey.Bylokhov at oracle.com
Tue Feb 17 14:01:33 UTC 2015


Hello,
Thanks for the information! I am able to reproduce this bug on Windows
as well (GDI vs. OpenGL pipeline). I will take a look at it.

On 12.02.2015 8:28, DRC wrote:
> On 2/10/15 7:52 AM, Sergey Bylokhov wrote:
>> You can run this test on jdk 8u31 and 8u40 to see a difference:
>> http://cr.openjdk.java.net/~serb/8029253/webrev.04/test/java/awt/image/DrawImage/UnmanagedDrawImagePerformance.java.html 
>>
>>
>> And the test from this bug report:
>> https://bugs.openjdk.java.net/browse/JDK-8017247
>
> After looking at those tests, they are definitely not related to the 
> issue I'm seeing here.  Although the TurboVNC Viewer (my application) 
> does use bilinear interpolation if desktop scaling is enabled, that is 
> not the "common" usage case.  Normally, it's just going to be drawing 
> a BufferedImage with no interpolation, so that at least clarifies that 
> I shouldn't be expecting any different behavior with Java 9.  The 
> question now becomes:  how to optimally take advantage of the OpenGL 
> pipeline. As you pointed out (and I agree, based on my research) 
> reducing the software-to-surface blits is key, although I don't have a 
> firm grasp on how to do that.  My code is basically just doing the 
> following:
>
>   public void paintComponent(Graphics g) {
>     Graphics2D g2 = (Graphics2D) g;
>     if (scalingEnabled) {
>       g2.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
>           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
>       g2.drawImage(im.getImage(), 0, 0, scaledWidth, scaledHeight, null);
>     } else {
>       g2.drawImage(im.getImage(), 0, 0, null);
>     }
>     g2.dispose();
>   }
>
>   public void updateWindow() {
>     Rect r = damage;
>     if (!r.isEmpty()) {
>       if (scalingEnabled) {
>         // adjust coordinates for scaling (details omitted)
>         paintImmediately(x, y, width, height);
>       } else {
>         paintImmediately(x, y, width, height);
>       }
>       damage.clear();
>     }
>   }
>
> As VNC rectangles from the server are decoded, the "damage" rectangle 
> gets updated to reflect the extent of the "damaged" pixels, and that 
> extent is passed into paintImmediately().  In examining the OpenJDK 
> source, however, it appears that glDrawPixels() is always called with 
> the full extent of the BufferedImage, regardless of whether only a 
> small portion of that image has actually changed.  If there is 
> something else I can do to help debug this, please let me know.  I 
> have a working JDK build.  I fully admit that I may be doing something 
> wrong or suboptimally, but bear in mind that I've spent probably over 
> 100 hours on this, so it's not as if I'm a naive n00b here.  If 
> there's something I'm missing, then trust me that it isn't obvious!
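
A minimal sketch of one way to restrict the blit to the damaged region, under stated assumptions. The Graphics passed to paintComponent() after paintImmediately(x, y, width, height) normally carries a clip covering the repainted rectangle, so the clip can be intersected with the image bounds and only that sub-rectangle copied, using the drawImage() overload that takes source and destination corners. The names DamagePaintSketch and paintDamaged are hypothetical and are not taken from the TurboVNC source. Whether this reduces the amount of pixel data the OpenGL pipeline uploads has not been verified here.

```java
import java.awt.Graphics;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

final class DamagePaintSketch {
    /**
     * Draws only the part of src that intersects the current clip of g.
     * Returns the rectangle that was actually drawn (empty if nothing was).
     */
    static Rectangle paintDamaged(Graphics g, BufferedImage src) {
        Rectangle bounds = new Rectangle(0, 0, src.getWidth(), src.getHeight());
        Rectangle clip = g.getClipBounds();          // null means "no clip set"
        Rectangle r = (clip == null) ? bounds : bounds.intersection(clip);
        if (r.isEmpty()) {
            return r;                                // nothing visible to draw
        }
        // Same source and destination rectangle: an unscaled copy of the
        // damaged region only.
        g.drawImage(src,
                    r.x, r.y, r.x + r.width, r.y + r.height,
                    r.x, r.y, r.x + r.width, r.y + r.height,
                    null);
        return r;
    }
}
```

In a component, paintComponent(Graphics g) would call paintDamaged(g, im.getImage()) in the unscaled branch. The scaled branch would need the clip mapped back into image coordinates first.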
>
>
>> Can you share standalone jar file of this workload?
>
> Here is everything you need to reproduce the issue:
> http://www.turbovnc.org/turbovnc_mac_performance_stuff.tar.gz
>
> Untar, then do
>> cd turbovnc_mac_performance_stuff
>> java -server -d64 -Dsun.java2d.trace=count -cp VncViewer.jar 
>> com.turbovnc.vncviewer.ImageDrawTest
>   (let it run for 20 seconds or so, then CTRL-C it.)
>> java -server -d64 -jar VncViewer.jar -bench compilation-16.rfb 
>> -benchiter 3 -benchwarmup 2
>   (let it run to completion.)
>
>   Results from Java 6u51 on my Mac Mini (2009 vintage, 2 GHz Intel 
> Core Duo, nVidia GeForce 9400):
>   ImageDrawTest:   ~100 Mpixels/sec
>     (all calls are to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, 
> IntArgbPre))
>   compilation-16:  Average 1.392763 s (Decode = 0.198173 s, Blit = 
> 1.005974 s)
>
>   Results from Java 8u31 on my Mac Mini:
>   ImageDrawTest:   ~70 Mpixels/sec
>     (Calls are split between
>      sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL 
> Surface (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>      sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, 
> "OpenGL Surface"))
>   compilation-16:  Average 6.216550 s (Decode = 0.194989 s, Blit = 
> 5.534781 s)
>
>   Results from Java 8u31 on my Mac Mini without alpha-enabled image 
> (-Dturbovnc.forcealpha=false):
>   ImageDrawTest:   ~18 Mpixels/sec
>     (Calls are split between:
>      sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL 
> Surface (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>      sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, 
> "OpenGL Surface"))
>   compilation-16:  Average 27.153480 s (Decode = 0.200333 s, Blit = 
> 26.523137 s)
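
For reference, the IntArgbPre vs. IntRgb difference in the trace output above corresponds to the BufferedImage type. The following is a sketch of allocating an opaque, alpha-premultiplied image, assuming the pixel data is otherwise plain RGB. OpaqueArgbPre, create and putRgb are hypothetical names used for illustration. How the viewer itself allocates and fills its image is not shown in this thread.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class OpaqueArgbPre {
    /** Allocates a TYPE_INT_ARGB_PRE image and fills it with opaque black. */
    static BufferedImage create(int width, int height) {
        BufferedImage img =
            new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB_PRE);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.BLACK);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose();   // we created this Graphics, so we dispose of it
        }
        return img;
    }

    /** Stores an RGB pixel with the alpha channel forced to fully opaque. */
    static void putRgb(BufferedImage img, int x, int y, int rgb) {
        img.setRGB(x, y, 0xFF000000 | rgb);
    }
}
```

With alpha fixed at 255, premultiplied and non-premultiplied color values are identical, so decoded RGB pixels can be stored without any extra per-pixel arithmetic.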
>
> So, as you can see, using an alpha-enabled image improved the 
> performance under Java 7/8 by about 4x, both when drawing large images 
> (ImageDrawTest) and when doing smaller image updates (compilation-16.) 
> However, the blitting performance under Java 7/8 for small image 
> workloads is still about 5x slower than it was under Java 6.  Results 
> from a different machine:
>
>   Results from Java 6u51 on my Macbook Pro (2011 vintage, 2.4 GHz 
> Intel Core i5, Intel HD Graphics 3000):
>   ImageDrawTest:   ~100 Mpixels/sec
>     (all calls to sun.java2d.loops.Blit::Blit(IntRgb, SrcNoEa, 
> IntArgbPre))
>   compilation-16:  Average 0.592772 s (Decode = 0.113879 s, Blit = 
> 0.351596 s)
>
>   Results from Java 8u31 on my Macbook Pro:
>   ImageDrawTest:   ~66 Mpixels/sec
>     (Calls split between
>      sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL 
> Surface (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>      sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntArgbPre, AnyAlpha, 
> "OpenGL Surface"))
>   compilation-16:  Average 6.806324 s (Decode = 0.188252 s, Blit = 
> 6.457852 s)
>
>   Results from Java 8u31 on my Macbook Pro without alpha-enabled image 
> (-Dturbovnc.forcealpha=false):
>   ImageDrawTest:   ~50 Mpixels/sec
>     (Calls split between
>      sun.java2d.opengl.OGLRTTSurfaceToSurfaceBlit::Blit("OpenGL 
> Surface (render-to-texture)", AnyAlpha, "OpenGL Surface") and
>      sun.java2d.opengl.OGLSwToSurfaceBlit::Blit(IntRgb, AnyAlpha, 
> "OpenGL Surface"))
>   compilation-16:  Average 10.272508 s (Decode = 0.147805 s, Blit = 
> 9.955666 s)
>
> Using an ARGB_PRE BufferedImage didn't help out nearly as much on this 
> machine, and whereas the large image performance looks similar to that 
> of the Mac Mini, the small image blitting performance still suffers by 
> nearly a factor of 20 (although it is improved-- before the use of 
> ARGB_PRE images, it was about a factor of 30 slower.)
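
The slowdown factors quoted above can be checked directly from the compilation-16 Blit timings. The small program below is only an arithmetic check. The class name BlitSlowdown is hypothetical, and the constants are copied from the results listed in this message.

```java
final class BlitSlowdown {
    /** Ratio of a slower timing to a faster one, both in seconds. */
    static double ratio(double slowSeconds, double fastSeconds) {
        return slowSeconds / fastSeconds;
    }

    public static void main(String[] args) {
        // Mac Mini, compilation-16 Blit times (seconds)
        double miniJava6 = 1.005974;
        double miniJava8Alpha = 5.534781;
        double miniJava8NoAlpha = 26.523137;
        // MacBook Pro, compilation-16 Blit times (seconds)
        double mbpJava6 = 0.351596;
        double mbpJava8Alpha = 6.457852;
        double mbpJava8NoAlpha = 9.955666;

        System.out.printf("Mac Mini, 8u31 alpha vs 6u51:       %.1fx%n",
                          ratio(miniJava8Alpha, miniJava6));
        System.out.printf("Mac Mini, 8u31 no alpha vs alpha:   %.1fx%n",
                          ratio(miniJava8NoAlpha, miniJava8Alpha));
        System.out.printf("MacBook Pro, 8u31 alpha vs 6u51:    %.1fx%n",
                          ratio(mbpJava8Alpha, mbpJava6));
        System.out.printf("MacBook Pro, 8u31 no alpha vs 6u51: %.1fx%n",
                          ratio(mbpJava8NoAlpha, mbpJava6));
    }
}
```

These work out to roughly 5.5x, 4.8x, 18.4x and 28.3x, which is consistent with the "about 5x", "about 4x", "nearly a factor of 20" and "about a factor of 30" figures in the quoted text.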
>
> The architecture of this solution makes the use of VolatileImages 
> impractical-- basically, I have to decode the VNC rectangles in real 
> time as they arrive, so if the VolatileImage were to go away, I would 
> have no way of rebuilding it.


-- 
Best regards, Sergey.



More information about the macosx-port-dev mailing list