Canvas performance on Mac OS

Mike mikegps1 at gmail.com
Thu Apr 9 16:11:37 UTC 2015


This is important 
Thanks guys 

Sent from my iPhone

> On Apr 8, 2015, at 9:25 AM, Chris Newland <cnewland at chrisnewland.com> wrote:
> 
> Hi Jim,
> 
> I'll post the verbose prism output from my iMac when I get home.
> 
> Just tried this on my Linux workstation and the performance gap is the
> same between es2 and sw so I don't think it's an OSX issue.
> 
> uname -a
> Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux
> 
> "$JAVA_HOME/bin/java" -classpath target/DemoFX.jar
> com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 20
> fps: 31
> fps: 32
> fps: 33
> fps: 35
> fps: 34
> fps: 33
> 
> "$JAVA_HOME/bin/java" -Dprism.order=sw -classpath target/DemoFX.jar
> com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 54
> fps: 56
> fps: 60
> fps: 59
> fps: 60
> fps: 61
> fps: 61
> fps: 60
> 
> This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
> graphics card running driver 304.125
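The "fps:" figures above are once-per-second frame counts. A minimal sketch of how such a number could be produced is given here; it is not the DemoFX code, and the class name `FpsCounter` and method `onFrame` are made up for illustration. It assumes a nanosecond timestamp is supplied once per frame, which is what JavaFX passes to `AnimationTimer.handle(long now)`.

```java
/** Hypothetical helper: counts frames and reports the total once per second. */
final class FpsCounter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStart = -1; // timestamp that opened the current window
    private int frames;            // frames seen since windowStart

    /**
     * Call once per rendered frame with a nanosecond timestamp.
     * Returns the number of frames in the window that just closed,
     * or -1 while the current one-second window is still open.
     */
    int onFrame(long nowNanos) {
        if (windowStart < 0) {
            windowStart = nowNanos; // the first frame only opens the window
            return -1;
        }
        frames++;
        if (nowNanos - windowStart >= ONE_SECOND_NANOS) {
            int fps = frames;
            frames = 0;
            windowStart = nowNanos;
            return fps;
        }
        return -1;
    }
}
```

In a JavaFX application this would typically be driven from an `AnimationTimer` subclass, printing the value whenever `onFrame(now)` returns something other than -1.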
> 
> Regards,
> 
> Chris
> 
> 
>> On Wed, April 8, 2015 00:16, Jim Graham wrote:
>> OK, I took the time to put my rMBP on a diet yesterday and find room to
>> install a 10.10 partition.  I get the same numbers for Sierpinski on 10.10,
>> so my theory that something changed in the OGL implementation for 10.10
>> doesn't hold water.
>> 
>> But, I then tried it using the integrated graphics.  I get really poor
>> performance using the integrated Intel 4000 graphics, but I get great
>> numbers on the discrete nVidia 650m.  It makes sense that the Intel
>> graphics wouldn't be as powerful as the discrete graphics, but we
>> shouldn't be taxing it that much to make that big of a difference.
>> 
>> Just to be sure - is that iMac a dual graphics system, or is it
>> all-AMD-all-the-time?  You can see which GPU is being used if you run it
>> with -Dprism.verbose=true...
>> 
>> ...jim
>> 
>> 
>>> On 4/2/15 4:13 PM, Jim Graham wrote:
>>> 
>>> On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw.  Are you
>>> running a newer version of MacOS?
>>> 
>>> ...jim
>>> 
>>> 
>>>> On 3/31/15 3:40 PM, Chris Newland wrote:
>>>> 
>>>> Hi Hervé,
>>>> 
>>>> 
>>>> That's a valid question :)
>>>> 
>>>> 
>>>> Probably because
>>>> 
>>>> 
>>>> a) All my non-UI graphics experience is with immediate-mode / raster
>>>> systems
>>>> 
>>>> b) I'm interested in using JavaFX for particle effects / demoscene /
>>>> gaming so assumed (perhaps wrongly?) that scenegraph was not the way
>>>> to go for that due to the very large number of nodes.
>>>> 
>>>> Numbers for my Sierpinski filled triangle example:
>>>> 
>>>> 
>>>> System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M
>>>> 1024 MB
>>>> 
>>>> 
>>>> java -Dprism.order=es2 -cp target/classes/ com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 23
>>>> fps: 18
>>>> fps: 25
>>>> fps: 18
>>>> fps: 23
>>>> fps: 23
>>>> fps: 19
>>>> fps: 25
>>>> 
>>>> 
>>>> java -Dprism.order=sw -cp target/classes/ com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 54
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> 
>>>> 
>>>> There are never more than 2500 filled triangles on screen. JDK is
>>>> 1.8.0_40
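To give a sense of the workload, here is a minimal sketch of a recursive Sierpinski subdivision that hands each leaf triangle to a fill callback. It is not taken from the linked Sierpinski.java; the names `SierpinskiSketch`, `PolygonSink` and `subdivide` are hypothetical. The callback has the same parameter list as `GraphicsContext.fillPolygon(double[], double[], int)`, so in a JavaFX program `gc::fillPolygon` could be passed in. A depth of 7 yields 3^7 = 2187 triangles, which is within the 2500 mentioned above.

```java
/** Hypothetical sketch: generates Sierpinski leaf triangles for a fill callback. */
final class SierpinskiSketch {

    /** Same parameter list as GraphicsContext.fillPolygon (assumed target). */
    interface PolygonSink {
        void fill(double[] xPoints, double[] yPoints, int nPoints);
    }

    /**
     * Splits triangle ABC into its three corner triangles until depth reaches 0,
     * then fills each leaf. Returns the number of triangles filled.
     */
    static int subdivide(double ax, double ay, double bx, double by,
                         double cx, double cy, int depth, PolygonSink sink) {
        if (depth == 0) {
            sink.fill(new double[] {ax, bx, cx}, new double[] {ay, by, cy}, 3);
            return 1;
        }
        // Midpoints of the three edges
        double abx = (ax + bx) / 2, aby = (ay + by) / 2;
        double bcx = (bx + cx) / 2, bcy = (by + cy) / 2;
        double cax = (cx + ax) / 2, cay = (cy + ay) / 2;
        // Recurse into the corner triangles; the middle one is left empty
        return subdivide(ax, ay, abx, aby, cax, cay, depth - 1, sink)
             + subdivide(abx, aby, bx, by, bcx, bcy, depth - 1, sink)
             + subdivide(cax, cay, bcx, bcy, cx, cy, depth - 1, sink);
    }
}
```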
>>>> 
>>>> 
>>>> I would say there is a performance problem here? (or at least a need
>>>> for documentation so as to set expectations for gc.fillPolygon).
>>>> 
>>>> Best regards,
>>>> 
>>>> 
>>>> Chris
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Tue, March 31, 2015 22:00, Hervé Girod wrote:
>>>>> 
>>>>> Why don't you use Nodes rather than Canvas ?
>>>>> 
>>>>> 
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Mar 31, 2015, at 22:31, Chris Newland
>>>>>> <cnewland at chrisnewland.com>
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Hi Jim,
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks, that makes things much clearer.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> I was surprised how much was going on under the hood of
>>>>>> GraphicsContext and hoped it was just magic glue that gave the
>>>>>> best of GPU acceleration where available and immediate-mode-like
>>>>>> simple rasterizing where not.
>>>>>> 
>>>>>> I've managed to find an anomaly with GraphicsContext.fillPolygon
>>>>>> where the software pipeline achieves the full 60fps but ES2 can
>>>>>> only manage 30-35fps. It uses lots of overlapping filled triangles
>>>>>> so I expect it suffers from the problem you've described.
>>>>>> 
>>>>>> SSCCE:
>>>>>> https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java
>>>>>> 
>>>>>> Was full frame rate canvas drawing an expected use case for
>>>>>> JavaFX or would I be better off with Graphics2D?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Chris
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Mon, March 30, 2015 20:04, Jim Graham wrote:
>>>>>>> Hi Chris,
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> drawLine() is a very simple primitive that can be optimized
>>>>>>> with a GPU shader.  It either looks like a (potentially rotated)
>>>>>>> rectangle or a rounded rect - and we have optimized shaders for
>>>>>>> both cases.  A large number of drawLine() calls turns into simply
>>>>>>> accumulating a large vertex list and uploading it to the GPU
>>>>>>> with an appropriate shader, which is very fast.
>>>>>>> 
>>>>>>> drawPolygon() is a very complex operation that involves things
>>>>>>> like:
>>>>>>> 
>>>>>>> 
>>>>>>> - dealing with line joins between segments that don't exist for
>>>>>>>   drawLine()
>>>>>>> - dealing with only rendering common points of intersection once
>>>>>>> 
>>>>>>> To handle all of that complexity we have to involve a
>>>>>>> rasterizer that takes the entire collection of lines, analyzes
>>>>>>> the stroke attributes and interactions and computes a coverage
>>>>>>> mask for each pixel in the region. We do that in software
>>>>>>> currently for all pipelines.
>>>>>>> 
>>>>>>> For the ES2 pipeline, the line vs. poly difference is dominated
>>>>>>> by pure GPU vs. CPU path rasterization.
>>>>>>> 
>>>>>>> For the SW pipeline, drawLine is a simplified case of
>>>>>>> drawPolygon and so the overhead of lots of calls to drawLine()
>>>>>>> dominates its performance.
>>>>>>> 
>>>>>>> I would expect ES2 to blow the SW pipeline out of the water
>>>>>>> with drawLine() performance (as long as there are no additional
>>>>>>> rendering primitives interspersed in the set of lines).
>>>>>>> 
>>>>>>> But, both should be on the same footing for the drawPolygon
>>>>>>> case. Does the ES2 pipeline perform similarly to (hopefully
>>>>>>> better than) the SW pipeline for the polygon case?
>>>>>>> 
>>>>>>> One thing I noticed is that we have no optimized case for
>>>>>>> drawLine() on the SW pipeline.  It generates a path containing a
>>>>>>> single MOVETO and LINETO and feeds it to the generalized path
>>>>>>> rasterizer when it could instead compute the rounded/square
>>>>>>> rectangle and render it more directly.  If we added that support
>>>>>>> then I'd expect the SW pipeline to perform the set of drawLine
>>>>>>> calls faster than drawPolygon as well...
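The "compute the rectangle" idea can be sketched with a little vector arithmetic. The following is only an illustration of the geometry, not Prism's implementation; the names `LineAsRect` and `corners` are hypothetical. It handles butt and square caps (no rounded caps): the line is offset by half the stroke width along its normal, and for a square cap it is also extended by half the width at each end.

```java
/** Hypothetical sketch: the rectangle covered by a stroked straight line. */
final class LineAsRect {

    /**
     * Returns {x0,y0, x1,y1, x2,y2, x3,y3}, the four corners of the rectangle
     * covered by a line of the given width, or an empty array for a
     * zero-length line (direction undefined).
     */
    static double[] corners(double x1, double y1, double x2, double y2,
                            double lineWidth, boolean squareCap) {
        double dx = x2 - x1, dy = y2 - y1;
        double len = Math.hypot(dx, dy);
        if (len == 0) {
            return new double[0];
        }
        double half = lineWidth / 2;
        double ux = dx / len, uy = dy / len;   // unit vector along the line
        double ext = squareCap ? half : 0;     // square caps extend both ends
        double sx = x1 - ux * ext, sy = y1 - uy * ext;
        double ex = x2 + ux * ext, ey = y2 + uy * ext;
        double nx = -uy * half, ny = ux * half; // normal scaled to half width
        return new double[] {
            sx + nx, sy + ny,
            ex + nx, ey + ny,
            ex - nx, ey - ny,
            sx - nx, sy - ny
        };
    }
}
```

A rectangle like this can be filled directly without building a general path, which is the shortcut being described.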
>>>>>>> 
>>>>>>> ...jim
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On 3/28/15 3:22 AM, Chris Newland wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hi Robert,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I've not filed a Jira yet as I was hoping to find time to
>>>>>>>> investigate thoroughly but when I saw your question I thought
>>>>>>>> I'd better add my findings.
>>>>>>>> 
>>>>>>>> I believe the issue is in the ES2Pipeline as if I run with
>>>>>>>> -Dprism.order=sw then strokePolygon outperforms the series of
>>>>>>>> strokeLine commands as expected:
>>>>>>>> 
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw
>>>>>>>> com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
>>>>>>>> Result:
>>>>>>>> 44fps
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw
>>>>>>>> com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
>>>>>>>> Result:
>>>>>>>> 60fps
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Will see if I can find the root cause as I've got plenty more
>>>>>>>> examples where ES2Pipeline performs horribly on my Mac which
>>>>>>>> should have no problem throwing around a few thousand polys.
>>>>>>>> 
>>>>>>>> I realise there's a *lot* of indirection involved in making
>>>>>>>> JavaFX support such a wide range of underlying graphics systems
>>>>>>>> but I do think there's a bug here.
>>>>>>>> 
>>>>>>>> Will file a Jira if I can contribute a bit more than "feels
>>>>>>>> slow" ;)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Chris
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Sat, March 28, 2015 10:06, Robert Krüger wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This is consistent with what I am observing. Is this
>>>>>>>>> something that Oracle is aware of? Looking at Jira, I don't
>>>>>>>>> see that anyone is working on this:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)
>>>>>>>>> 
>>>>>>>>> Given that one of the main reasons to use JFX for me is to be
>>>>>>>>> able to develop with one code base for at least OSX and Windows,
>>>>>>>>> and the official statement of what JavaFX is for, i.e.
>>>>>>>>> 
>>>>>>>>> "JavaFX is a set of graphics and media packages that
>>>>>>>>> enables developers to design, create, test, debug, and
>>>>>>>>> deploy rich client applications that operate consistently
>>>>>>>>> across diverse platforms"
>>>>>>>>> 
>>>>>>>>> and the fact that this is clearly not the case currently
>>>>>>>>> (8u40), since as soon as I do something other than simple forms
>>>>>>>>> I run into performance/quality problems on the Mac, I am a bit
>>>>>>>>> unsure what to make of all that. Is Mac OSX a second-class
>>>>>>>>> citizen as far as dev resources are concerned?
>>>>>>>>> 
>>>>>>>>> Tobi and Chris, have you filed Jira Issues on Mac graphics
>>>>>>>>> performance that can be tracked?
>>>>>>>>> 
>>>>>>>>> I will file an issue with a simple test case and hope for
>>>>>>>>> the best.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland
>>>>>>>>> <cnewland at chrisnewland.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Possibly related:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I can reproduce a massive (90%) performance drop on OSX
>>>>>>>>>> between drawing a wireframe polygon on a Canvas using a
>>>>>>>>>> series of gc.strokeLine(double x1, double y1, double x2,
>>>>>>>>>> double y2) commands versus using a single
>>>>>>>>>> gc.strokePolygon(double[] xPoints, double[] yPoints, int
>>>>>>>>>> count) command.
>>>>>>>>>> 
>>>>>>>>>> Creating the polygons manually with strokeLine() is
>>>>>>>>>> significantly faster using the ES2Pipeline on OSX.
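For clarity, the two approaches being compared can be sketched as follows. This is an illustration, not the DemoFX benchmark code; `PolygonOutline`, `LineSink` and `strokeAsLines` are hypothetical names. The callback has the same parameter list as `gc.strokeLine(x1, y1, x2, y2)`, so in a JavaFX program `gc::strokeLine` could be passed in, while the single-call alternative is `gc.strokePolygon(xPoints, yPoints, count)`.

```java
/** Hypothetical sketch: draws a closed polygon outline as individual segments. */
final class PolygonOutline {

    /** Same parameter list as GraphicsContext.strokeLine (assumed target). */
    interface LineSink {
        void line(double x1, double y1, double x2, double y2);
    }

    /**
     * Emits one segment per edge, closing the outline back to the first
     * vertex. Returns the number of segments emitted.
     */
    static int strokeAsLines(double[] xPoints, double[] yPoints, int count,
                             LineSink sink) {
        for (int i = 0; i < count; i++) {
            int next = (i + 1) % count; // wrap around to close the polygon
            sink.line(xPoints[i], yPoints[i], xPoints[next], yPoints[next]);
        }
        return count;
    }
}
```

Note that the two are not visually identical in general: separate segments have no line joins between them, whereas a stroked polygon does.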
>>>>>>>>>> 
>>>>>>>>>> This is reproducible in a little GitHub JavaFX
>>>>>>>>>> benchmarking project I've created:
>>>>>>>>>> https://github.com/chriswhocodes/DemoFX
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Build with ant
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Run with:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> # use strokeLine
>>>>>>>>>> ./run.sh -c 5000 -m line
>>>>>>>>>> result: 60 (sixty) fps
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> # use strokePolygon
>>>>>>>>>> ./run.sh -c 5000 -m poly
>>>>>>>>>> result: 6 (six) fps
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> System is 2011 iMac 27" / Mavericks / 3.4GHz Core i7 / 20GB RAM /
>>>>>>>>>> Radeon 6970M 1024MB
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Looking at the code paths in
>>>>>>>>>> javafx.scene.canvas.GraphicsContext:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> gc.strokeLine() maps to writeOp4(x1, y1, x2, y2,
>>>>>>>>>> NGCanvas.STROKE_LINE)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> gc.strokePolygon() maps to writePoly(xPoints, yPoints,
>>>>>>>>>> nPoints, true, NGCanvas.STROKE_PATH) which involves
>>>>>>>>>> significantly more work with adding to and flushing a
>>>>>>>>>> GrowableDataBuffer.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I've not had time to dig any deeper than this but it's
>>>>>>>>>> surely a bug when building a poly manually is 10x faster
>>>>>>>>>> than using the convenience method.
>>>>>>>>>> 
>>>>>>>>>> Cheers,
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Chris
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, March 27, 2015 21:26, Tobias Bley wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> In my opinion the whole graphics performance on MacOSX
>>>>>>>>>>> isn’t good at all with JavaFX….
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> Am 27.03.2015 um 22:10 schrieb Robert Krüger
>>>>>>>>>>>> <krueger at lesspain.de>:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> The bad full screen performance is without the arcs.
>>>>>>>>>>>> It is just one call to fillRect, two to strokeOval and one
>>>>>>>>>>>> to fillOval, that's all. I will build a simple test
>>>>>>>>>>>> case and file an issue.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham
>>>>>>>>>>>> <james.graham at oracle.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Please file a Jira issue with a simple test case.
>>>>>>>>>>>>> Arcs are handled as a generalized shape rather than via a
>>>>>>>>>>>>> predetermined shader, but it shouldn't be that
>>>>>>>>>>>>> slow. Something else may be going on.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Another test might be to replace the arcs with
>>>>>>>>>>>>> rectangles or ellipses and see if the performance
>>>>>>>>>>>>> changes...
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ...jim
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 3/27/15 1:52 PM, Robert Krüger wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I have a super-simple animation implemented using
>>>>>>>>>>>>>> AnimationTimer and Canvas where the canvas just performs a
>>>>>>>>>>>>>> few draw operations, i.e. fills the screen with a color and
>>>>>>>>>>>>>> then draws and fills 2-3 circles, and I have already observed
>>>>>>>>>>>>>> that each drawing operation I add results in significant CPU
>>>>>>>>>>>>>> load (e.g. when I draw < 10 arcs in addition to the circles,
>>>>>>>>>>>>>> the CPU load goes up to 30-40% on a MacBook Pro for a Canvas
>>>>>>>>>>>>>> size of 600x600(!)).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Now I tested the animation in full screen mode (only
>>>>>>>>>>>>>> with a few circles) and playback is unusable for a
>>>>>>>>>>>>>> serious application (very choppy). Is 2D canvas
>>>>>>>>>>>>>> performance known to be very bad on Mac or am I
>>>>>>>>>>>>>> doing something wrong? Are there workarounds for
>>>>>>>>>>>>>> this?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Robert
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Robert Krüger
>>>>>>>>>>>> Managing Partner
>>>>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> www.lesspain-software.com
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Robert Krüger
>>>>>>>>> Managing Partner
>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> www.lesspain-software.com
> 
> 


More information about the openjfx-dev mailing list