Canvas performance on Mac OS
Mike
mikegps1 at gmail.com
Thu Apr 9 16:11:37 UTC 2015
This is important
Thanks guys
Sent from my iPhone
> On Apr 8, 2015, at 9:25 AM, Chris Newland <cnewland at chrisnewland.com> wrote:
>
> Hi Jim,
>
> I'll post the verbose prism output from my iMac when I get home.
>
> Just tried this on my Linux workstation and the performance gap is the
> same between es2 and sw so I don't think it's an OSX issue.
>
> uname -a
> Linux chris 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux
>
> "$JAVA_HOME/bin/java" -classpath target/DemoFX.jar
> com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 20
> fps: 31
> fps: 32
> fps: 33
> fps: 35
> fps: 34
> fps: 33
>
> "$JAVA_HOME/bin/java" -Dprism.order=sw -classpath target/DemoFX.jar
> com.chrisnewland.demofx.standalone.Sierpinski
> fps: 1
> fps: 54
> fps: 56
> fps: 60
> fps: 59
> fps: 60
> fps: 61
> fps: 61
> fps: 60
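The thread does not show how the demo produces its once-per-second fps lines. The following is only a minimal sketch of one way to do it, assuming frame timestamps in nanoseconds such as those JavaFX passes to `AnimationTimer.handle(long now)`. The class name `FpsCounter` and its method are hypothetical and are not taken from DemoFX.

```java
/** Counts frames and reports a rate each time one second of timestamps has elapsed. */
final class FpsCounter {
    private static final long ONE_SECOND_NANOS = 1_000_000_000L;

    private long windowStart = -1; // timestamp that opened the current window
    private int frames;            // frames counted since windowStart

    /**
     * Record one frame. Returns the number of frames in the window that just
     * closed, or -1 while the current one-second window is still open.
     */
    int onFrame(long nowNanos) {
        if (windowStart < 0) {
            // The very first frame only starts the clock; it is not counted.
            windowStart = nowNanos;
            return -1;
        }
        frames++;
        if (nowNanos - windowStart >= ONE_SECOND_NANOS) {
            int fps = frames;
            frames = 0;
            windowStart = nowNanos;
            return fps;
        }
        return -1;
    }
}
```

In a JavaFX application this would typically be called from `handle(long now)`, printing the result whenever it is non-negative.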
>
> This is a Xeon W3520 quad-core HT box with an Nvidia Quadro FX 580
> graphics card running driver 304.125
>
> Regards,
>
> Chris
>
>
>> On Wed, April 8, 2015 00:16, Jim Graham wrote:
>> OK, I took the time to put my rMBP on a diet yesterday and find room to
>> install a 10.10 partition. I get the same numbers for Sierpinski on 10.10,
>> so my theory that something changed in the OGL implementation for 10.10
>> doesn't hold water.
>>
>> But, I then tried it using the integrated graphics. I get really poor
>> performance using the integrated Intel 4000 graphics, but I get great
>> numbers on the discrete nVidia 650m. It makes sense that the Intel
>> graphics wouldn't be as powerful as the discrete graphics, but we
>> shouldn't be taxing it that much to make that big of a difference.
>>
>> Just to be sure - is that iMac a dual graphics system, or is it
>> all-AMD-all-the-time? You can see which GPU is being used if you run it
>> with -Dprism.verbose=true...
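The flags used in this thread (`-Dprism.order=es2`, `-Dprism.order=sw`, `-Dprism.verbose=true`) can be combined on one command line. As a small sketch, assuming you want to assemble that command from Java, the helper below builds the argument list. `PrismLaunch` and `command` are hypothetical names for illustration; only the flags themselves come from the messages above.

```java
import java.util.ArrayList;
import java.util.List;

final class PrismLaunch {
    /** Builds a java command line that selects a Prism pipeline ("es2" or "sw"). */
    static List<String> command(String pipeline, boolean verbose,
                                String classpath, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Dprism.order=" + pipeline);
        if (verbose) {
            // Asks Prism to log details such as which GPU is in use.
            cmd.add("-Dprism.verbose=true");
        }
        cmd.add("-classpath");
        cmd.add(classpath);
        cmd.add(mainClass);
        return cmd;
    }
}
```

The resulting list can be passed to `new ProcessBuilder(cmd).inheritIO().start()` to run the same test once per pipeline and compare the output.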
>>
>> ...jim
>>
>>
>>> On 4/2/15 4:13 PM, Jim Graham wrote:
>>>
>>> On my retina MBP (10.8) I get 60fps for es2 and 44fps for sw. Are you
>>> running a newer version of MacOS?
>>>
>>> ...jim
>>>
>>>
>>>> On 3/31/15 3:40 PM, Chris Newland wrote:
>>>>
>>>> Hi Hervé,
>>>>
>>>>
>>>> That's a valid question :)
>>>>
>>>>
>>>> Probably because
>>>>
>>>>
>>>> a) All my non-UI graphics experience is with immediate-mode / raster
>>>> systems
>>>>
>>>> b) I'm interested in using JavaFX for particle effects / demoscene /
>>>> gaming so assumed (perhaps wrongly?) that scenegraph was not the way
>>>> to go for that due to the very large number of nodes.
>>>>
>>>> Numbers for my Sierpinski filled triangle example:
>>>>
>>>>
>>>> System: 2011 iMac Core i7 3.4GHz / 20GB RAM / AMD Radeon HD 6970M
>>>> 1024 MB
>>>>
>>>>
>>>> java -Dprism.order=es2 -cp target/classes/
>>>> com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 23
>>>> fps: 18
>>>> fps: 25
>>>> fps: 18
>>>> fps: 23
>>>> fps: 23
>>>> fps: 19
>>>> fps: 25
>>>>
>>>>
>>>> java -Dprism.order=sw -cp target/classes/
>>>> com.chrisnewland.demofx.standalone.Sierpinski
>>>> fps: 1
>>>> fps: 54
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>> fps: 60
>>>>
>>>>
>>>> There are never more than 2500 filled triangles on screen. JDK is
>>>> 1.8.0_40
>>>>
>>>>
>>>> I would say there is a performance problem here? (or at least a need
>>>> for documentation so as to set expectations for gc.fillPolygon).
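For readers without the linked source to hand, here is a rough sketch of how a Sierpinski figure yields a bounded set of filled triangles. It is an assumption about the general technique, not the DemoFX implementation: `SierpinskiSketch` and the choice of recursion depth are hypothetical. Each level of subdivision triples the triangle count, so depth 7 gives 3^7 = 2187 triangles, which stays under the 2500 mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

final class SierpinskiSketch {
    /** Appends the solid triangles of a Sierpinski gasket as {x1,y1,x2,y2,x3,y3}. */
    static void subdivide(double x1, double y1, double x2, double y2,
                          double x3, double y3, int depth, List<double[]> out) {
        if (depth == 0) {
            out.add(new double[] {x1, y1, x2, y2, x3, y3});
            return;
        }
        // Midpoints of the three edges.
        double ax = (x1 + x2) / 2, ay = (y1 + y2) / 2;
        double bx = (x2 + x3) / 2, by = (y2 + y3) / 2;
        double cx = (x3 + x1) / 2, cy = (y3 + y1) / 2;
        // Recurse into the three corner triangles; the middle one stays empty.
        subdivide(x1, y1, ax, ay, cx, cy, depth - 1, out);
        subdivide(ax, ay, x2, y2, bx, by, depth - 1, out);
        subdivide(cx, cy, bx, by, x3, y3, depth - 1, out);
    }

    /** Triangles for a gasket inscribed in a 600x600 area (size chosen arbitrarily). */
    static List<double[]> triangles(int depth) {
        List<double[]> out = new ArrayList<>();
        subdivide(300, 0, 0, 600, 600, 600, depth, out);
        return out;
    }
}
```

Each entry could then be drawn with `gc.fillPolygon(new double[] {t[0], t[2], t[4]}, new double[] {t[1], t[3], t[5]}, 3)`, which is the call whose performance is in question here.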
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Chris
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> On Tue, March 31, 2015 22:00, Hervé Girod wrote:
>>>>>
>>>>> Why don't you use Nodes rather than Canvas ?
>>>>>
>>>>>
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>
>>>>>
>>>>>> On Mar 31, 2015, at 22:31, Chris Newland
>>>>>> <cnewland at chrisnewland.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Jim,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks, that makes things much clearer.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I was surprised how much was going on under the hood of
>>>>>> GraphicsContext
>>>>>> and hoped it was just magic glue that gave the best of GPU
>>>>>> acceleration where available and immediate-mode-like simple
>>>>>> rasterizing where not.
>>>>>>
>>>>>> I've managed to find an anomaly with GraphicsContext.fillPolygon
>>>>>> where the software pipeline achieves the full 60fps but ES2 can
>>>>>> only manage 30-35fps. It uses lots of overlapping filled triangles
>>>>>> so I expect suffers from the problem you've described.
>>>>>>
>>>>>> SSCCE:
>>>>>> https://github.com/chriswhocodes/DemoFX/blob/master/src/main/java/com/chrisnewland/demofx/standalone/Sierpinski.java
>>>>>>
>>>>>> Was full frame rate canvas drawing an expected use case for
>>>>>> JavaFX or
>>>>>> would I be better off with Graphics2D?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Mon, March 30, 2015 20:04, Jim Graham wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> drawLine() is a very simple primitive that can be optimized
>>>>>>> with a GPU
>>>>>>> shader. It either looks like a (potentially rotated) rectangle
>>>>>>> or a rounded rect - and we have optimized shaders for both
>>>>>>> cases. A large number of drawLine() calls turns into simply
>>>>>>> accumulating a large vertex list and uploading it to the GPU
>>>>>>> with an appropriate shader which is very fast.
>>>>>>>
>>>>>>> drawPolygon() is a very complex operation that involves things
>>>>>>> like:
>>>>>>>
>>>>>>>
>>>>>>> - dealing with line joins between segments that don't exist for
>>>>>>> drawLine()
>>>>>>> - dealing with only rendering common points of intersection once
>>>>>>>
>>>>>>> To handle all of that complexity we have to involve a
>>>>>>> rasterizer that takes the entire collection of lines, analyzes
>>>>>>> the stroke attributes and interactions and computes a coverage
>>>>>>> mask for each pixel in the region. We do that in software
>>>>>>> currently for all pipelines.
>>>>>>>
>>>>>>> For the ES2 pipeline, the line vs. poly difference is dominated
>>>>>>> by pure GPU vs CPU path rasterization.
>>>>>>>
>>>>>>> For the SW pipeline, drawLine is a simplified case of
>>>>>>> drawPolygon and so the overhead of lots of calls to drawLine()
>>>>>>> dominates its performance.
>>>>>>>
>>>>>>> I would expect ES2 to blow the SW pipeline out of the water
>>>>>>> with drawLine() performance (as long as there are no additional
>>>>>>> rendering primitives interspersed in the set of lines).
>>>>>>>
>>>>>>> But both should be on the same footing for the drawPolygon
>>>>>>> case. Does the ES2 pipeline perform similarly to (hopefully
>>>>>>> better than) the SW pipeline for the polygon case?
>>>>>>>
>>>>>>> One thing I noticed is that we have no optimized case for
>>>>>>> drawLine() on the SW pipeline. It generates a path containing a
>>>>>>> single MOVETO and LINETO and feeds it to the generalized path
>>>>>>> rasterizer when it could instead compute the rounded/square
>>>>>>> rectangle and render it more directly. If we added that support
>>>>>>> then I'd expect the SW pipeline to perform the set of drawLine
>>>>>>> calls faster than drawPolygon as well...
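To illustrate the idea of computing the rectangle for a line directly instead of going through a general path rasterizer, here is a small geometric sketch. It is not Prism code: `LineQuad` is a hypothetical helper, it covers only butt and square caps (not round ones), and it rejects zero-length segments for simplicity.

```java
final class LineQuad {
    /**
     * Returns the corners {x,y, x,y, x,y, x,y} of the rectangle covered by
     * stroking the segment (x1,y1)-(x2,y2) with the given line width.
     * With squareCap the rectangle extends half a width past each end point.
     */
    static double[] corners(double x1, double y1, double x2, double y2,
                            double width, boolean squareCap) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        double len = Math.hypot(dx, dy);
        if (len == 0) {
            throw new IllegalArgumentException("zero-length segment has no direction");
        }
        double half = width / 2.0;
        double ux = dx / len, uy = dy / len;    // unit vector along the line
        double nx = -uy * half, ny = ux * half; // normal, scaled to half the width
        double ex = squareCap ? ux * half : 0;  // cap extension along the line
        double ey = squareCap ? uy * half : 0;
        double ax = x1 - ex, ay = y1 - ey;      // extended start point
        double bx = x2 + ex, by = y2 + ey;      // extended end point
        return new double[] {
            ax + nx, ay + ny,
            bx + nx, by + ny,
            bx - nx, by - ny,
            ax - nx, ay - ny
        };
    }
}
```

For a horizontal line from (0,0) to (10,0) with width 2 and butt caps this gives the corners (0,1), (10,1), (10,-1), (0,-1), a shape that a renderer can fill without any path analysis.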
>>>>>>>
>>>>>>> ...jim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 3/28/15 3:22 AM, Chris Newland wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Robert,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I've not filed a Jira yet as I was hoping to find time to
>>>>>>>> investigate thoroughly but when I saw your question I thought
>>>>>>>> I'd
>>>>>>>> better add my findings.
>>>>>>>>
>>>>>>>> I believe the issue is in the ES2Pipeline as if I run with
>>>>>>>> -Dprism.order=sw then strokePolygon outperforms the series of
>>>>>>>> strokeLine commands as expected:
>>>>>>>>
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw
>>>>>>>> com.chrisnewland.demofx.DemoFXApplication -c 500 -m line
>>>>>>>> Result:
>>>>>>>> 44fps
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> java -cp target/DemoFX.jar -Dprism.order=sw
>>>>>>>> com.chrisnewland.demofx.DemoFXApplication -c 500 -m poly
>>>>>>>> Result:
>>>>>>>> 60fps
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Will see if I can find the root cause as I've got plenty more
>>>>>>>> examples where ES2Pipeline performs horribly on my Mac which
>>>>>>>> should have no problem throwing around a few thousand polys.
>>>>>>>>
>>>>>>>> I realise there's a *lot* of indirection involved in making
>>>>>>>> JavaFX
>>>>>>>> support such a wide range of underlying graphics systems but I
>>>>>>>> do think there's a bug here.
>>>>>>>>
>>>>>>>> Will file a Jira if I can contribute a bit more than "feels
>>>>>>>> slow" ;)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Sat, March 28, 2015 10:06, Robert Krüger wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is consistent with what I am observing. Is this
>>>>>>>>> something that Oracle is aware of? Looking at Jira, I don't
>>>>>>>>> see that anyone is working on this:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://javafx-jira.kenai.com/issues/?jql=status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(macosx)%20%20AND%20labels%20in%20(performance)
>>>>>>>>>
>>>>>>>>> Given that one of the main reasons for me to use JFX is to
>>>>>>>>> be able to develop with one code base for at least OSX and
>>>>>>>>> Windows, and given the official statement of what JavaFX is
>>>>>>>>> for, i.e.
>>>>>>>>>
>>>>>>>>> "JavaFX is a set of graphics and media packages that
>>>>>>>>> enables developers to design, create, test, debug, and
>>>>>>>>> deploy rich client applications that operate consistently
>>>>>>>>> across diverse platforms"
>>>>>>>>>
>>>>>>>>> and the fact that this is clearly not the case currently
>>>>>>>>> (8u40), since as soon as I do anything other than simple
>>>>>>>>> forms I run into performance/quality problems on the Mac,
>>>>>>>>> I am a bit unsure what to make of all that. Is Mac OSX a
>>>>>>>>> second-class citizen as far as dev resources are concerned?
>>>>>>>>>
>>>>>>>>> Tobi and Chris, have you filed Jira Issues on Mac graphics
>>>>>>>>> performance that can be tracked?
>>>>>>>>>
>>>>>>>>> I will file an issue with a simple test case and hope for
>>>>>>>>> the best.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 27, 2015 at 11:08 PM, Chris Newland
>>>>>>>>> <cnewland at chrisnewland.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Possibly related:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I can reproduce a massive (90%) performance drop on OSX
>>>>>>>>>> between drawing a wireframe polygon on a Canvas using a
>>>>>>>>>> series of gc.strokeLine(double x1, double y1, double x2,
>>>>>>>>>> double y2) commands versus using a single
>>>>>>>>>> gc.strokePolygon(double[] xPoints, double[] yPoints, int
>>>>>>>>>> count) command.
>>>>>>>>>>
>>>>>>>>>> Creating the polygons manually with strokeLine() is
>>>>>>>>>> significantly faster using the ES2Pipeline on OSX.
>>>>>>>>>>
>>>>>>>>>> This is reproducible in a little GitHub JavaFX
>>>>>>>>>> benchmarking project I've created:
>>>>>>>>>> https://github.com/chriswhocodes/DemoFX
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Build with ant
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Run with:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> # use strokeLine
>>>>>>>>>> ./run.sh -c 5000 -m line
>>>>>>>>>> result: 60 (sixty) fps
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> # use strokePolygon
>>>>>>>>>> ./run.sh -c 5000 -m poly
>>>>>>>>>> result: 6 (six) fps
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> System is 2011 iMac 27" / Mavericks / 3.4GHz Core i7 /
>>>>>>>>>> 20GB RAM
>>>>>>>>>> /
>>>>>>>>>> Radeon
>>>>>>>>>> 6970M 1024MB
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looking at the code paths in
>>>>>>>>>> javafx.scene.canvas.GraphicsContext:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> gc.strokeLine() maps to writeOp4(x1, y1, x2, y2,
>>>>>>>>>> NGCanvas.STROKE_LINE)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> gc.strokePolygon() maps to writePoly(xPoints, yPoints,
>>>>>>>>>> nPoints, true, NGCanvas.STROKE_PATH) which involves
>>>>>>>>>> significantly more work with adding to and flushing a
>>>>>>>>>> GrowableDataBuffer.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I've not had time to dig any deeper than this but it's
>>>>>>>>>> surely a bug when building a poly manually is 10x faster
>>>>>>>>>> than using the convenience method.
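The two approaches being compared can be sketched as follows. To keep the example runnable without JavaFX on the classpath, it uses a hypothetical `StrokeTarget` interface that mirrors the two `GraphicsContext` methods named above. In a real application the implementation would simply forward to `gc.strokeLine(...)` and `gc.strokePolygon(...)`. This is an illustration, not the DemoFX code.

```java
/** Hypothetical stand-in mirroring the two GraphicsContext calls being compared. */
interface StrokeTarget {
    void strokeLine(double x1, double y1, double x2, double y2);
    void strokePolygon(double[] xPoints, double[] yPoints, int nPoints);
}

final class PolygonStrokes {
    /** Outlines a closed polygon as nPoints separate strokeLine calls. */
    static void asLines(StrokeTarget gc, double[] x, double[] y, int nPoints) {
        for (int i = 0; i < nPoints; i++) {
            int next = (i + 1) % nPoints; // wrap around so the last edge closes the shape
            gc.strokeLine(x[i], y[i], x[next], y[next]);
        }
    }

    /** Outlines the same polygon with the single convenience call. */
    static void asPolygon(StrokeTarget gc, double[] x, double[] y, int nPoints) {
        gc.strokePolygon(x, y, nPoints);
    }
}
```

The two are not guaranteed to be pixel-identical: separate line strokes have no joins between segments, whereas a stroked polygon does, which is part of why the polygon case takes a different rendering path.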
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, March 27, 2015 21:26, Tobias Bley wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> In my opinion the whole graphics performance on MacOSX
>>>>>>>>>>> isn’t good at all with JavaFX….
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Am 27.03.2015 um 22:10 schrieb Robert Krüger
>>>>>>>>>>>> <krueger at lesspain.de>:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The bad full screen performance is without the arcs.
>>>>>>>>>>>> It is
>>>>>>>>>>>> just one call to fillRect, two to strokeOval and one
>>>>>>>>>>>> to fillOval, that's all. I will build a simple test
>>>>>>>>>>>> case and file an issue.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 27, 2015 at 9:58 PM, Jim Graham
>>>>>>>>>>>> <james.graham at oracle.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please file a Jira issue with a simple test case.
>>>>>>>>>>>>> Arcs
>>>>>>>>>>>>> are handled as a generalized shape rather than via a
>>>>>>>>>>>>> predetermined shader, but it shouldn't be that
>>>>>>>>>>>>> slow. Something else may
>>>>>>>>>>>>> be going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Another test might be to replace the arcs with
>>>>>>>>>>>>> rectangles or ellipses and see if the performance
>>>>>>>>>>>>> changes...
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...jim
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/27/15 1:52 PM, Robert Krüger wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have a super-simple animation implemented using
>>>>>>>>>>>>>> AnimationTimer
>>>>>>>>>>>>>> and Canvas where the canvas just performs a few
>>>>>>>>>>>>>> draw operations, i.e. fills the screen with a
>>>>>>>>>>>>>> color and then draws and fills 2-3 circles and I
>>>>>>>>>>>>>> have already observed that each drawing operation
>>>>>>>>>>>>>> I add results in significant CPU load (e.g. when I
>>>>>>>>>>>>>> draw < 10 arcs in addition to the circles, the CPU
>>>>>>>>>>>>>> load goes up to 30-40% on a MacBook Pro for a
>>>>>>>>>>>>>> Canvas size of 600x600(!)).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now I tested the animation in full screen mode
>>>>>>>>>>>>>> (only
>>>>>>>>>>>>>> with a few circles) and playback is unusable for a
>>>>>>>>>>>>>> serious application (very choppy). Is 2D canvas
>>>>>>>>>>>>>> performance known to be very bad on Mac or am I
>>>>>>>>>>>>>> doing something wrong? Are there workarounds for
>>>>>>>>>>>>>> this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Robert
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Robert Krüger
>>>>>>>>>>>> Managing Partner
>>>>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> www.lesspain-software.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Robert Krüger
>>>>>>>>> Managing Partner
>>>>>>>>> Lesspain GmbH & Co. KG
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> www.lesspain-software.com
>
>
More information about the openjfx-dev
mailing list