Optimised, high-performance, multi-threaded rendering pipeline

Felix Bembrick felix.bembrick at gmail.com
Mon Nov 28 08:28:32 UTC 2016


No disappointment, no surprises.

It was a rhetorical question...

> On 28 Nov. 2016, at 19:08, Michael Paus <mp at jugs.org> wrote:
> 
>> On 28.11.16, at 08:51, Felix Bembrick wrote:
>> Great - good to see interest growing.
>> 
>> Especially given that you work for Oracle, right?
> Sorry, if I have to disappoint you on that but I do not work for Oracle.
> I run my own little company and am the head of the Java User Group Stuttgart.
> <http://www.jugs.de/>
>> 
>>> On 28 Nov. 2016, at 18:10, Michael Paus <mp at jugs.org> wrote:
>>> 
>>> I am interested too although I have only been listening quietly so far due to lack of time.
>>> Cheers
>>> Michael
>>> 
>>>> On 28.11.16, at 06:54, Felix Bembrick wrote:
>>>> Sorry Gerrit - you did indeed.
>>>> 
>>>> Maybe you'd also like to participate in the offline discussion (especially now that you don't work for Oracle)?
>>>> 
>>>>> On 28 Nov. 2016, at 16:07, han.solo at icloud.com wrote:
>>>>> 
>>>>> Well I mentioned before that I'm interested too :)
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Gerrit
>>>>> 
>>>>> 
>>>>> On 27 Nov. 2016, at 22:58 +0100, Felix Bembrick <felix.bembrick at gmail.com> wrote:
>>>>>> Well, given that you and Benjamin seem to be the only people interested in it, perhaps we should discuss it offline (so as not to bother Oracle or spam this list)...
>>>>>> 
>>>>>>> On 28 Nov. 2016, at 06:57, Tobias Bley <bley at jpro.io> wrote:
>>>>>>> 
>>>>>>> Where can we read more about your HPR renderer?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On 25.11.2016, at 16:45, Felix Bembrick <felix.bembrick at gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Short answer? Maybe.
>>>>>>>> 
>>>>>>>> But exactly one more word than any from Oracle ;-)
>>>>>>>> 
>>>>>>>>> On 26 Nov. 2016, at 00:07, Tobias Bley <bley at jpro.io> wrote:
>>>>>>>>> 
>>>>>>>>> A very short answer ;) ….
>>>>>>>>> 
>>>>>>>>> Do you have any URL?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On 25.11.2016, at 12:19, Felix Bembrick <felix.bembrick at gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Yes.
>>>>>>>>>> 
>>>>>>>>>>> On 25 Nov. 2016, at 21:45, Tobias Bley <bley at jpro.io> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> @Felix: Is there any Github project, demo video or trial to test HPR with JavaFX?
>>>>>>>>>>> 
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Tobi
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On 11.11.2016, at 12:08, Felix Bembrick <felix.bembrick at gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks Laurent,
>>>>>>>>>>>> 
>>>>>>>>>>>> That's another thing we discovered: using Java itself in the most performant way can help a lot.
>>>>>>>>>>>> 
>>>>>>>>>>>> It can be tricky, but profiling can often highlight patterns of object instantiation that raise red flags and lead you directly to regions of the code that can be refactored to be significantly more efficient.
>>>>>>>>>>>> 
>>>>>>>>>>>> Also, the often overlooked GC log analysis can lead to similar discoveries and remedies.
>>>>>>>>>>>> 
>>>>>>>>>>>> Blessings,
>>>>>>>>>>>> 
>>>>>>>>>>>> Felix
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 11 Nov. 2016, at 21:55, Laurent Bourgès <bourges.laurent at gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> To optimize Pisces, which became the Marlin rasterizer, I carefully avoided any array allocation (using byte/int/float pools) and also reduced array copies and clean-up, i.e. only dirty parts are cleared.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This approach is generic and could be applied in other critical places of the rendering pipelines.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> FYI here are my fosdem 2016 slides on the Marlin renderer:
>>>>>>>>>>>>> https://bourgesl.github.io/fosdem-2016/slides/fosdem-2016-Marlin.pdf
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Of course I would be happy to share my experience and work with a tiger team on optimizing JavaFX graphics.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, I would like to get some sort of sponsorship for my potential contributions...
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Laurent
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 11 Nov. 2016, at 11:29, "Tobi" <tobi at ultramixer.com> wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> thanks Felix, Laurent and Chris for sharing your stuff with the community!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I am happy to see a discussion starting about boosting JavaFX rendering performance. I can confirm that the performance of the JavaFX scene graph is not where it should be. So multithreading would be an excellent, but difficult, approach.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Felix, concerning your research of other toolkits: Do they all use multithreading or are there any toolkits which use single threading but are faster than JavaFX?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> So maybe there are other points than multithreading where we can boost the performance?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2) your HPR sounds great. Did you already try DemoFX (part 3) benchmark with your HPR?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Tobi
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 10.11.2016, at 19:11, Felix Bembrick <felix.bembrick at gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> (Thanks to Kevin for lifting my "awaiting moderation" impasse).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So, with all the recent discussions regarding the great contribution by
>>>>>>>>>>>>>>> Laurent Bourgès of MarlinFX, it was suggested that a separate thread be
>>>>>>>>>>>>>>> started to discuss parallelisation of the JavaFX rendering pipeline in
>>>>>>>>>>>>>>> general.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> As has been correctly pointed out, converting or modifying the existing
>>>>>>>>>>>>>>> rendering pipeline into a fully multi-threaded and performant beast is
>>>>>>>>>>>>>>> indeed quite a complex task.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> But that's exactly what my colleagues and I have been working on for
>>>>>>>>>>>>>>> about 2 years.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The result is what we call the Hyper Rendering Pipeline (HPR).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Work on HPR started when we developed FXMark and were (bitterly)
>>>>>>>>>>>>>>> disappointed with the performance of the JavaFX scene graph. Many JavaFX
>>>>>>>>>>>>>>> developers have blogged about the need to dramatically minimise the number
>>>>>>>>>>>>>>> of nodes (especially on embedded devices) in order to achieve even
>>>>>>>>>>>>>>> "acceptable" performance. Often it is the case that most (if not all)
>>>>>>>>>>>>>>> rendering is eventually done in a single Canvas node.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Now, as we all already know, the JavaFX Canvas does perform very well and the
>>>>>>>>>>>>>>> recent awesome work (DemoFX) by Chris Newland, just for example, shows what
>>>>>>>>>>>>>>> can be done with this one node.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> But, the majority of the animation plumbing in JavaFX is related to the
>>>>>>>>>>>>>>> scene graph itself and is designed to make use of multiple nodes and node
>>>>>>>>>>>>>>> types. At the moment, the performance of this scene graph is the Achilles'
>>>>>>>>>>>>>>> heel of JavaFX (or at least one of them).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Enter HPR.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I personally have worked with a number of hardware-accelerated toolkits
>>>>>>>>>>>>>>> over the years and am astounded by just how sluggish the rendering pipeline
>>>>>>>>>>>>>>> for JavaFX is. When I am animating just a couple of hundred nodes using
>>>>>>>>>>>>>>> JavaFX and transitions, I am lucky to get more than about 30 FPS, but on
>>>>>>>>>>>>>>> the same (very powerful) machine, I can use other toolkits to render
>>>>>>>>>>>>>>> thousands of "objects" and achieve frame rates well over 1000 FPS.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So, we refactored the entire scene graph rendering pipeline with the
>>>>>>>>>>>>>>> following goals and principles:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 1. It is written using JavaFX 9 and Java 9 (but could theoretically be
>>>>>>>>>>>>>>> back-ported to JavaFX 8 though I see no reason to).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 2. We analysed how other toolkits had optimised their own rendering
>>>>>>>>>>>>>>> pipelines (especially Qt which has made some significant advances in this
>>>>>>>>>>>>>>> area in recent years). We also analysed recent examples of multi-threaded
>>>>>>>>>>>>>>> rendering using the new Vulkan API.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 3. We carefully analysed and determined which parts of the pipeline should
>>>>>>>>>>>>>>> best utilise the CPU and which parts should best utilise the GPU.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 4. For those parts most suited to the CPU, we use the advanced concurrency
>>>>>>>>>>>>>>> features of Java 8/9 to maximise parallelisation and throughput by
>>>>>>>>>>>>>>> utilising multiple cores & threads in as efficient a manner as possible.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 5. We devoted a large amount of time to optimising the "communication"
>>>>>>>>>>>>>>> between the CPU and GPU to be far less "chatty" and this alone led to some
>>>>>>>>>>>>>>> huge performance gains.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 6. We also looked at the structure of the scene graph itself and after
>>>>>>>>>>>>>>> studying products such as OpenSceneGraph, we refactored the JavaFX scene
>>>>>>>>>>>>>>> graph in such a way that it lends itself to optimised rendering much more
>>>>>>>>>>>>>>> easily.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 7. This is clearly not a "small" patch. In fact to refer to it as a
>>>>>>>>>>>>>>> "patch" is probably rather inappropriate.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The end result is that we now have a fully-functional prototype of HPR and,
>>>>>>>>>>>>>>> already, we are seeing very significant performance improvements.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> At the minimum, scene graph rendering performance has improved by 500% and,
>>>>>>>>>>>>>>> with judicious and sometimes "tricky" use of caching, we have seen
>>>>>>>>>>>>>>> improvements in performance of 10x or more.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> And... we are only just *starting* with the performance optimisation phase.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The potential for HPR is massive as it opens up the possibility for the
>>>>>>>>>>>>>>> JavaFX scene graph and the animation/transition infrastructure to be used
>>>>>>>>>>>>>>> for a whole new class of applications including games, advanced
>>>>>>>>>>>>>>> visualisations etc., without having to rely on imperative programming of a
>>>>>>>>>>>>>>> single Canvas node.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I believe that HPR, along with tremendous recent developments like JPro and
>>>>>>>>>>>>>>> the outstanding work by Gluon on mobiles and embedded devices, could
>>>>>>>>>>>>>>> position JavaFX to be the best graphics toolkit of any kind in any language
>>>>>>>>>>>>>>> and be the ONLY *truly* cross-platform graphics technology available.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> WORA for graphics and UIs is finally within reach!
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Blessings,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Felix
>>> 
> 
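
For anyone following the thread who wants to see the pooling idea Laurent describes above in concrete form (reuse arrays instead of allocating new ones, and clear only the part that was actually written), here is a minimal sketch. The class name IntArrayPool, its methods and the dirty-range convention are all made up for this example; this is not Marlin's actual code, just one way the general technique might look.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical pool of fixed-size int buffers (illustration only, not Marlin code). */
final class IntArrayPool {
    private final ArrayDeque<int[]> free = new ArrayDeque<>();
    private final int size;
    private int allocations; // counts how many arrays were really allocated

    IntArrayPool(int size) {
        this.size = size;
    }

    /** Returns a zeroed buffer, reusing a pooled one when available. */
    int[] acquire() {
        int[] buffer = free.poll(); // null when the pool is empty
        if (buffer == null) {
            allocations++;
            buffer = new int[size]; // fresh Java arrays are already zeroed
        }
        return buffer;
    }

    /**
     * Hands a buffer back. Only the range [dirtyFrom, dirtyTo) that the
     * caller wrote to is cleared, rather than the whole array.
     */
    void release(int[] buffer, int dirtyFrom, int dirtyTo) {
        Arrays.fill(buffer, dirtyFrom, dirtyTo, 0);
        free.push(buffer);
    }

    int allocations() {
        return allocations;
    }
}

final class PoolDemo {
    public static void main(String[] args) {
        IntArrayPool pool = new IntArrayPool(1024);
        for (int frame = 0; frame < 3; frame++) {
            int[] scanline = pool.acquire();
            scanline[10] = 255;            // pretend only cells 10..11 were touched
            scanline[11] = 128;
            pool.release(scanline, 10, 12); // clear just the dirty cells
        }
        System.out.println("arrays allocated: " + pool.allocations());
    }
}
```

The caller has to track its own dirty range, which is the price paid for not clearing (or reallocating) the full buffer on every frame. Whether that trade-off pays off in a given part of a rendering pipeline would need profiling.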


More information about the openjfx-dev mailing list