[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements
Laurent Bourgès
bourges.laurent at gmail.com
Wed Apr 17 20:14:13 UTC 2013
Jim,
thanks for taking an interest in my efforts!
As I had received almost no feedback, I felt quite disappointed and was
starting to think that improving Pisces was not important ...
Here are the Ductus results and the comparison (OpenOffice format):
http://jmmc.fr/~bourgesl/share/java2d-pisces/ductus_det.log
http://jmmc.fr/~bourgesl/share/java2d-pisces/compareRef_Patch.ods
test              threads  ops  Tavg     Tmed     stdDev   rms      Med+Stddev  min      max
boulder_17        1        20   73,92%   69,34%   27,98%   69,34%   69,14%      69,81%   146,89%
boulder_17        2        20   110,86%  110,48%  613,80%  112,01%  125,43%     94,71%   136,35%
boulder_17        4        20   135,28%  135,86%  226,61%  136,46%  141,85%     125,14%  111,32%
shp_alllayers_47  1        20   82,25%   82,49%   47,50%   82,48%   82,30%      82,64%   78,08%
shp_alllayers_47  2        20   115,87%  115,67%  315,45%  115,85%  119,89%     109,33%  128,71%
shp_alllayers_47  4        20   218,59%  218,76%  169,38%  218,65%  216,45%     220,17%  206,17%
Ductus vs Patch:
  1 thread: 80,74%    2 threads: 120,69%    4 threads: 205,92%
Reminder: Ref vs Patch:
  1 thread: 237,71%   2 threads: 271,68%    4 threads: 286,15%
Note: I only have 2 cores + HT on my laptop, so I do not test with more
threads (64 like Andrea).
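
For readers checking the percentages: each ratio column appears to be the baseline value divided by the patched value (for example, REF Tavg 198,591 over PATCH Tavg 110,196 gives the 180,22% quoted further down), and Med+Stddev is the median plus the standard deviation. Below is a minimal sketch of such a summary. The class `BenchStats` and its textbook definitions are assumptions for illustration, not MapBench code; MapBench's own stdDev and rms columns evidently use different definitions (some filtering seems to be applied), so rms is left out here.

```java
import java.util.Arrays;

/** Summary statistics for one benchmark run (hypothetical helper, not MapBench code). */
final class BenchStats {
    final double avg, median, stdDev, medPlusStdDev, min, max;

    BenchStats(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;

        double sum = 0;
        for (double v : sorted) {
            sum += v;
        }
        avg = sum / n;
        median = (n % 2 == 1) ? sorted[n / 2]
                              : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);

        // Assumption: population standard deviation (divide by n, no outlier trimming).
        double squaredDev = 0;
        for (double v : sorted) {
            squaredDev += (v - avg) * (v - avg);
        }
        stdDev = Math.sqrt(squaredDev / n);
        medPlusStdDev = median + stdDev;
        min = sorted[0];
        max = sorted[n - 1];
    }

    /** Baseline time over patched time, in percent; above 100 means the patch is faster. */
    static double ratioPercent(double baseline, double patched) {
        return 100.0 * baseline / patched;
    }

    public static void main(String[] args) {
        // Tavg for boulder_17 with 1 thread, taken from the REF and PATCH tables in this thread.
        System.out.printf("Ref vs Patch: %.2f%%%n", ratioPercent(198.591, 110.196));
    }
}
```

Read this way, a Ductus vs Patch value below 100% means Ductus is still faster for that configuration.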
2013/4/16 Jim Graham <james.graham at oracle.com>
> If I'm reading this correctly, your patch is faster even for a single
> thread? That's great news.
>
Not yet, but Ductus is now only 20% faster than my patch with 1 thread, and
it is 20% and 2x slower with 2 and 4 threads.
I still hope to beat it by applying a few more optimizations:
- Renderer.ScanLine iterator / Renderer.endRendering can be improved
- avoid a few more array zero fills (reset) if possible
- add statistics to better set the initial array sizes ...
- use SunGraphics2D to hold a hard reference (temporarily) on
RendererContext (to avoid many ThreadLocal.get() calls)
- cache eviction (WeakReference or SoftReference) ?
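
Two of the ideas in this list (a per-thread context that the garbage collector may evict, and resetting only the part of an array that was actually used) can be sketched as follows. This is a minimal illustration, not code from the webrev: `ScratchContext`, `ScratchContexts`, the `alphaRow` field and its size of 4096 are all hypothetical stand-ins for whatever the real RendererContext holds.

```java
import java.lang.ref.SoftReference;
import java.util.Arrays;

/** Hypothetical per-thread scratch state; the real RendererContext fields are not shown in this thread. */
final class ScratchContext {
    final int[] alphaRow = new int[4096];    // assumed initial size
    private int dirtyFrom = 0, dirtyTo = 0;  // half-open range written since the last reset

    /** Record that alphaRow[from, to) was written. */
    void markDirty(int from, int to) {
        if (dirtyFrom == dirtyTo) {
            dirtyFrom = from;
            dirtyTo = to;
        } else {
            dirtyFrom = Math.min(dirtyFrom, from);
            dirtyTo = Math.max(dirtyTo, to);
        }
    }

    /** Zero only the touched range instead of refilling the whole array. */
    void reset() {
        Arrays.fill(alphaRow, dirtyFrom, dirtyTo, 0);
        dirtyFrom = dirtyTo = 0;
    }
}

final class ScratchContexts {
    // A SoftReference lets the GC reclaim an idle context under memory pressure.
    private static final ThreadLocal<SoftReference<ScratchContext>> CACHE = new ThreadLocal<>();

    /** Returns this thread's context, creating it again if it was never built or was evicted. */
    static ScratchContext get() {
        SoftReference<ScratchContext> ref = CACHE.get();
        ScratchContext ctx = (ref != null) ? ref.get() : null;
        if (ctx == null) {
            ctx = new ScratchContext();
            CACHE.set(new SoftReference<>(ctx));
        }
        return ctx;
    }
}
```

A caller would fetch the context once per rendering operation and keep it in a local variable (a hard reference) until `reset()` has run, which is the same idea as parking it temporarily on SunGraphics2D to avoid repeated `ThreadLocal.get()` calls.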
Why not use divide and conquer (thread pool) to speed up single-threaded
rendering when the machine has more CPU cores?
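
As a rough illustration of the divide-and-conquer idea, a single rendering call could split its output into horizontal bands and hand each band to a pool. Everything in this sketch is an assumption made for illustration: `BandRenderer`, `renderBand` and the fake coverage pattern are placeholders, and it ignores the real difficulty, namely that every band would need its own view of the shape's edges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BandRenderer {
    /** Placeholder for rasterizing rows [y0, y1) into the shared coverage buffer (hypothetical). */
    static void renderBand(int[] coverage, int width, int y0, int y1) {
        for (int y = y0; y < y1; y++) {
            for (int x = 0; x < width; x++) {
                coverage[y * width + x] = (x + y) & 0xFF;
            }
        }
    }

    /** Splits the image into at most {@code bands} row ranges and renders them on the pool. */
    static int[] render(int width, int height, int bands, ExecutorService pool) throws Exception {
        int[] coverage = new int[width * height];
        List<Future<?>> pending = new ArrayList<>();
        int rowsPerBand = (height + bands - 1) / bands;
        for (int y0 = 0; y0 < height; y0 += rowsPerBand) {
            final int from = y0;
            final int to = Math.min(height, y0 + rowsPerBand);
            // Bands write to disjoint rows, so no locking is needed on the buffer.
            pending.add(pool.submit(() -> renderBand(coverage, width, from, to)));
        }
        for (Future<?> f : pending) {
            f.get(); // wait for completion; also makes the band's writes visible to this thread
        }
        return coverage;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            int[] out = render(64, 64, 4, pool);
            System.out.println("rendered " + out.length + " pixels");
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this pays off depends on the shape: for small paths the task hand-off cost can easily exceed the rendering time, so a size threshold would probably be needed.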
It would be helpful if the AATileGenerator had access to SunGraphics2D to
get rendering hints or share variables (cache ...).
For the moment, I have not modified the algorithms themselves, but I will do
so to improve clipping (dasher segments / stroker) ...
> One of the problems we've had with replacing Ductus is that it has been
> faster in a single thread situation than the open source versions we've
> created. One of its drawbacks is that it had been designed to take
> advantage of some AA-accelerating hardware that never came to be. With the
> accelerator it would have been insanely fast, but hardware went in a
> different direction. The problem was that this early design goal caused
> the entire library to be built around an abstraction layer that allowed for
> a single "tile producer" internally (because there would be only one -
> insanely fast - hardware chip available) and the software version of the
> abstraction layer thus had a lot of native "static" data structures
> (there's only one of me, right?) that prevented MT access. It was probably
> solvable, but I'd be happier if someone could come up with a faster
> rasterizer, imagining that there must have been some sort of advancements
> in the nearly 2 decades since the original was written.
>
> If I'm misinterpreting and single thread is still slower than Ductus (or
> if it is still slower on some other platforms), then <frowny face>.
>
Not yet: slower than Ductus by 20%, but 2 times faster than the current
Pisces!
> Also, this is with the Java version, right?
Yes, my patch is pure Java, provided as a webrev:
http://jmmc.fr/~bourgesl/share/java2d-pisces/webrev-1/
> We got a decent 2x speedup in FX by porting the version of Open Pisces
> that you started with to C code (all except on Linux where we couldn't find
> the right gcc options to make it faster than Hotspot). So, we have yet to
> investigate a native version in the JDK which might provide even more
> gains...
>
Personally I prefer working on Java code, as HotSpot performs so many
optimizations for free, there are no pointers to deal with and, more
importantly, there are concurrent primitives (thread locals, collections)!
Laurent
>
> On 4/15/13 3:01 AM, Laurent Bourgès wrote:
>
>> Jim, Andrea,
>>
>> I updated MapBench to provide test statistics (avg, median, stddev, rms,
>> med + stddev, min, max) and CSV output (tab separator):
>> http://jmmc.fr/~bourgesl/share/java2d-pisces/MapBench/
>>
>>
>>
>> Here are the results (OpenJDK8 Ref vs Patched):
>> http://jmmc.fr/~bourgesl/share/java2d-pisces/ref_det.log
>> http://jmmc.fr/~bourgesl/share/java2d-pisces/patch_det.log
>>
>> test              threads  ops  Tavg     Tmed     stdDev    rms      Med+Stddev  min      max
>> boulder_17        1        20   180,22%  181,08%  1186,01%  181,17%  185,92%     176,35%  170,36%
>> boulder_17        2        20   183,15%  183,80%  162,68%   183,78%  183,17%     174,01%  169,89%
>> boulder_17        4        20   216,62%  218,03%  349,31%   218,87%  226,68%     172,15%  167,54%
>> shp_alllayers_47  1        20   243,90%  244,86%  537,92%   244,87%  246,39%     240,64%  231,00%
>> shp_alllayers_47  2        20   286,42%  287,07%  294,87%   287,07%  287,23%     277,19%  272,23%
>> shp_alllayers_47  4        20   303,08%  302,15%  168,19%   301,90%  295,90%     462,70%  282,41%
>>
>>
>>
>> PATCH:
>> test              threads  ops  Tavg      Tmed      stdDev   rms       Med+Stddev  min       max
>> boulder_17        1        20   110,196   109,244   0,529    109,246   109,773     108,197   129,327
>> boulder_17        2        40   127,916   127,363   3,899    127,423   131,262     125,262   151,561
>> boulder_17        4        80   213,085   212,268   14,988   212,796   227,256     155,512   334,407
>> shp_alllayers_47  1        20   1139,452  1134,858  5,971    1134,873  1140,829    1125,859  1235,746
>> shp_alllayers_47  2        40   1306,889  1304,598  28,157   1304,902  1332,755    1280,49   1420,351
>> shp_alllayers_47  4        80   2296,487  2303,81   112,816  2306,57   2416,626    1390,31   2631,455
>>
>>
>>
>> REF:
>> test              threads  ops  Tavg      Tmed      stdDev   rms       Med+Stddev  min       max
>> boulder_17        1        20   198,591   197,816   6,274    197,916   204,091     190,805   220,319
>> boulder_17        2        40   234,272   234,09    6,343    234,176   240,433     217,967   257,485
>> boulder_17        4        80   461,579   462,8     52,354   465,751   515,153     267,712   560,254
>> shp_alllayers_47  1        20   2779,133  2778,823  32,119   2779,009  2810,943    2709,285  2854,557
>> shp_alllayers_47  2        40   3743,255  3745,111  83,027   3746,031  3828,138    3549,364  3866,612
>> shp_alllayers_47  4        80   6960,23   6960,948  189,75   6963,533  7150,698    6432,945  7431,541
>>
>>
>> Linux 64 server vm
>> JVM: -Xms128m -Xmx128m (low mem)
>>
>> Laurent
>>
>> 2013/4/14 Andrea Aime <andrea.aime at geo-solutions.it>
>>
>> On Tue, Apr 9, 2013 at 3:02 PM, Laurent Bourgès
>> <bourges.laurent at gmail.com> wrote:
>>
>> Dear Java2D members,
>>
>> Could someone review the following webrev concerning Java2D
>> Pisces to enhance its performance and reduce its memory
>> footprint (RendererContext stored in thread local or concurrent
>> queue):
>> http://jmmc.fr/~bourgesl/share/java2d-pisces/webrev-1/
>>
>>
>> FYI I fixed file headers in this patch and signed my OCA 3 weeks
>> ago.
>>
>> Remaining work:
>> - cleanup (comments ...)
>> - statistics to perform auto-tuning
>> - cache / memory cleanup (SoftReference ?): use hints or System
>> properties to adapt it to use cases
>> - another problem: fix clipping performance in Dasher / Stroker
>> for segments out of bounds
>>
>> Could somebody support me, i.e. help me work on these tasks or
>> just discuss the Pisces algorithm / implementation?
>>
>>
>> Hi,
>> I would like to express my support for this patch.
>> Given that micro-benchmarks have already been run, I took the patch
>> for a spin in a large, real world benchmark instead,
>> the OSGeo WMS Shootout 2010 benchmark, for which you can see the
>> results here:
>> http://www.slideshare.net/gatewaygeomatics.com/wms-performance-shootout-2010
>>
>> The presentation is long, but suffice it to say all Java based
>> implementations took quite the beating due to the
>> poor scalability of Ductus with antialiased rendering of vector data
>> (for an executive summary just look
>> at slide 27 and slide 66, where GeoServer, Oracle MapViewer and
>> Constellation SDI were the
>> Java based ones)
>>
>> I took the same tests and ran them again on my machine (different
>> hardware than the original tests, so don't try to compare
>> the absolute values), using Oracle JDK 1.7.0_17, OpenJDK 8 (a
>> checkout a couple of weeks old) and the
>> same, but with Laurent's patches applied.
>> Here are the results, throughput (in maps generated per second) with
>> the load generator (JMeter) going
>> up from one client to 64 concurrent clients:
>>
>> Threads  JDK 1.7.0_17  OpenJDK 8, vanilla  OpenJDK 8 + Pisces renderer improvements  Pisces renderer performance gain, %
>> 1        13,97         12,43               13,03                                     4,75%
>> 2        22,08         20,60               20,77                                     0,81%
>> 4        34,36         33,15               33,36                                     0,62%
>> 8        39,39         40,51               41,71                                     2,96%
>> 16       40,61         44,57               46,98                                     5,39%
>> 32       41,41         44,73               48,16                                     7,66%
>> 64       37,09         42,19               45,28                                     7,32%
>>
>>
>> Well, first of all, congratulations to the JDK developers, don't
>> know what changed in JDK 8, but
>> GeoServer seems to like it quite a bit :-).
>> That said, Laurent's patch also gives a visible boost, especially
>> when several concurrent clients are
>> asking for the maps.
>>
>> Mind, as I said, this is no micro-benchmark, it is a real
>> application load doing a lot of I/O
>> (from the operating system file cache), other processing before the
>> data reaches the rendering
>> pipeline, and then has to PNG encode the output BufferedImage (PNG
>> encoding being rather
>> expensive), so getting this speedup from just a change in the
>> rendering pipeline is significant.
>>
>> Long story short... personally I'd be very happy if this patch was
>> going to be integrated in Java 8 :-)
>>
>> Cheers
>> Andrea
>>
>> --
>> ==
>> GeoServer training in Milan, 6th & 7th June 2013! Visit
>> http://geoserver.geo-solutions.it
>> for more information.
>>
>> ==
>>
>> Ing. Andrea Aime
>> @geowolf
>> Technical Lead
>>
>> GeoSolutions S.A.S.
>> Via Poggio alle Viti 1187
>> 55054 Massarosa (LU)
>> Italy
>> phone: +39 0584 962313
>> fax: +39 0584 1660272
>> mob: +39 339 8844549
>>
>> http://www.geo-solutions.it
>> http://twitter.com/geosolutions_it
>>
>> -------------------------------------------------------
>>
>>
>>