From sergey.bylokhov at oracle.com  Wed Dec 21 14:44:12 2016
From: sergey.bylokhov at oracle.com (Sergey Bylokhov)
Date: Wed, 21 Dec 2016 17:44:12 +0300
Subject: [OpenJDK Rasterizer] Marlin #4
In-Reply-To: <5614169C.9040809@oracle.com>
References: <560E9409.30206@oracle.com> <5614169C.9040809@oracle.com>
Message-ID: <75582676-0857-4AB6-A6C7-BCE3115B1501@oracle.com>

Hi, Laurent.
Can you please check the next patch:
==========
diff -r 8a61c000a194 make/lib/Awt2dLibraries.gmk
--- a/make/lib/Awt2dLibraries.gmk	Tue Dec 20 09:52:14 2016 -0800
+++ b/make/lib/Awt2dLibraries.gmk	Wed Dec 21 17:33:36 2016 +0300
@@ -222,6 +222,7 @@
 # applies to debug builds.
 ifeq ($(TOOLCHAIN_TYPE), gcc)
   BUILD_LIBAWT_debug_mem.c_CFLAGS := -w
+  LIBAWT_CFLAGS += -fgcse-after-reload
 endif
 
 $(eval $(call SetupNativeCompilation,BUILD_LIBAWT, \
==========

It seems that this is the simplest version which produces good
performance results and is safe enough to be integrated. On my system
(Ubuntu, gcc 5.4) it speeds up the default rasterizer from 8.400 to
6.200 ms/op +- 20%. The default rasterizer in OracleJDK 8u112 scores
6.500.
The fix does not affect the public jdk9 (which is built by RE on gcc
4.9.2); it seems gcc 4.9.2 produces good results with and without this
option.

>
> We should also be wary of compiler options that are a win on one
> processor family and a loss on another.  Anything that schedules
> instructions may be specific to a particular generation of CPUs, for
> instance.  Or for i5 vs i7 vs M(obile)...
>
> 			...jim
>
> On 10/2/15 9:10 AM, Laurent Bourgès wrote:
>> Sergey,
>>
>> thanks for the information:
>>
>> I tried your gcc options on my ubuntu 14.04 (v4.8.4) and it is
>> actually slightly faster: 10% on my fill ellipse test (450ms vs 490ms).
>>
>> I tested with your jmh test, and the difference became bigger on
>> 1400 size.
>>
>> Interesting; I will try too.
>>
>> Do you know which gcc compiler and options are used to build
>> JavaSE EA?
>>
>> I guess that the compiler options in the makefile are the same,
>> plus some default gcc options:
>> jdk8:
>> gcc (GCC) 4.3.0 20080428 (Red Hat-8) C compiler version 4.3.0-8)
>>
>> jdk9:
>> gcc-4.8.2 - OEL5.5
>>
>> However, the gcc compilers are different: 4.3 vs 4.8.2 !
>>
>> So it may be worth comparing their different optimization options; I
>> guess somebody already looked at that !
>>
>> Moreover, the linux distrib may define default options.
>>
>> I will try to figure out all compiler options (command line +
>> defaults) on my machine.
>>
>> It is not simple to find an option which will help everyone.
>> The two options I suggested are the minimum subset of -O3 needed to
>> get the maximum performance; both seem reasonable. Actually, if I
>> change -O2 to -O3 (OPTIMIZATION := LOW => OPTIMIZATION := HIGHEST),
>> performance becomes worse.
>>
>> It is often the case with O3, but your patch seems a good win with
>> only 2 enabled options.
>>
>> What is your build environment ?
>>
>> Ubuntu 14.04 gcc 4.8.4
>>
>> I have the same and I finally got my gcc options:
>> gcc -c -Q -O2 --help=common
>>
>> Here are the differences between O2 and O3 with gcc 4.8.4:
>>
>> gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
>> gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
>> diff /tmp/O2-opts /tmp/O3-opts | grep enabled
>>
>> *> -fgcse-after-reload              [enabled]
>> *> -finline-functions               [enabled]
>>  > -fipa-cp-clone                   [enabled]
>>  > -fpredictive-commoning           [enabled]
>>  > -ftree-loop-distribute-patterns  [enabled]
>>  > -ftree-partial-pre               [enabled]
>> *> -ftree-vectorize                 [enabled]
>> *> -funswitch-loops                 [enabled]
>>  > -fvect-cost-model                [enabled]
>>
>> So we could evaluate some of these options and see what is the best
>> compromise for libawt on gcc 4.8 !
>>
>> Regards,
>> Laurent
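
The O2-versus-O3 comparison quoted above can be repeated for whichever gcc is installed. What follows is a minimal sketch and not part of the thread: the helper names `newly_enabled` and `compare_levels` are hypothetical, and it relies only on the `gcc -c -Q ... --help=optimizers` listing already used above. It assumes each listing line carries the option name in the first column and its state in the second.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the JDK build.

# newly_enabled BASE OTHER
# Print options marked [enabled] in listing OTHER but not in listing BASE.
# Both files are expected to hold `gcc -Q --help=optimizers` output.
newly_enabled() {
    awk '
        $2 != "[enabled]"   { next }               # skip headers, disabled options
        FILENAME == ARGV[1] { base[$1] = 1; next } # remember what BASE enables
        !($1 in base)       { print $1 }           # OTHER only: report additions
    ' "$1" "$2"
}

# compare_levels BASE_LEVEL OTHER_LEVEL, for example: compare_levels -O2 -O3
compare_levels() {
    tmp=$(mktemp -d) || return 1
    gcc -c -Q "$1" --help=optimizers > "$tmp/base"
    gcc -c -Q "$2" --help=optimizers > "$tmp/other"
    newly_enabled "$tmp/base" "$tmp/other"
    rm -rf "$tmp"
}

# Only run the comparison when gcc is actually available.
if command -v gcc >/dev/null 2>&1; then
    compare_levels -O2 -O3
fi
```

The set of options printed depends on the gcc version, so the output on gcc 5.4 or 4.9.2 may differ from the nine options listed for gcc 4.8.4.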
From bourges.laurent at gmail.com  Wed Dec 21 18:01:04 2016
From: bourges.laurent at gmail.com (Laurent Bourgès)
Date: Wed, 21 Dec 2016 19:01:04 +0100
Subject: [OpenJDK Rasterizer] Marlin #4
In-Reply-To: <75582676-0857-4AB6-A6C7-BCE3115B1501@oracle.com>
References: <560E9409.30206@oracle.com> <5614169C.9040809@oracle.com> <75582676-0857-4AB6-A6C7-BCE3115B1501@oracle.com>
Message-ID:

Hi Sergey,

thank you for looking at this problem.

I confirm that your simple patch improves the performance on my laptop
(ubuntu 16.04, gcc 5.4 as yours) with an intel i4700 cpu when I run the
ellipse JMH test.

- ojdk9 without patch:
Benchmark                   (size)  Mode  Cnt   Score    Error  Units
EllipseRdrTest.drawEllipse     100  avgt    6   0,233 ±  0,007  ms/op
EllipseRdrTest.drawEllipse     500  avgt    6   1,203 ±  0,004  ms/op
EllipseRdrTest.drawEllipse     900  avgt    6   2,361 ±  0,458  ms/op
EllipseRdrTest.drawEllipse    1400  avgt    6   4,023 ±  0,028  ms/op
EllipseRdrTest.fillEllipse     100  avgt    6   0,198 ±  0,010  ms/op
EllipseRdrTest.fillEllipse     500  avgt    6   1,858 ±  0,046  ms/op
EllipseRdrTest.fillEllipse     900  avgt    6   4,962 ±  0,393  ms/op
EllipseRdrTest.fillEllipse    1400  avgt    6  10,475 ±  0,035  ms/op

- ojdk9 with patch:
Benchmark                   (size)  Mode  Cnt   Score    Error  Units
EllipseRdrTest.drawEllipse     100  avgt    6   0,232 ±  0,006  ms/op
EllipseRdrTest.drawEllipse     500  avgt    6   1,203 ±  0,021  ms/op
EllipseRdrTest.drawEllipse     900  avgt    6   2,355 ±  0,467  ms/op
EllipseRdrTest.drawEllipse    1400  avgt    6   3,835 ±  0,632  ms/op
EllipseRdrTest.fillEllipse     100  avgt    6   0,191 ±  0,010  ms/op
EllipseRdrTest.fillEllipse     500  avgt    6   1,793 ±  0,029  ms/op
EllipseRdrTest.fillEllipse     900  avgt    6   4,741 ±  0,062  ms/op
EllipseRdrTest.fillEllipse    1400  avgt    6   8,810 ±  0,100  ms/op

- reference jdk8 with marlin 0.7.4 (comparable):
Benchmark                   (size)  Mode  Cnt   Score    Error  Units
EllipseRdrTest.drawEllipse     100  avgt    6   0,231 ±  0,002  ms/op
EllipseRdrTest.drawEllipse     500  avgt    6   1,199 ±  0,013  ms/op
EllipseRdrTest.drawEllipse     900  avgt    6   2,282 ±  0,006  ms/op
EllipseRdrTest.drawEllipse    1400  avgt    6   3,600 ±  0,133  ms/op
EllipseRdrTest.fillEllipse     100  avgt    6   0,189 ±  0,001  ms/op
EllipseRdrTest.fillEllipse     500  avgt    6   1,777 ±  0,009  ms/op
EllipseRdrTest.fillEllipse     900  avgt    6   4,856 ±  0,110  ms/op
EllipseRdrTest.fillEllipse    1400  avgt    6  10,252 ±  0,302  ms/op

If needed, I can also run the test against Oracle JDK9 EA builds.

Cheers & Happy holidays,
Laurent
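
To put the reported scores in relative terms, the reduction in ms/op can be computed directly. This is a small sketch rather than part of the benchmark: `pct_change` is a hypothetical helper, and the scores are retyped with decimal points because the JMH output in this thread was produced under a decimal-comma locale.

```shell
#!/bin/sh
# pct_change BEFORE AFTER
# Print the reduction in time per operation as a percentage of BEFORE
# (a positive value means AFTER is faster). Hypothetical helper.
pct_change() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_change 10.475 8.810   # fillEllipse 1400, ojdk9 without/with patch: prints 15.9
pct_change 8.400 6.200    # default rasterizer figures from Sergey: prints 26.2
pct_change 4.023 3.835    # drawEllipse 1400, ojdk9 without/with patch: prints 4.7
```

The drawEllipse 1400 difference (0,188 ms/op) is smaller than the ± 0,632 error reported for the patched run, so of these rows only the fillEllipse 1400 gain appears to be clearly outside the measurement noise.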