[OpenJDK Rasterizer] Marlin #4

Jim Graham james.graham at oracle.com
Thu Sep 24 17:26:34 UTC 2015


Hi Laurent,

You are looking at the wrong loop.  It's tough to explain...

The vis_*.c files are only ever compiled and used on Solaris.  They coax
the compiler into emitting VIS instructions, SPARC's counterpart to MMX.
No other build even compiles them.

You were probably confused because they look like the implementations of 
the functions you were looking for and you never saw any other 
implementation of those functions.  That's because all of the software 
loops are actually constructed using a very complicated system of 
macros.  If you look at loops/IntArgbPre.c you will see a bunch of macro 
calls at the top which expand to declarations of functions such as 
"IntArgbPreSrcMaskFill".  Then you will see a structure filled with macro 
invocations, one per loop function, which expand to the entries 
describing the loops.  Then you will see a bunch more macro 
invocations, one per line, each of which surprisingly expands to an 
entire function.
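
To give a feel for the pattern, here is a minimal sketch of the same 
three-step idea (declare, register in a table, define).  Everything in it 
is made up for illustration: the macro names, the "Demo" surface types, 
the LoopEntry structure and find_loop() are assumptions of mine and are 
not the real contents of LoopMacros.h or IntArgbPre.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: a loop name plus a pointer to the loop. */
typedef struct {
    const char *name;
    void (*func)(uint32_t *dst, size_t n, uint32_t color);
} LoopEntry;

/* Step 1: expands to a function declaration such as
 *   void DemoArgbSolidFill(uint32_t *dst, size_t n, uint32_t color)  */
#define DECLARE_SOLID_FILL(TYPE) \
    void TYPE##SolidFill(uint32_t *dst, size_t n, uint32_t color)

/* Step 2: expands to one table entry, e.g.
 *   { "DemoArgb" "SolidFill", DemoArgbSolidFill }  */
#define REGISTER_SOLID_FILL(TYPE) \
    { #TYPE "SolidFill", TYPE##SolidFill }

/* Step 3: expands to the entire function body for one surface type.
 * STORE is a per-type macro that converts the color before storing. */
#define DEFINE_SOLID_FILL(TYPE, STORE)                            \
    void TYPE##SolidFill(uint32_t *dst, size_t n, uint32_t color) \
    {                                                             \
        size_t i;                                                 \
        for (i = 0; i < n; i++) {                                 \
            dst[i] = STORE(color);                                \
        }                                                         \
    }

/* Hypothetical per-type pixel store rules. */
#define STORE_AS_IS(c)   (c)
#define STORE_OPAQUE(c)  ((c) | 0xFF000000u)

/* Macro calls at the top: declarations only. */
DECLARE_SOLID_FILL(DemoArgb);
DECLARE_SOLID_FILL(DemoRgbx);

/* A structure full of macro invocations: one entry per loop. */
static const LoopEntry demo_loops[] = {
    REGISTER_SOLID_FILL(DemoArgb),
    REGISTER_SOLID_FILL(DemoRgbx),
};

/* One macro invocation per line, each expanding to a whole function. */
DEFINE_SOLID_FILL(DemoArgb, STORE_AS_IS)
DEFINE_SOLID_FILL(DemoRgbx, STORE_OPAQUE)

/* Look a loop up by name; returns NULL when no entry matches. */
const LoopEntry *find_loop(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(demo_loops) / sizeof(demo_loops[0]); i++) {
        if (strcmp(demo_loops[i].name, name) == 0) {
            return &demo_loops[i];
        }
    }
    return NULL;
}
```

Note that no function named DemoArgbSolidFill is ever spelled out in the 
file, which is why searching the sources for a loop's name turns up so 
little.  Running only the preprocessor over such a file (gcc -E, for 
instance) prints the expanded functions.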

You'd have to do some serious tracing of macros to see what the code 
looks like, but most of the macros expand from either IntArgb.h or 
LoopMacros.h...

			...jim

On 9/24/15 7:59 AM, Laurent Bourgès wrote:
> Sergey,
>
> I managed to create a new benchmark with JMH + perfasm profiler:
> http://cr.openjdk.java.net/~lbourges/jmh/ellipse_fill/
>
> See MyBenchMark.java that fills an ellipse with radius in {"100", "500",
> "900", "1400"}
>
> I tested with both Oracle JDK8 and Oracle JDK9 EA b81, i.e. using the
> ductus rendering engine:
> http://cr.openjdk.java.net/~lbourges/jmh/ellipse_fill/bench_jdk8.log
> http://cr.openjdk.java.net/~lbourges/jmh/ellipse_fill/bench_jdk9.log
>
> JDK8:
> Benchmark                     (size)  Mode  Cnt  Score   Error  Units
> MyBenchmark.fillEllipse          100  avgt    3  0,207 ± 0,034  ms/op
> MyBenchmark.fillEllipse          500  avgt    3  1,931 ± 0,112  ms/op
> MyBenchmark.fillEllipse          900  avgt    3  5,158 ± 0,346  ms/op
> MyBenchmark.fillEllipse         1400  avgt    3  9,628 ± 1,321  ms/op
>
> JDK9:
> Benchmark                     (size)  Mode  Cnt   Score   Error  Units
> MyBenchmark.fillEllipse          100  avgt    3   0,223 ± 0,005  ms/op
> MyBenchmark.fillEllipse          500  avgt    3   2,069 ± 0,044  ms/op
> MyBenchmark.fillEllipse          900  avgt    3   5,393 ± 0,285  ms/op
> MyBenchmark.fillEllipse         1400  avgt    3  12,305 ± 0,104  ms/op
>
> JDK9 is slower in this test: about 5% to 8% for sizes 100 to 900 and
> about 28% for size 1400.
>
>
> I tried to interpret the profiler info but I just noticed the hotspots
> are located in native code (libawt.so):
>
> JDK8:
>
> ....[Hottest Regions]...............................................................................
>   48,53%   51,78%  [0x7f78197f9ae1:0x7f78197f9b27] in IntArgbPreSrcMaskFill (libawt.so)
>   11,27%   11,68%  [0x7f78197f9900:0x7f78197f9aa6] in IntArgbPreSrcMaskFill (libawt.so)
>    9,91%   11,58%  [0x7f7813bc6527:0x7f7813bc65bd] in writeAlpha8 (libdcpr.so)
>    6,51%    2,73%  [0x7f7813bc5471:0x7f7813bc560a] in processJumpBuffer; processSubBufferInTile (libdcpr.so)
>    2,13%    2,16%  [0x7f7813bc6436:0x7f7813bc6506] in writeAlpha8 (libdcpr.so)
>
>
> JDK9:
> ...[Hottest Regions]...............................................................................
>   61,90%   66,72%  [0x7f71ae7f5678:0x7f71ae7f5837] in IntArgbPreSrcMaskFill (libawt.so)
>   10,06%    5,40%  [0x7f71acb0aa77:0x7f71acb0afa9] in processJumpBuffer; processSubBufferInTile; reset.isra.4 (libdcpr.so)
>    9,23%   10,45%  [0x7f71acb0bb68:0x7f71acb0bc7d] in writeAlpha8 (libdcpr.so)
>
> So this test is using the software pixel loop [IntArgbPreSrcMaskFill].
>
> I looked at the source code and compared the libawt / java2d / loops /
> vis_IntArgbPre_Mask.c from openjdk8 and openjdk9 but those are the same !
>
> Can it be a JNI issue or a compilation issue (gcc settings ...) with
> that native code ?
>
> Any idea, Sergey ?
>
> Thanks for the tips,
> Laurent
>
> 2015-09-24 4:17 GMT+02:00 Sergey Bylokhov <Sergey.Bylokhov at oracle.com
> <mailto:Sergey.Bylokhov at oracle.com>>:
>
>     On 22.09.15 0:15, Laurent Bourgès wrote:
>
>         Conclusion:
>         The new patch seems promising as it is very close to ductus
>         performance.
>         Filling ellipse seems slower on OpenJDK9 (492 / 437 = 12%
>         slower) ! Any
>         MaskFill changes ?
>
>
>     For such checks I suggest using JMH with "-prof perfasm". It will
>     provide really good info per Java method (before/after compilation),
>     including the assembly, and the log also includes the native methods.
>     An example looks like this:
>     http://cr.openjdk.java.net/~shade/jmh/perfasm-sample.log
>
>     http://openjdk.java.net/projects/code-tools/jmh
>
>     It is really useful in java2d because sometimes it is unclear where
>     the problem occurs (Java, native code, new objects, etc.), and any
>     Java profiler can change the behavior of the application.
>
>     --
>     Best regards, Sergey.
>
>

