[OpenJDK Rasterizer] Marlin renderer contribution for review

Wed Mar 18 16:17:40 UTC 2015

Jim,

thanks for your review.

Please read my comments below:

2015-03-18 2:13 GMT+01:00 Jim Graham <james.graham at oracle.com>:

> Hi Laurent,
>
> I'm now looking at your webrev.  There are a number of things that I think
> we should look at once it goes in, and I'll mention those in subsequent
> emails, but I want to just deal with basic issues that we should deal with
> before the first commit here...
>

Ok, I agree that many things can be improved, corrected and fixed after the
marlin patch is committed into the graphics rasterizer repository (sandbox
?)

>
> AAShapePipe - the static println in TileState should probably be removed
> (for one thing, it will now be run regardless of whether or not Marlin is
> selected).
>

OK.

>
> Is there anything about the patch to AAShapePipe that is specific to
> Marlin?  Perhaps it should have its own bugid?
>

Not anymore: previously I preallocated the tile array according to the
PiscesCache.TILE_SIZE.

I agree the AAShapePipe concurrency patch should be considered on its own
(bugId, review ...) as it concerns any renderer.

FYI here are MapBench benchmark synthetic results with marlin 0.5.6 (JDK
1.8) with / without the AAShapePipe:
Marlin 0.5.6 TL AAShapePipe   Pct95    132.498    131.176    132.641
133.677
Marlin 0.5.6 TL NoAAShapePipe Pct95    137.443    132.805    136.462
 143.062

As you can see, this patch is important for the scalability (concurrent
rendering threads) : up to 10% faster with 4threads.
I think it is related to both synchronization and GC impact.

> Even though doCleanDirty is turned off, Dasher.java line 153 looks like it
> may overflow if you've ever had to widen the firstSegs array with it turned
> on.
>

Good catch: I will fix it by clearing the all array and not [0...
firstSegUsed) !

>
> Stroker.java, 312 - indent looks off.
>

OK, I will also move all constants at the beginning of the class:

   private final static float C = 0.5522847498307933f;

Use of "fluent API".  In general I think that is a nice way to do thing in
> a builder-style API where a whole chain of related methods all return
> "this" to facilitate the chaining, but identifying isolated methods that
> happen to often be used just prior to a return of the object they were
> executing on seems to me to be more of a break in object encapsulation.  It
> adds a burden to the called method to deal with a return value so that we
> can eliminate a line of code in the caller.  But, in reality the caller
> still has to return something - it just returns a different value (the
> value saved in a local variable instead of the value returned by the called
> method) so you are actually adding code to the interface to make things
> appear simpler.  This isn't a veto of those cases here, but I wanted to
> point out that their value in isolated cases is questionable unlike the
> "chained setter methods" case for which they were initially targeted.
>

I advocate I used the "fluent"-like approach to modify less lines and avoid
too much changes.
In Pisces, several instances were created like "r = new Renderer(params)"
then I replaced by "r = rdrCtx.renderer.init(params)":

-            r = new Renderer(3, 3,-
clip.getLoX(), clip.getLoY(),

+            r = rdrCtx.renderer.init(clip.getLoX(), clip.getLoY(),
                              clip.getWidth(), clip.getHeight(),
                              pi.getWindingRule());

I could rewrite it like:
"r = rdrCtx.renderer;
r.init(params)"

but it costs one line more and it is less "fluent" !
I prefer using local variables for performance and avoid using class
members when possible !

>
> TransformingPathConsumer2D (and any other new code) - All methods should
> have a blank line between them.  This notably applies to the Path2D
> wrapper.  Also, a common style point throughout 2D is that when there is a
> continuation line in an argument list declaration, we place the "{" brace
> on a new line all by itself so that the eye has a good landmark for where
> the body of the method begins.  (See how the old "curveTo()" methods differ
> from the new one created in Path2DWrapper.)
>

OK, I will follow that syntax rule and fix the Path2DWrapper class.

>
> TransformingPathConsumer2D - the if/else style being replaced here was
> actually intentional.  The lack of a trailing "return" statement ensures
> that all cases were specifically handled above in either an if, or its
> associated else block, and that none of the cases ended up falling through
> to a subsequent "catch-all" operation by accident.  Nothing is broken here,
> I just wanted to point out the "whys" of the original code style.  In the
> new code, there are still some bare "} else {" lines left which raises the
> question - what was the specific style goal in whether an "} else {" was
> eliminated or left intact?  The prior style goal was "no fallthrough cases
> anywhere" - the new style seems to be "usually fallthrough, but sometimes
> explicit alternate return values in an else clause".
>

Ok I will revert such changes even if I like the fallthrough approach (more
obvious and readable).

> -------- Here are some "things to consider later" ---------
>
> Is there ever a reason to use doCleanDirty?  It might matter for arrays of
> objects to null out old references so as to not interfere with GC, but
> don't most growable arrays get filled in as they are used so old data isn't
> really processed?  I notice that it is turned off right now so perhaps
> that's an argument in favor of my comment, but it seems to me that having
> to clean out a primitive array would be an odd rare exception and so simply
> adding in a clear where it is needed in the unlikely case that it is needed
> would be fine (and get rid of a lot of cruft in the code).
>

The doCleanDirty is only useful for debugging reasons:  dirty arrays (int,
byte, float) are never filled with zero and always contain junk / old data;
when I want to look at (netbeans debugger) or log an array, I need to use
that flag to ensure it is clean (zero filled) before usage.

> I just realized the main case where we want to deal with cleaning out
> dirty arrays and it would be the crossings accumulation arrays.  It seems
> to me that those are special cases and so we should concentrate the
> cleaning code on those explicitly and not have a global "lets clean all
> arrays everywhere" setting.
>

Agreed.
Let's clarify my approach in marlin: it uses mainly reused arrays (int,
byte, float) => no allocation (GC overhead).
To achieve the best performance, it only clears dirty parts ie used ones
(see the many used mark fields).

Dirty arrays are named "dirty" to indicate that these arrays are used in a
"unsafe" manner ie never zero-filled (always junk). It means the code must
be good enough to always set values and get those that have be set.

The memory footprint of one RendererContext is ~ 512Kb (small enough) that
is stored in either ThreadLocal (few rendering threads case) or
ConcurrentLinkedQueue (web server case, by default) using SoftRereference.

If arrays are too small, then marlin use an ArrayCache (Int, Float, Dirty)
to get larger arrays but also recycle them (and zero-filled).
These caches belong to each RendererContext and may represent a larger
memory footprint (~ few megabytes ?) but it will depend on the drawing
complexity (image size, shape segments, curves, pixel coverage map ...)

I think these cache should use WeakReferences to avoid wasting too much
memory (more volatile), but it remains to do.

> Isn't declaring flatLeafCoefficients as final enough to not need to cache
> it's reference in a local variable?  Also, I would think you would gain
> more by simply deleting the array and storing the coefficients in fields
> instead?  An array index lookup should be no simpler than a local field
> access.
>

In Dasher ? maybe it is minor optimisation but I prefer using local
variables instead of field access for performance reason.

> I was also skeptical of these Iterators that Denis created and thought
> that it may be better to just manually iterate the subdivisions instead of
> creating (or even keeping) an object around to hide the simple job of
> iterating a curve.  I now wonder if inlining the actions of the BreakPoint
> iterator might not improve performance for complex paths and leave less
> state that we need to manage.
>

To be discussed.

> FastMath - In general, if we use "(int) ceil()" a lot, we might want to
> consider offsetting the math so that all values are negative and then use a
> cast to (int).  In particular, if crossings are going to use "ceil(v -
> 0.5)" which is the proper way to do half-open intervals with centers
> included on top and left sides, then simply shifting everything once so
> that the entire range of interesting X,Y are negative would mean that all
> crossings calculations could be done by casting to integer, especially if
> we also offset them all by 0.5 as well.
>

Only normalization does the ceil(v - 0.5) but the crossings are directly
casted to int:

+                        // update crossing ( x-coordinate + last bit
= orientation):+                        cross = (((int) f_curx) << 1)
| _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1; /* last bit contains
orientation (0 or 1) */

I noted several times to improve that rounding issue with subpixel centers !
But it is less important than improving quality for nearly vertical or
horizontal lines (large subpixel steps): I have ideas related to span
interpolation or using adaptive scanline steps... (in future)

I use FastMath mainly to handle boundaries and coordinates can be either
positive or negative ... so I do not have the choice to deal with negative
values !

Renderer
addLine
  316:  final int firstCrossing = Math.max(FastMath.ceil(y1), _boundsMinY);
  319:  final int lastCrossing  = Math.min(FastMath.ceil(y2),  boundsMaxY);
dispose
  546:  final int iminY =     Math.max(FastMath.ceil(edgeMinY), _boundsMinY)
  548:  final int imaxY = 1 + Math.min(FastMath.ceil(edgeMaxY),  boundsMaxY)
endRendering
1133:  final int spminX = Math.max(FastMath.ceil(edgeMinX), boundsMinX);
1134:  final int spmaxX = Math.min(FastMath.ceil(edgeMaxX), boundsMaxX - 1);
1135:  final int spminY = Math.max(FastMath.ceil(edgeMinY), boundsMinY);
1136:  final int spmaxY = Math.min(FastMath.ceil(edgeMaxY), boundsMaxY - 1);

MarlinRenderingEngine
NormalizingPathIterator
currentSegment
  512:  x_adjust = (float)FastMath.floor(coord) + rval - coord;
  516:  y_adjust = (float)FastMath.floor(coord) + rval - coord;
  520:  x_adjust = (float)FastMath.floor(coord + lval) + rval - coord;
  524:  y_adjust = (float)FastMath.floor(coord + lval) + rval - coord;

FastMath is optional (use -Dsun.java2d.renderer.useFastMath=false) but is ~
5% faster !

> FastMath - if we can prove that no values <0 are interesting (as in, the
> only interesting property about them is that they are less than 0), then we
> can just use a cast to integer.  This is particular true if we clamp a
> value to >= 0 after we convert it to an integer - in that case we might as
> well have used an int cast rather than a floor() computation.
>

Ok let's discuss that later.

> Eventually we will need to make a decision on all LBO and TODO comments.
> Either make a call and change the code and delete the comment, or submit
> the issue as a new bugid and reference the bugid rather than TODO or LBO.
>

Agreed.

> Those were just a few thoughts/questions that came to mind as I was
> reviewing it, but I didn't want to get bogged down too much in the
> specifics that we can tweak later until we can get an implementation
> published for people to test on so I tried to back off and keep my review
> at a higher level.  I may have more comments as I continue to look through
> the code...
>

Thanks for your time and your comments,

I will prepare soon another webrev,

Laurent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/graphics-rasterizer-dev/attachments/20150318/7bfcbe30/attachment-0001.html>