[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements
Jim Graham
james.graham at oracle.com
Wed May 1 00:43:18 UTC 2013
Hi Laurent,
There's a lot here to go over... ;)
On 4/30/13 9:29 AM, Laurent Bourgès wrote:
> Jim,
>
> here is the latest webrev:
> http://jmmc.fr/~bourgesl/share/java2d-pisces/webrev-3/
I'll answer questions first and then have a look...
> I have a few questions regarding renderer edge handling and float ceil calls:
I have a personal philosophy for all rendering engines that I've
produced, but I'm not sure the philosophy was adopted for this code. I
like to use the same rounding for every operation and have all ranges be
specified as semi-open regions - x1 <= x < x2 - for example.
The "insideness" rule is most naturally expressed by semi-open intervals
since pixel centers that fall on a vertical boundary are inside if the
interior region is to their right (i.e. the x1 <= x half of that
equation above) and they are outside if the space to their right is
non-interior (i.e. the x < x2 half of that equation). A similar rule is
also used vertically.
The rounding operation which computes this would be:
xc = computed crossing value for the current scan line;
ceil(xc - 0.5);
This maps (for some integer k):
k+0.4999 to k
k+0.5000 to k
k+0.5001 to k+1
Which is exactly what you want for the beginning of a region. If the
crossing is exactly at the pixel center then that is the furthest right
that you can cross the scan line and still include pixel k. If you just
miss the center of a pixel to the right, then the first pixel to be
included is k+1. That function computes it correctly.
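In code, that rounding rule would look like the following minimal sketch (`crossingToPixel` is a hypothetical helper name for illustration, not Pisces API):

```java
public class CrossingRounding {
    // ceil(xc - 0.5): the first pixel included by a left edge crossing
    // at xc - or, applied to a right edge, the first pixel excluded.
    static int crossingToPixel(double xc) {
        return (int) Math.ceil(xc - 0.5);
    }

    public static void main(String[] args) {
        System.out.println(crossingToPixel(7.4999)); // 7
        System.out.println(crossingToPixel(7.5000)); // 7 - exactly on the center still includes
        System.out.println(crossingToPixel(7.5001)); // 8 - just missed the center
    }
}
```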
Similarly, if you apply the same formula to the right edge of a region
then you compute "the first pixel to not be included in the region".
Consider that if you miss a pixel center on that right edge by falling
slightly to the left of it, then that pixel is "outside" and it is the
first such pixel to be "outside". If you hit its pixel center exactly,
then the insideness rule also says that it is "outside". Only when you
just miss it by falling slightly to the right of its center would that
pixel still be included. The above formula computes that value if you
take its answer to be the "first pixel not included in the region".
This lets you perform the same rounding on every value and treat every
pair of "inside trigger" to "outside trigger" as a half-open interval.
Note that otherwise you'd have to compute every floating point crossing,
then run the EvenOdd vs. ZeroWinding rule on them, and finally only do
the rounding after you knew which were left edges and which were right
(which you don't know until you've computed every crossing on the scan
line). Since this rounding rule is symmetric given half-open intervals
then you can round everything before you've analyzed the results of the
winding rules.
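To illustrate why the symmetry pays off, here is a hedged sketch (not the Pisces code) of the even-odd case: because every crossing gets the identical rounding, the values can be rounded to pixel indices up front, and simply pairing the sorted results yields the half-open spans:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EvenOddSpans {
    // Even-odd fill: sorted, already-rounded crossings pair up into
    // half-open spans [c[0], c[1]), [c[2], c[3]), ...
    static List<int[]> spans(int[] crossings) {
        int[] c = crossings.clone();
        Arrays.sort(c);
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i + 1 < c.length; i += 2) {
            if (c[i] < c[i + 1]) { // empty half-open spans drop out naturally
                out.add(new int[] { c[i], c[i + 1] });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two regions sharing the boundary pixel 5 produce [2, 5) and
        // [5, 9) - the shared coordinate is never counted twice:
        for (int[] s : spans(new int[] { 5, 2, 9, 5 })) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```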
I then extend that philosophy to every set of coordinates I deal with,
be it horizontal ranges, vertical ranges, shape rasterized results, or
clip results, and the math usually works out in the simplest and most
elegant way possible.
For an AA renderer, it isn't clear that the sub-pixel computations
should be as sub-sub-pixely accurate, but I'd like to think that the
philosophy should be kept down to the lowest layers if we can,
especially as this code can be tuned down to a single sub-pixel per real
pixel and then it definitely should be as accurate as it can with the
insideness rule (even though I doubt it will ever be set up that way).
Another consideration is the vertices on an enclosing polygon. You
compute one segment to where it ends at xe1,ye1 and start computing the
next segment from where it starts at xs2,ys2, but for adjacent segments
in an outline, they start and end on a shared common point so xe1==xs2
and ye1==ys2. If you compute crossings for both segments at that shared
point then you have a chance of recording the crossings twice for that
tiny chunk of the outline. You could special case the last point in a
segment, but that gets into lots of tricky tests. On the other hand, if
all intervals everywhere are half-open then anything you compute for
xe1,ye1 would represent "the first line/pixel that segment 1 does not
cause to be included" and computing the same values for xs2,ys2 would
naturally generate "the first line/pixel that segment 2 causes to be
included" and only one of them would induce inclusion on that scan line
or pixel. This gets even more valuable when you consider all cases of
segment 1 going down to xe1,ye1 and segment 2 going back up from that
point, or up/down or up/up or down/down - and also left/left or
left/right or right/right or right/left. All possible cases of how one
segment arrives at xe1,ye1 and the next segment leaves from that same
point just naturally fall out from using symmetric rounding and
half-open intervals.
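The shared-vertex argument can be seen numerically in a small sketch (a hypothetical helper mirroring the `ceil`-based rounding used in `addLine`):

```java
public class HalfOpenScanlines {
    // Half-open scanline range [ceil(y1), ceil(y2)) covered by an edge
    // going downward from y1 to y2 (assumes y1 <= y2).
    static int[] scanlineRange(float y1, float y2) {
        return new int[] { (int) Math.ceil(y1), (int) Math.ceil(y2) };
    }

    public static void main(String[] args) {
        // Two adjacent segments sharing the endpoint y = 4.7f:
        int[] a = scanlineRange(1.3f, 4.7f); // [2, 5)
        int[] b = scanlineRange(4.7f, 8.2f); // [5, 9)
        // The shared point lands in exactly one range - no scanline is
        // crossed twice and none is skipped:
        System.out.println(a[1] == b[0]); // true
    }
}
```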
I wanted to get that out to see if we were on the same page or if there
was another way of doing the math that you are familiar with that might
also make things easier.
I'm pretty sure that most of the code in the current version of Pisces
uses that same philosophy, but I seem to recall that there were a couple
of places that Denis was more comfortable with fully closed intervals.
I don't remember the details, but be on the watch for it. Also, I don't
recall if it used a half-sub-pixel bias anywhere or simply punted on it
as not being visible enough to worry about strict "center of sub-pixel"
comparisons.
Now on to the specific questions:
> private void addLine(float x1, float y1, float x2, float y2) {
> ...
> // LBO: why ceil ie upper integer ?
> final int firstCrossing = Math.max(ceil(y1), boundsMinY); // upper integer
> final int lastCrossing = Math.min(ceil(y2), boundsMaxY); // upper integer
Perhaps lastCrossing is misnamed. I think it represents the end of the
interval which is half-open. Does that match the rest of the operations
done on it?
If every coordinate has already been biased by the -0.5 then ceil is
just the tail end of the rounding equation I gave above.
If not, then the half-open considerations still apply if you use the
same rounding for both the top and the bottom of the segment. If
lastCrossing represents the "bottom of the shape" then it should not be
included. If the next segment continues on from it, then its y1 will be
computed in the first equation and will represent the first scanline
that the next segment participates in.
> => firstCrossing / lastCrossing should use lower and upper integer
> values (rounding issues) ?
Do they make sense now given my explanation above? Perhaps they should
be renamed to indicate their half-openness?
> boolean endRendering() {
> // TODO: perform shape clipping to avoid dealing with segments
> out of bounding box
>
> // Ensure shape edges are within bbox:
> if (edgeMinX > edgeMaxX || edgeMaxX < 0f) {
> return false; // undefined X bounds or negative Xmax
> }
> if (edgeMinY > edgeMaxY || edgeMaxY < 0f) {
> return false; // undefined Y bounds or negative Ymax
> }
I'd use min >= max since if min==max then I think nothing gets generated
as a result of all edges having both the in and out crossings on the
same coordinate. Also, why not test against the clip bounds instead?
The code after that will clip the edgeMinMaxXY values against the
boundsMinMax values. If you do this kind of test after that clipping is
done (on spminmaxxy) then you can discover if all of the coordinates are
outside the clip or the region of interest.
> // why use upper integer for edge min values ?
Because an edge falling between sample points includes only the sample
point "above" it. (Or "un-includes" it if it is a right/bottom edge.)
> => Here is the question: why use ceil instead of floor ?
>
> final int eMinX = ceil(edgeMinX); // upper positive int
> if (eMinX > boundsMaxX) {
> return false; // Xmin > bbox maxX
> }
>
> final int eMinY = ceil(edgeMinY); // upper positive int
> if (eMinY > boundsMaxY) {
> return false; // Ymin > bbox maxY
> }
>
> int spminX = Math.max(eMinX, boundsMinX);
> int spmaxX = Math.min(ceil(edgeMaxX), boundsMaxX); // upper positive int
> int spminY = Math.max(eMinY, boundsMinY);
> int spmaxY = Math.min(ceil(edgeMaxY), boundsMaxY); // upper positive int
>
> int pminX = spminX >> SUBPIXEL_LG_POSITIONS_X;
> int pmaxX = (spmaxX + SUBPIXEL_MASK_X) >> SUBPIXEL_LG_POSITIONS_X;
> int pminY = spminY >> SUBPIXEL_LG_POSITIONS_Y;
> int pmaxY = (spmaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y;
>
> // store BBox to answer ptg.getBBox():
> this.cache.init(pminX, pminY, pmaxX, pmaxY);
>
> In PiscesCache:
> void init(int minx, int miny, int maxx, int maxy) {
> ...
> // LBO: why add +1 ??
> bboxX1 = maxx /* + 1 */;
> bboxY1 = maxy /* + 1 */;
I think the values passed in are "the smallest and largest values we
encountered" and so adding one computes the first coordinate that is
"not inside the sample region" as per the half-open philosophy.
I tend to use and encourage "min/max" naming for closed-interval values
and xy0/xy1 or xy1/xy2 for half-open-interval values.
> => piscesCache previously added +1 to maximum (upper loop condition y < y1)
> but the pmaxX already uses ceil (+1) and (spmaxY + SUBPIXEL_MASK_Y)
> ensures the last pixel is reached.
spmaxY + SUBPIXEL_MASK_Y computes the last sub-pixel piece of the last
pixel. It is essentially equivalent to "lastrow.9999999". So, the first
coordinate not included would be "lastrow+1", which is simply adding +1
to the sub-pixel "max" value that was given.
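That sub-pixel-to-pixel conversion is just a ceiling division; a small sketch (the 3-bit sub-pixel count is an assumption for illustration, Pisces may use different constants):

```java
public class SubpixelBounds {
    static final int SUBPIXEL_LG_POSITIONS_X = 3; // 8 sub-pixels per pixel (assumed)
    static final int SUBPIXEL_MASK_X = (1 << SUBPIXEL_LG_POSITIONS_X) - 1;

    // Pixel containing a sub-pixel coordinate (used for the min bound):
    static int pixelFloor(int sp) {
        return sp >> SUBPIXEL_LG_POSITIONS_X;
    }

    // Exclusive pixel bound for an exclusive sub-pixel bound: adding the
    // mask reaches "lastrow.9999", so the shift lands on lastPixel + 1.
    static int pixelCeil(int sp) {
        return (sp + SUBPIXEL_MASK_X) >> SUBPIXEL_LG_POSITIONS_X;
    }

    public static void main(String[] args) {
        System.out.println(pixelCeil(17)); // sub-pixels [0,17) touch pixel 2 -> bound 3
        System.out.println(pixelCeil(16)); // sub-pixels [0,16) stop at pixel 1 -> bound 2
    }
}
```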
> I fixed these limits to have correct rendering due to dirty rowAARLE reuse.
If there was a bug before I'd prefer to have it fixed within the
philosophy of half-open intervals rather than trying to translate into
closed intervals - it keeps all of the code on the same page.
> 2013/4/24 Jim Graham <james.graham at oracle.com>
> Do you have any faster implementation for Math.ceil/floor ? I now use
> casting 1 + (int) / (int) but I know it is only correct for positive
> numbers (> 0).
ceil(integer) == integer, but your technique would return "integer+1".
I don't remember a decent technique for doing ceil with a cast.
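For what it's worth, a cast-based ceil that also handles exact integers could look like this sketch (valid only for values that fit in an int):

```java
public class FastCeil {
    // Unlike "1 + (int) v", this returns v unchanged when v is already
    // integral; since the cast truncates toward zero, the branch also
    // makes it correct for negative values in int range.
    static int ceil(float v) {
        int i = (int) v;
        return (v > i) ? i + 1 : i;
    }

    public static void main(String[] args) {
        System.out.println(ceil(3.0f));  // 3 (not 4)
        System.out.println(ceil(3.2f));  // 4
        System.out.println(ceil(-2.5f)); // -2
    }
}
```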
> - clipping issues:
> for very large dashed rectangles, millions of segments are emitted from
> the dasher (x1) to the stroker (x4 segments) to the renderer (x4 segments).
>
> However, I would like to add a minimal clipping algorithm but the
> rendering pipeline seems to be handling affine transforms between stages:
> /*
> * Pipeline seems to be:
> * shape.getPathIterator
> * -> inverseDeltaTransformConsumer
> * -> Dasher
> * -> Stroker
> * -> deltaTransformConsumer OR transformConsumer
> * -> pc2d = Renderer (bounding box)
> */
If we could teach the stroker to be able to apply an elliptical pen then
we could do the transforms once up front and simplify that quite a lot.
That optimization remains TBD. We currently at least detect uniform
scales and those can be handled with a single transform up front and
multiplying all dash segment values and the line width by the uniform
scale factor. But, anything that transforms a circle into an ellipse
requires us to use the multi-stage silliness above.
> It is complicated to perform clipping in the Renderer, and it may be too
> late, as a complete stroker segment must be considered as a whole (4
> segments without caps ...).
>
> Could you give me advices how to hack / add clipping algorithm in the
> rendering pipeline (dasher / stroker / renderer) ?
> Should I try to derive a bounding box for the dasher / stroker depending
> on the Affine transforms ?
Yes, that sounds right so far, but it's pretty high-level and I'm not
sure where your level of understanding here falls short...? I guess
what I would do is:
- compute an expanded "clip box" for the stroker based on the original
clip, the transform, the line width, and the miter limit. Note that the
transform applied to the stroke coordinates varies depending on whether
we use the "uniform scaling" optimization or not
- test coordinates for being in the expanded clip box in the
PathConsumer2D methods of Stroker. If the segment is rejected, be sure
to update all of the internal computations (normal values, last moveto
coords, etc.) that might be needed for stroking the following segment
- unfortunately this means that the dasher doesn't get the advantages.
You could add similar code to the dasher, but then you have to be sure an
entire dash is omitted before you can omit the data going downstream;
otherwise the stroker will not get proper incoming coordinates from "the
previous segment that was out of bounds". That could be managed with
proper communication between the dasher and the stroker, but then it gets
more complicated.
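A rough sketch of the first bullet; the margin formula is my assumption for the worst-case miter spill, not code from the pipeline, and it assumes the coordinates reaching the stroker are in the same space as the clip:

```java
import java.awt.geom.Rectangle2D;

public class StrokeClipBounds {
    // Expand the clip by the worst-case distance stroked geometry can
    // extend beyond its path: half the line width, scaled by the miter
    // limit to cover long miter joins (assumed margin formula).
    static Rectangle2D.Float expand(Rectangle2D clip, float lineWidth, float miterLimit) {
        float margin = 0.5f * lineWidth * Math.max(1f, miterLimit);
        return new Rectangle2D.Float(
                (float) clip.getX() - margin,
                (float) clip.getY() - margin,
                (float) clip.getWidth() + 2f * margin,
                (float) clip.getHeight() + 2f * margin);
    }

    public static void main(String[] args) {
        // A 100x100 clip, 10-pixel-wide stroke, miter limit 4:
        Rectangle2D r = expand(new Rectangle2D.Float(0f, 0f, 100f, 100f), 10f, 4f);
        System.out.println(r); // expanded by 20 on every side
    }
}
```

Segments wholly outside this box cannot touch the visible result even after joins and caps are added, so they are safe to reject in the Stroker's PathConsumer2D methods (provided the internal state is still updated, as noted above).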
Those are the caveats that I would keep in mind while I was designing a
solution - hope it helps some...
...jim