[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements
Laurent Bourgès
bourges.laurent at gmail.com
Thu May 2 08:48:25 UTC 2013
Jim,
thanks for the long explanation; I agree with your approach of using semi-open
intervals.
I have the feeling that Pisces does not implement it very accurately.
My comments are inline:
I have a few questions regarding the renderer's edge handling and the float ceil calls:
>>
>
> I have a personal philosophy for all rendering engines that I've produced,
> but I'm not sure the philosophy was adopted for this code. I like to use
> the same rounding for every operation and have all ranges be specified as
> semi-open regions - x1 <= x < x2 - for example.
>
> The "insideness" rule is most naturally expressed by semi-open intervals
> since pixel centers that fall on a vertical boundary are inside if the
> interior region is to their right (i.e. the x1 <= x half of that equation
> above) and they are outside if the space to their right is non-interior
> (i.e. the x < x2 half of that equation). A similar rule is also used
> vertically.
>
> The rounding operation which computes this would be:
>
> xc = computed crossing value for the current scan line;
> ceil(xc - 0.5);
>
Agreed; however, the Pisces code never performs such rounding. Note that
ceil(x - 0.5) is not quite (int) Math.round(x): Math.round computes
floor(x + 0.5), which differs exactly at pixel centers (x = k + 0.5), where
ceil(x - 0.5) gives k but Math.round gives k + 1.
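A quick demo of that distinction (my own snippet, not Pisces code): the rule ceil(xc - 0.5) and Math.round(xc) agree everywhere except at exact pixel centers, which is precisely the case the insideness rule cares about.

```java
// Demo (hypothetical helper, not from the Pisces source): Jim's rounding
// rule ceil(xc - 0.5) vs. Math.round(xc). Math.round = floor(xc + 0.5)
// rounds a crossing at an exact pixel center (xc = k + 0.5) up to k + 1,
// while the half-open rule keeps it at k.
class RoundingDemo {
    static int firstIncludedPixel(float xc) {
        return (int) Math.ceil(xc - 0.5f); // half-open "insideness" rounding
    }
}
```

For xc = 3.5f the helper yields 3, whereas Math.round(3.5f) yields 4.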
>
> This maps (for some integer k):
>
> k+0.4999 to k
> k+0.5000 to k
> k+0.5001 to k+1
>
> Which is exactly what you want for the beginning of a region. If the
> crossing is exactly at the pixel center then that is the furthest right
> that you can cross the scan line and still include pixel k. If you just
> miss the center of a pixel to the right, then the first pixel to be
> included is k+1. That function computes it correctly.
>
> Similarly, if you apply the same formula to the right edge of a region
> then you compute "the first pixel to not be included in the region".
> Consider that if you miss a pixel center on that right edge by falling
> slightly to the left of it, then that pixel is "outside" and it is the
> first such pixel to be "outside". If you hit its pixel center exactly,
> then the insideness rule also says that it is "outside". Only when you
> just miss it by falling slightly to the right of its center would that
> pixel still be included. The above formula computes that value if you take
> its answer to be the "first pixel not included in the region".
>
> This lets you perform the same rounding on every value and treat every
> pair of "inside trigger" to "outside trigger" as a half-open interval. Note
> that otherwise you'd have to compute every floating point crossing, then
> run the EvenOdd vs. ZeroWinding rule on them, and finally only do the
> rounding after you knew which were left edges and which were right (which
> you don't know until you've computed every crossing on the scan line).
> Since this rounding rule is symmetric given half-open intervals then you
> can round everything before you've analyzed the results of the winding
> rules.
>
> I then extend that philosophy to every set of coordinates I deal with, be
> it horizontal ranges, vertical ranges, shape rasterized results, or clip
> results, and the math usually works out in the simplest and most elegant
> way possible.
>
> For an AA renderer, it isn't clear that the sub-pixel computations should
> be as sub-sub-pixely accurate, but I'd like to think that the philosophy
> should be kept down to the lowest layers if we can, especially as this code
> can be tuned down to a single sub-pixel per real pixel and then it
> definitely should be as accurate as it can with the insideness rule (even
> though I doubt it will ever be set up that way).
>
> Another consideration is the vertices on an enclosing polygon. You
> compute one segment to where it ends at xe1,ye1 and start computing the
> next segment from where it starts at xs2,ys2, but for adjacent segments in
> an outline, they start and end on a shared common point so xe1==xs2 and
> ye1==ys2. If you compute crossings for both segments at that shared point
> then you have a chance of recording the crossings twice for that tiny chunk
> of the outline. You could special case the last point in a segment, but
> that gets into lots of tricky tests. On the other hand, if all intervals
> everywhere are half-open then anything you compute for xe1,ye1 would
> represent "the first line/pixel that segment 1 does not cause to be
> included" and computing the same values for xs2,ys2 would naturally
> generate "the first line/pixel that segment 2 causes to be included" and
> only one of them would induce inclusion on that scan line or pixel. This
> gets even more valuable when you consider all cases of segment 1 going down
> to xe1,ye1 and segment 2 going back up from that point, or up/down or up/up
> or down/down - and also left/left or left/right or right/right or
> right/left. All possible cases of how one segment arrives at xe1,ye1 and
> the next segment leaves from that same point just naturally fall out from
> using symmetric rounding and half-open intervals.
>
> I wanted to get that out to see if we were on the same page or if there
> was another way of doing the math that you are familiar with that might
> also make things easier.
>
Perfect and very clear. I will then check the Pisces rounding on both pixel
coordinates (px, py) and AA (sub-pixel) coordinates (spx, spy).
> I'm pretty sure that most of the code in the current version of Pisces
> uses that same philosophy, but I seem to recall that there were a couple of
> places that Denis was more comfortable with fully closed intervals. I don't
> remember the details, but be on the watch for it. Also, I don't recall if
> it used a half-sub-pixel bias anywhere or simply punted on it as not being
> visible enough to worry about strict "center of sub-pixel" comparisons.
>
I am perplexed, and I am going to check the Pisces code against the approach
you describe.
> Now on to the specific questions:
>
>
> private void addLine(float x1, float y1, float x2, float y2) {
>> ...
>> // LBO: why ceil ie upper integer ?
>> final int firstCrossing = Math.max(ceil(y1), boundsMinY); //
>> upper integer
>> final int lastCrossing = Math.min(ceil(y2), boundsMaxY); //
>> upper integer
>>
>
> Perhaps lastCrossing is misnamed. I think it represents the end of the
> interval which is half-open. Does that match the rest of the operations
> done on it?
>
> If every coordinate has already been biased by the -0.5 then ceil is just
> the tail end of the rounding equation I gave above.
>
That's not the case, so this looks buggy: x1, y1 and x2, y2 are the raw point
coordinates as float values, with no -0.5 bias applied.
FYI, firstCrossing is used to compute the first X intersection for that Y
coordinate and determine the bucketIdx:
_edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
...
final int bucketIdx = firstCrossing - boundsMinY;
...
_edges[ptr+NEXT] = edgeBuckets[bucketIdx];
lastCrossing is used to set the YMax edge coordinate and edgeBucketCounts:
_edges[ptr+YMAX] = lastCrossing;
...
edgeBucketCounts[lastCrossing - boundsMinY] |= 1;
I think such rounding errors can deform the rasterized shape by a pixel...?
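To make the suspected off-by-one concrete, here is a small sketch (method names are mine, not from the Pisces source) comparing the plain ceil that addLine uses today with the pixel-center rule described above:

```java
// Sketch comparing addLine's current rounding with the half-open
// pixel-center rule; hypothetical names, not the Pisces source.
class CrossingDemo {
    // what addLine does today: no -0.5 bias before ceil
    static int firstCrossingPlainCeil(float y1) {
        return (int) Math.ceil(y1);
    }

    // what the half-open philosophy prescribes
    static int firstCrossingCentered(float y1) {
        return (int) Math.ceil(y1 - 0.5f);
    }
}
```

For y1 = 3.2f the plain ceil yields 4 while the centered rule yields 3, i.e. the segment's first scanline shifts by one.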
> If not, then the half-open considerations still apply if you use the same
> rounding for both the top and the bottom of the segment. If lastCrossing
> represents the "bottom of the shape" then it should not be included. If
> the next segment continues on from it, then its y1 will be computed in the
> first equation and will represent the first scanline that the next segment
> participates in.
>
>
> => firstCrossing / lastCrossing should use lower and upper integer
>> values (rounding issues) ?
>>
>
> Do they make sense now given my explanation above? Perhaps they should be
> renamed to indicate their half-openness?
I am a bit hesitant to verify the maths performed in ScanLineIterator.next(),
which uses the edges and edgeBucketCounts arrays ... could you have a look?
Apparently it uses the following for loop, which respects the semi-open
interval philosophy:
for (int i = 0, ecur, j, k; i < count; i++) {
...
However, I am not sure if the edges "linked list" is traversed correctly ...
boolean endRendering() {
>> // TODO: perform shape clipping to avoid dealing with segments
>> out of bounding box
>>
>> // Ensure shape edges are within bbox:
>> if (edgeMinX > edgeMaxX || edgeMaxX < 0f) {
>> return false; // undefined X bounds or negative Xmax
>> }
>> if (edgeMinY > edgeMaxY || edgeMaxY < 0f) {
>> return false; // undefined Y bounds or negative Ymax
>> }
>>
>
> I'd use min >= max since if min==max then I think nothing gets generated
> as a result of all edges having both the in and out crossings on the same
> coordinate. Also, why not test against the clip bounds instead? The code
> after that will clip the edgeMinMaxXY values against the boundsMinMax
> values. If you do this kind of test after that clipping is done (on
> spminmaxxy) then you can discover if all of the coordinates are outside the
> clip or the region of interest.
I tried here to perform a few "fast" checks before doing the float-to-int
conversions (costly because millions of them are performed). I think it can
still be improved: edgeMinX > edgeMaxX only ensures that edgeMinX is defined,
and edgeMaxX < 0f only rejects a negative Xmax!
TODO: improve the boundary checks here.
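As a sketch of what those checks rely on (hypothetical helper; it assumes the edge bounds start as an "empty" interval of +Infinity / -Infinity, as accumulators usually do):

```java
// Hypothetical early-reject helper, not the Pisces source; assumes
// edgeMinX/edgeMaxX start at +Infinity / -Infinity and are narrowed as
// edges are accumulated.
class BoundsCheck {
    static boolean rejectX(float edgeMinX, float edgeMaxX) {
        // true when no edge was accumulated (min > max holds for the
        // empty interval) or the whole X range lies left of x = 0
        return edgeMinX > edgeMaxX || edgeMaxX < 0f;
    }
}
```

Testing against the clip bounds instead of 0, as Jim suggests, would only change the constant in the second comparison.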
> // why use upper integer for edge min values ?
>>
>
> Because an edge falling between sample points includes only the sample
> point "above" it. (Or "un-includes" it if it is a right/bottom edge.)
>
>
> => Here is the question: why use ceil instead of floor ?
>>
>> final int eMinX = ceil(edgeMinX); // upper positive int
>> if (eMinX > boundsMaxX) {
>> return false; // Xmin > bbox maxX
>> }
>>
>> final int eMinY = ceil(edgeMinY); // upper positive int
>> if (eMinY > boundsMaxY) {
>> return false; // Ymin > bbox maxY
>> }
>>
>> int spminX = Math.max(eMinX, boundsMinX);
>> int spmaxX = Math.min(ceil(edgeMaxX), boundsMaxX); // upper
>> positive int
>> int spminY = Math.max(eMinY, boundsMinY);
>> int spmaxY = Math.min(ceil(edgeMaxY), boundsMaxY); // upper
>> positive int
>>
>> int pminX = spminX >> SUBPIXEL_LG_POSITIONS_X;
>> int pmaxX = (spmaxX + SUBPIXEL_MASK_X) >>
>> SUBPIXEL_LG_POSITIONS_X;
>> int pminY = spminY >> SUBPIXEL_LG_POSITIONS_Y;
>> int pmaxY = (spmaxY + SUBPIXEL_MASK_Y) >>
>> SUBPIXEL_LG_POSITIONS_Y;
>>
>> // store BBox to answer ptg.getBBox():
>> this.cache.init(pminX, pminY, pmaxX, pmaxY);
>>
>> In PiscesCache:
>> void init(int minx, int miny, int maxx, int maxy) {
>> ...
>> // LBO: why add +1 ??
>> bboxX1 = maxx /* + 1 */;
>> bboxY1 = maxy /* + 1 */;
>>
>
> I think the values passed in are "the smallest and largest values we
> encountered" and so adding one computes the first coordinate that is "not
> inside the sample region" as per the half-open philosophy.
>
> I tend to use and encourage "min/max" naming for closed-interval values
> and xy0/xy1 or xy1/xy2 for half-open-interval values.
>
>
> => piscesCache previously added +1 to maximum (upper loop condition y <
>> y1)
>> but the pmaxX already uses ceil (+1) and (spmaxY + SUBPIXEL_MASK_Y)
>> ensures the last pixel is reached.
>>
>
> spmaxY + SUBPIXEL_MASK_Y computes the last sub-pixel piece of the last
> pixel. It is essentially equivalent to "lastrow.9999999". So, the first
> coordinate not included would be "lastrow+1" which is simply adding +1 to
> the sub-pixel "max" value that was given.
I understand, but I noticed that pmaxX / pmaxY (pixel coordinates) are already
rounded up to the next pixel:
- spmaxY >> SUBPIXEL_LG_POSITIONS_Y => 6
- (spmaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y => 7
So adding +1 again in PiscesCache would give maxY = 8!
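With concrete numbers (the constants below assume 8 sub-pixel rows per pixel, which I believe is the Pisces default; spmaxY = 54 is an example value of mine):

```java
// Concrete illustration of the double "+1"; constants assume 8 sub-pixel
// rows per pixel (SUBPIXEL_LG_POSITIONS_Y = 3, mask = 7).
class SubpixelDemo {
    static final int SUBPIXEL_LG_POSITIONS_Y = 3;
    static final int SUBPIXEL_MASK_Y = 7;

    static int pixelMaxY(int spmaxY) {
        // already rounds up: covers the last partially-filled pixel row
        return (spmaxY + SUBPIXEL_MASK_Y) >> SUBPIXEL_LG_POSITIONS_Y;
    }
}
```

pixelMaxY(54) is 7 while the plain shift 54 >> 3 is 6; adding +1 once more in PiscesCache would therefore produce 8.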
> I fixed these limits to have correct rendering due to dirty rowAARLE
>> reuse.
>>
>
> If there was a bug before I'd prefer to have it fixed within the
> philosophy of half-open intervals rather than trying to translate into
> closed intervals - it keeps all of the code on the same page.
>
I think it was not a real bug, as the new rowAARLE[][] array was big enough and
zero-filled, so it only led to larger boundaries and tiles.
>
> 2013/4/24 Jim Graham <james.graham at oracle.com <mailto:
>> james.graham at oracle.com>>
>>
>> Do you have any faster implementation for Math.ceil/floor ? I now use
>> casting 1 + (int) / (int) but I know it is only correct for positive
>> numbers (> 0).
>>
>
> ceil(integer) == integer, but your technique would return "integer+1". I
> don't remember a decent technique for doing ceil with a cast.
I disagree:
- ceil(float) => the smallest (closest to negative infinity) floating-point
value that is greater than or equal to the argument and is equal to a
mathematical integer.
- floor(float) => the largest (closest to positive infinity) floating-point
value that is less than or equal to the argument and is equal to a
mathematical integer.
i.e. floor(x) <= x <= ceil(x), with equality when x is a mathematical integer.
Here are numbers comparing the Math.ceil / floor operations and my casting
implementation (only valid for >0):
floats        = [-134758.4, -131.5, -17.2, -1.9, -0.9, -1.0E-8, -1.0E-23, -0.0, 0.0, 134758.4, 131.5, 17.2, 1.9, 0.9, 1.0E-8, 1.0E-23]
mathCeil      = [-134758, -131, -17, -1, 0, 0, 0, 0, 0, 134759, 132, 18, 2, 1, 1, 1]
fastMathCeil  = [-134757, -130, -16, 0, 1, 1, 1, 1, 1, 134759, 132, 18, 2, 1, 1, 1]
mathFloor     = [-134759, -132, -18, -2, -1, -1, -1, 0, 0, 134758, 131, 17, 1, 0, 0, 0]
fastMathFloor = [-134758, -131, -17, -1, 0, 0, 0, 0, 0, 134758, 131, 17, 1, 0, 0, 0]
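A single branch on the truncated value fixes the negative cases while keeping the cast cheap. This is my own sketch, not the version currently in the patch (which is only valid for x > 0), and it assumes |x| fits in an int:

```java
// Cast-based ceil/floor that also handle negative inputs; a sketch, only
// valid while |x| fits in an int.
class FastMath {
    static int fastCeil(float x) {
        int i = (int) x;            // truncates toward zero
        return (x > i) ? i + 1 : i; // bump up only if x had a positive fraction
    }

    static int fastFloor(float x) {
        int i = (int) x;            // truncates toward zero
        return (x < i) ? i - 1 : i; // bump down only for negative fractions
    }
}
```

For example fastCeil(-1.9f) yields -1 and fastFloor(-1.9f) yields -2, matching Math.ceil / Math.floor on the table above.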
Micro benchmark results:
# FastMathCeil: run duration: 10 000 ms
    4 threads, Tavg = 44.95 ns/op (σ = 0.39 ns/op), Total ops = 890974137
    [44.54 (224799696), 44.76 (223695800), 45.59 (219626233), 44.93 (222852408)]
# MathCeil: run duration: 10 000 ms
    4 threads, Tavg = 125.59 ns/op (σ = 1.71 ns/op), Total ops = 318415673
    [125.81 (79406107), 128.11 (78061562), 125.22 (79862351), 123.33 (81085653)]
>
> - clipping issues:
>> for a very large dashed rectangle, millions of segments are emitted from
>> the dasher (x1) to the stroker (x4 segments) to the renderer (x4 segments).
>>
>> However, I would like to add a minimal clipping algorithm but the
>> rendering pipeline seems to be handling affine transforms between stages:
>> /*
>> * Pipeline seems to be:
>> * shape.getPathIterator
>> * -> inverseDeltaTransformConsumer
>> * -> Dasher
>> * -> Stroker
>> * -> deltaTransformConsumer OR transformConsumer
>> * -> pc2d = Renderer (bounding box)
>> */
>>
>
> If we could teach the stroker to be able to apply an elliptical pen then
> we could do the transforms once up front and simplify that quite a lot.
> That optimization remains TBD. We currently at least detect uniform
> scales and those can be handled with a single transform up front and
> multiplying all dash segment values and the line width by the uniform scale
> factor. But, anything that transforms a circle into an ellipse requires us
> to use the multi-stage silliness above.
Could you give me more details about the "elliptical pen", i.e. stroking under shearing transforms?
Another idea to be discussed: round caps are quite expensive (curves broken
into lines), and an optimization could determine whether the cap is invisible
(out of the clip bounds) or visually useless, i.e. too small because the half
circle spans less than 1 pixel (true pixel or AA sub-pixel): any idea?
Maybe I should only check whether the line width is too small to render caps
in such situations...
> It is complicated to perform clipping in the Renderer and maybe too late
>> as a complete stroker segment must be considered as a whole (4
>> segments without caps ...).
>>
>> Could you give me advices how to hack / add clipping algorithm in the
>> rendering pipeline (dasher / stroker / renderer) ?
>>
>
> Should I try to derive a bounding box for the dasher / stroker depending
>> on the Affine transforms ?
>>
>
> Yes, that sounds right so far, but it's pretty high-level and I'm not sure
> where your level of understanding here falls short...? I guess what I
> would do is:
>
> - compute an expanded "clip box" for the stroker based on the original
> clip, the transform, the line width, and the miter limit. Note that the
> transform applied to the stroke coordinates varies depending on whether we
> use the "uniform scaling" optimization or not
>
> - test coordinates for being in the expanded clip box in the
> PathConsumer2D methods of Stroker. If the segment is rejected, be sure to
> update all of the internal computations (normal values, last moveto coords,
> etc.) that might be needed for stroking the following segment
>
That's the plan, but I would have to do it twice: in the dasher and in the
stroker ... maybe it should be coded once, in a new PiscesClipUtils class or in
Helpers ...
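For the first step Jim lists, something like this could be shared between the two stages (hypothetical sketch in stroker coordinate space; the pad term follows his "line width and miter limit" remark, and all names are mine):

```java
import java.awt.geom.Rectangle2D;

// Hypothetical expanded clip for the stroker: pad the user clip by the
// worst-case stroke growth, i.e. half the line width scaled by the miter
// limit (joins can extend up to miterLimit * lineWidth / 2 from the path).
class ClipExpander {
    static Rectangle2D.Float expand(Rectangle2D clip,
                                    float lineWidth, float miterLimit) {
        float pad = 0.5f * lineWidth * Math.max(miterLimit, 1f);
        return new Rectangle2D.Float(
                (float) clip.getX() - pad,
                (float) clip.getY() - pad,
                (float) clip.getWidth() + 2f * pad,
                (float) clip.getHeight() + 2f * pad);
    }
}
```

Segments entirely outside the returned rectangle could then be rejected in the PathConsumer2D methods, while still updating the internal state needed for the following segment.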
>
> - unfortunately this means that dasher doesn't get the advantages, you
> could add similar code to the dasher, but then you have to be sure an
> entire dash is omitted before you can omit the data going on, otherwise the
> stroker will not get proper incoming coordinates from "the previous segment
> that was out of bounds". That could be managed with proper communication
> between the dasher and stroker, but then it gets more complicated.
>
Agreed, but it should probably be analyzed more deeply to evaluate whether it
is too hard ...
>
> Those are the caveats that I would keep in mind while I was designing a
> solution - hope it helps some...
>
> ...jim
>
PS: I have more free time until Sunday (the children are away), so I would like
to talk to you on Skype if possible.
If yes, could you give me your Skype account and working hours (GMT time
please)?
I am living in Grenoble, France (GMT+1).
Thanks a lot,
Laurent