[OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3
Laurent Bourgès
bourges.laurent at gmail.com
Mon Jul 6 10:28:16 UTC 2015
Jim,
I ran the tests we discussed, which meant modifying the addLine() and
endRendering() methods:
1/ use proper and consistent ceil(coord - 0.5) rounding, as you did in
openpisces (FX):
Renderer.USE_CORRECT_RND=true
The output images differ from the Pisces ones but are now closer to the
Ductus ones, i.e. more accurate.
Of course, it is slower, by up to 15% on the most complex map:
REF:
dc_boulder_2013-13-30-06-13-17.ser        1   93  112.996  113.297  112.805  0.507  111.459  113.508   93
dc_shp_alllayers_2013-00-30-07-00-43.ser  1  246   42.791   43.483   42.926  0.283   42.648   43.816  246
dc_shp_alllayers_2013-00-30-07-00-47.ser  1   25  762.882  764.781  763.136  0.715  762.219  765.110   25
test_z_625k.ser                           1   61  168.745  169.238  168.780  0.216  168.423  169.417   61
PROPER_ROUND:
dc_boulder_2013-13-30-06-13-17.ser        1   90  115.722  116.187  115.756  0.196  115.497  116.691   90
dc_shp_alllayers_2013-00-30-07-00-43.ser  1  230   45.598   45.816   45.620  0.105   45.467   46.182  230
dc_shp_alllayers_2013-00-30-07-00-47.ser  1   25  877.221  878.020  877.272  0.558  876.277  878.890   25
test_z_625k.ser                           1   60  173.377  173.729  173.411  0.191  173.108  174.188   60
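To make the rounding difference concrete, here is a minimal sketch (not taken from Marlin) that compares the three conversions discussed here: the plain cast, a plain ceil, and ceil(x - 0.5). The class and method names are hypothetical, and Math.ceil stands in for Marlin's FloatMath.ceil. ceil(x - 0.5) returns the index of the first pixel whose center (i + 0.5) lies at or after x, which is why it matches a pixel-center sampling rule.

```java
final class PixelCenterRounding {

    // "float + cast": truncates toward zero
    static int byCast(final float x) {
        return (int) x;
    }

    // plain ceiling: first integer >= x
    static int byCeil(final float x) {
        return (int) Math.ceil(x);
    }

    // ceil(x - 0.5): first pixel index i such that (i + 0.5) >= x
    static int byCeilHalf(final float x) {
        return (int) Math.ceil(x - 0.5f);
    }

    public static void main(String[] args) {
        final float[] samples = {0.25f, 0.5f, 0.75f, 1.5f, 2.49f, 2.51f};
        for (final float x : samples) {
            System.out.println(x + ": cast=" + byCast(x)
                    + " ceil=" + byCeil(x)
                    + " ceilHalf=" + byCeilHalf(x));
        }
    }
}
```

For example, 0.75 maps to 0 with the cast but to 1 with ceil(x - 0.5), because the center of pixel 0 (at 0.5) is already behind the coordinate.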
2/ use a fixed-point approach (more work) so that the Marlin rendering
loop (crossings) only uses integer math:
Renderer.USE_CORRECT_RND=true and Renderer.USE_FP=true
I simply ported ShapeSpanIterator.c (bumpx, bumperr, error), as you can
see in the patch below.
It works well and the output images are close to the Ductus ones too (I
hope they are identical to those of the previous test).
=> faster (no float to int conversions?)
It is faster than the previous test (float + proper rounding) but not yet
faster than the current Marlin (float + cast): ~ 5% slower at most.
USE_FP:
dc_boulder_2013-13-30-06-13-17.ser        1   89  117.544  117.900  117.564  0.173  117.306  118.287   89
dc_shp_alllayers_2013-00-30-07-00-43.ser  1  231   45.338   45.502   45.347  0.126   45.155   46.458  231
dc_shp_alllayers_2013-00-30-07-00-47.ser  1   25  808.432  809.665  808.602  0.723  807.456  810.553   25
test_z_625k.ser                           1   61  170.808  171.272  170.886  0.231  170.566  171.789   61
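The bumpx / bumperr / error scheme can be sketched in isolation as follows. This is a simplified illustration, not the Marlin code: the class and method names are hypothetical, Math.ceil and Math.floor stand in for FloatMath, and the x argument is assumed to be the edge coordinate at the first sampled scanline with the -0.5 pixel-center shift already applied. The error term holds the fractional distance x - (curx - 1) scaled to 31 bits; when adding bumperr overflows into the sign bit, (err >> 31) is -1, so the integer crossing advances by one and the mask clears the carry.

```java
import java.util.Arrays;

final class FixedPointDda {

    static final int ERR_STEP_MAX = 0x7fffffff;
    static final double ERR_STEP_MAX_DBL = (double) ERR_STEP_MAX;

    // Returns the integer crossing ceil(x + i * slope) for i = 0..n-1,
    // using only integer additions inside the loop.
    static int[] crossings(final float x, final float slope, final int n) {
        // integer start and scaled fractional error, in (0, 1] * ERR_STEP_MAX
        int curx = (int) Math.ceil(x);
        int err = (int) ((x - (curx - 1)) * ERR_STEP_MAX_DBL);

        // integer and fractional parts of the per-scanline step
        final float floorSlope = (float) Math.floor(slope);
        final int bumpx = (int) floorSlope;
        final int bumperr = (int) ((slope - floorSlope) * ERR_STEP_MAX_DBL);

        final int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = curx;
            curx += bumpx;
            err += bumperr;
            curx -= (err >> 31);   // -1 when the sign bit is set: carry into curx
            err &= ERR_STEP_MAX;   // drop the carry bit
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(crossings(0.6f, 0.25f, 7)));
        System.out.println(Arrays.toString(crossings(2.0f, -0.25f, 5)));
    }
}
```

Negative slopes need no special case: floor(-0.25) is -1 and the fractional step becomes 0.75, so the carry adds one back on most scanlines.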
However, the performance gap is very small and it can be optimized
further by removing the Unsafe usage, which is no longer required:
=> the edge array will then only contain ints (int[]), so Unsafe is no
longer necessary
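As a rough idea of what an Unsafe-free edge store could look like once every field is an int, here is a hypothetical sketch. The class name, the field order, the use of -1 as an empty next pointer, and the packing of ymax and orientation as (yMax << 1) | orientation are all assumptions made for illustration; offsets are counted in ints here, whereas the patch counts them in bytes.

```java
import java.util.Arrays;

final class IntEdgeStore {

    // field offsets within one edge, in ints (hypothetical layout)
    static final int OFF_CURX = 0;
    static final int OFF_ERROR = 1;
    static final int OFF_NEXT = 2;
    static final int OFF_YMAX_OR = 3;
    static final int OFF_BUMP_X = 4;
    static final int OFF_BUMP_ERR = 5;
    static final int SIZEOF_EDGE = 6;

    // small initial capacity (2 edges) so that growth is easy to observe
    private int[] edges = new int[2 * SIZEOF_EDGE];
    private int used = 0;

    // Appends one edge and returns its index ("pointer") in the array.
    int addEdge(final int curx, final int error, final int bumpX,
                final int bumpErr, final int yMax, final int orientation) {
        final int ptr = used;
        if (edges.length < ptr + SIZEOF_EDGE) {
            // length >= SIZEOF_EDGE, so doubling always makes enough room
            edges = Arrays.copyOf(edges, edges.length << 1);
        }
        edges[ptr + OFF_CURX] = curx;
        edges[ptr + OFF_ERROR] = error;
        edges[ptr + OFF_NEXT] = -1; // assumed "no next edge" marker
        edges[ptr + OFF_YMAX_OR] = (yMax << 1) | (orientation & 0x1);
        edges[ptr + OFF_BUMP_X] = bumpX;
        edges[ptr + OFF_BUMP_ERR] = bumpErr;
        used = ptr + SIZEOF_EDGE;
        return ptr;
    }

    int get(final int ptr, final int off) {
        return edges[ptr + off];
    }

    int capacity() {
        return edges.length;
    }

    public static void main(String[] args) {
        final IntEdgeStore store = new IntEdgeStore();
        final int ptr = store.addEdge(3, 30, 1, 300, 9, 1);
        System.out.println("curx=" + store.get(ptr, OFF_CURX)
                + " orientation=" + (store.get(ptr, OFF_YMAX_OR) & 0x1));
    }
}
```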
To conclude, these tests improved the output quality (better rounding) and
the fixed-point approach is promising: it is quite fast and makes it
possible to get rid of the Unsafe usage => simpler / safer code, and the
edge array will use array caches again (like the other arrays).
I will try to go further during the week ...
Laurent
PS: Here is a (quick and dirty) patch on Renderer to *illustrate* my
changes and let you see what I did:
# This patch file was generated by NetBeans IDE
# It uses platform neutral UTF-8 encoding and \n newlines.
--- HEAD
+++ Modified In Working Tree
@@ -37,6 +37,23 @@
import sun.misc.Unsafe;
final class Renderer implements PathConsumer2D, MarlinConst {
+
+ final static boolean USE_CORRECT_RND = true;
+
+ final static boolean USE_FP = true && USE_CORRECT_RND;
+
+ /*
+#define ERRSTEP_MAX (0x7fffffff)
+#define FRACTTOJINT(f) ((jint) ((f) * (double) ERRSTEP_MAX))
+ */
+ final static int ERR_STEP_MAX = 0x7fffffff;
+ final static double ERR_STEP_MAX_DBL = (double)ERR_STEP_MAX;
+
+ static int fractToInt(final float f) {
+ return (int) (f * ERR_STEP_MAX_DBL);
+ }
+
+
// unsafe reference
final static Unsafe unsafe;
// array offset
@@ -102,9 +119,6 @@
static final int INITIAL_BUCKET_ARRAY
= INITIAL_PIXEL_DIM * SUBPIXEL_POSITIONS_Y;
- // initial edges (16 bytes) = 32K [ints/floats] = 128K
- static final int INITIAL_EDGES_CAPACITY = INITIAL_ARRAY_16K << 3;
-
public static final int WIND_EVEN_ODD = 0;
public static final int WIND_NON_ZERO = 1;
@@ -114,11 +128,17 @@
public static final int OFF_F_CURX = 0;
public static final int OFF_SLOPE = OFF_F_CURX + SIZE;
// integer values:
+ public static final int OFF_CURX = 0;
+ public static final int OFF_ERROR = OFF_CURX + SIZE;
+
public static final int OFF_NEXT = OFF_SLOPE + SIZE;
public static final int OFF_YMAX_OR = OFF_NEXT + SIZE;
+ public static final int OFF_BUMP_X = OFF_YMAX_OR + SIZE;
+ public static final int OFF_BUMP_ERR= OFF_BUMP_X + SIZE;
+
// size of one edge in bytes
- public static final int SIZEOF_EDGE_BYTES = OFF_YMAX_OR + SIZE;
+ public static final int SIZEOF_EDGE_BYTES = ((USE_FP) ? OFF_BUMP_ERR : OFF_YMAX_OR) + SIZE;
// curve break into lines
// cubic bind length (dx or dy) = 20 to decrement step
@@ -175,6 +195,7 @@
private final int[] edgePtrs_initial = new int[INITIAL_SMALL_ARRAY + 1]; // 4K
// merge sort initial arrays (large enough to satisfy most usages) (1024)
private final int[] aux_crossings_initial = new int[INITIAL_SMALL_ARRAY]; // 4K
+ // +1 to avoid recycling in Helpers.widenArray()
private final int[] aux_edgePtrs_initial = new int[INITIAL_SMALL_ARRAY + 1]; // 4K
//////////////////////////////////////////////////////////////////////////////
@@ -344,14 +365,26 @@
/* TODO: improve accuracy using correct float rounding to int
ie use ceil(x - 0.5f) */
+ float y1_cor;
+ int firstCrossing, lastCrossing;
+ if (USE_CORRECT_RND) {
+ // convert subpixel coordinates (float) into pixel positions (int)
+ // upper integer (inclusive)
+ y1_cor = y1 - 0.5f;
+ firstCrossing = Math.max(FloatMath.ceil(y1 - 0.5f), _boundsMinY);
+ // note: use boundsMaxY (last Y exclusive) to compute correct coverage
+ // upper integer (exclusive ?)
+ lastCrossing = Math.min(FloatMath.ceil(y2 - 0.5f), boundsMaxY);
+ } else {
// convert subpixel coordinates (float) into pixel positions (int)
// upper integer (inclusive)
- final int firstCrossing = Math.max(FloatMath.ceil(y1), _boundsMinY);
+ firstCrossing = Math.max(FloatMath.ceil(y1), _boundsMinY);
// note: use boundsMaxY (last Y exclusive) to compute correct coverage
// upper integer (exclusive ?)
- final int lastCrossing = Math.min(FloatMath.ceil(y2), boundsMaxY);
+ lastCrossing = Math.min(FloatMath.ceil(y2), boundsMaxY);
+ }
/* skip horizontal lines in pixel space and clip edges
out of y range [boundsMinY; boundsMaxY] */
@@ -399,6 +432,8 @@
final int edgePtr = _edges.used;
if (_edges.length < edgePtr + _SIZEOF_EDGE_BYTES) {
+ // suppose _edges.length > _SIZEOF_EDGE_BYTES
+ // so doubling size is enough to add needed bytes
// double size:
final int edgeNewSize = edgePtr << 1;
if (doStats) {
@@ -412,8 +447,54 @@
final long addr = _edges.address + edgePtr;
// float values:
+ if (USE_CORRECT_RND) {
+ if (USE_FP) {
+ // First, how far does y bump to get to next HPC?
+ // final float ystartbump = firstCrossing - y1 + 0.5f;
+ // Now, bump the float x coordinate to get X sample at that HPC.
+// x1 += (firstCrossing - y1 + 0.5f) * slope;
+ final float x1_cor = x1 - 0.5f + (firstCrossing - y1_cor) * slope;
+ // Now calculate the integer coordinate that such a span starts at.
+ // NOTE: Span inclusion is based on vertical pixel centers (VPC).
+ // istartx = (jint) ceil(x0 - 0.5f);
+// final int istartx = FloatMath.ceil(x1_cor - 0.5f);
+ int istartx = FloatMath.ceil(x1_cor);
+ _unsafe.putInt(addr, istartx);
+
+ // Finally, find out how far the x coordinate can go before next VPC.
+ // error = FRACTTOJINT(x0 - (istartx - 0.5f));
+// final int error = fractToInt(x1 - (istartx - 0.5f));
+// final int error = (int) ((x1 - (istartx - 0.5f)) * ERR_STEP_MAX_DBL);
+ istartx -= 1;
+ _unsafe.putInt(addr + OFF_ERROR,
+// (int) ((x1 - istartx + 0.5f) * ERR_STEP_MAX_DBL));
+ (int) ((x1_cor - istartx) * ERR_STEP_MAX_DBL));
+
+ // What is the lower bound of the per-scanline change in the X coord?
+ // bumpx = (jint) floor(slope);
+ final float floor_slope = FloatMath.floor(slope);
+// final int bumpx = (int)floor_slope;
+ _unsafe.putInt(addr + OFF_BUMP_X,
+ (int)floor_slope);
+
+ // What is the subpixel amount by which the bumpx is off?
+ // bumperr = FRACTTOJINT(slope - floor(slope));
+// final int bumperr = fractToInt(slope - floor_slope);
+// final int bumperr = (int) ((slope - floor_slope) * ERR_STEP_MAX_DBL);
+ _unsafe.putInt(addr + OFF_BUMP_ERR,
+ (int) ((slope - floor_slope) * ERR_STEP_MAX_DBL));
+
+ } else {
+ // x1 + (firstCrossing + 0.5f - y1) * slope;
+ _unsafe.putFloat(addr, x1 - 0.5f + (firstCrossing - y1 + 0.5f) * slope);
+ }
+ } else {
_unsafe.putFloat(addr, x1 + (firstCrossing - y1) * slope);
+ }
+
+ if (!USE_FP) {
_unsafe.putFloat(addr + OFF_SLOPE, slope);
+ }
// each bucket is a linked list. this method adds ptr to the
@@ -687,7 +768,7 @@
// clean alpha array (zero filled)
private int[] alphaLine;
// 2048 (pixelsize) pixel large
- private final int[] alphaLine_initial = new int[INITIAL_AA_ARRAY]; // 16K
+ private final int[] alphaLine_initial = new int[INITIAL_AA_ARRAY]; // 8K
private void _endRendering(final int ymin, final int ymax) {
@@ -720,6 +801,12 @@
final int _OFF_NEXT = OFF_NEXT;
final int _OFF_YMAX_OR = OFF_YMAX_OR;
+ final int _OFF_ERROR = OFF_ERROR;
+ final int _OFF_BUMP_X = OFF_BUMP_X;
+ final int _OFF_BUMP_ERR= OFF_BUMP_ERR;
+
+ final int _ERR_STEP_MAX= ERR_STEP_MAX;
+
// unsafe I/O:
final Unsafe _unsafe = unsafe;
final long addr0 = _edges.address;
@@ -754,7 +841,7 @@
int bucketcount, i, j, ecur, lowx, highx;
int cross, lastCross;
float f_curx;
- int x0, x1, tmp, sum, prev, curx, curxo, crorientation;
+ int x0, x1, tmp, sum, prev, curx, curxo, crorientation, err;
int pix_x, pix_xmaxm1, pix_xmax;
int low, high, mid, prevNumCrossings;
@@ -913,22 +1000,53 @@
// get the pointer to the edge
ecur = _edgePtrs[i];
- // random access so use unsafe:
- addr = addr0 + ecur; // ecur + OFF_F_CURX
- f_curx = _unsafe.getFloat(addr);
-
/* convert subpixel coordinates (float) into pixel
positions (int) for coming scanline */
/* note: it is faster to always update edges even
if it is removed from AEL for coming or last
scanline */
+
// random access so use unsafe:
+ addr = addr0 + ecur; // ecur + OFF_F_CURX
+
+ if (USE_FP) {
+ // get current crossing and error:
+ curx = _unsafe.getInt(addr);
+ err = _unsafe.getInt(addr + _OFF_ERROR);
+
+ // update crossing with orientation at last bit:
+ cross = (curx << 1)
+ | _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+
+ // Increment x using DDA (fixed point):
+ // x0 = seg->curx + seg->bumpx
+ curx += _unsafe.getInt(addr + _OFF_BUMP_X);
+ // err = seg->error + seg->bumperr
+ err += _unsafe.getInt(addr + _OFF_BUMP_ERR);
+ // x0 -= (err >> 31);
+// curx -= (err >> 31);
+ _unsafe.putInt(addr, curx - (err >> 31));
+
+ // err &= ERRSTEP_MAX;
+// err &= _ERR_STEP_MAX;
+ _unsafe.putInt(addr + _OFF_ERROR, err & _ERR_STEP_MAX);
+
+ } else {
+ f_curx = _unsafe.getFloat(addr);
+ // random access so use unsafe:
_unsafe.putFloat(addr,
f_curx + _unsafe.getFloat(addr + _OFF_SLOPE)); // ecur + _SLOPE
// update crossing ( x-coordinate + last bit = orientation (0 or 1)):
+ if (USE_CORRECT_RND) {
+ // ceil(curx - 0.5f) : TODO: push - 0.5 in edge
+ cross = (FloatMath.ceil(f_curx) << 1)
+ | _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+ } else {
cross = (((int) f_curx) << 1)
| _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+ }
+ }
if (doStats) {
RendererContext.stats.stat_rdr_crossings_updates
@@ -1008,22 +1126,53 @@
// get the pointer to the edge
ecur = _edgePtrs[i];
- // random access so use unsafe:
- addr = addr0 + ecur; // ecur + OFF_F_CURX
- f_curx = _unsafe.getFloat(addr);
-
/* convert subpixel coordinates (float) into pixel
positions (int) for coming scanline */
/* note: it is faster to always update edges even
if it is removed from AEL for coming or last
scanline */
+
// random access so use unsafe:
+ addr = addr0 + ecur; // ecur + OFF_F_CURX
+
+ if (USE_FP) {
+ // get current crossing and error:
+ curx = _unsafe.getInt(addr);
+ err = _unsafe.getInt(addr + _OFF_ERROR);
+
+ // update crossing with orientation at last bit:
+ cross = (curx << 1)
+ | _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+
+ // Increment x using DDA (fixed point):
+ // x0 = seg->curx + seg->bumpx
+ curx += _unsafe.getInt(addr + _OFF_BUMP_X);
+ // err = seg->error + seg->bumperr
+ err += _unsafe.getInt(addr + _OFF_BUMP_ERR);
+ // x0 -= (err >> 31);
+// curx -= (err >> 31);
+ _unsafe.putInt(addr, curx - (err >> 31));
+
+ // err &= ERRSTEP_MAX;
+// err &= _ERR_STEP_MAX;
+ _unsafe.putInt(addr + _OFF_ERROR, err & _ERR_STEP_MAX);
+
+ } else {
+ f_curx = _unsafe.getFloat(addr);
+ // random access so use unsafe:
_unsafe.putFloat(addr,
f_curx + _unsafe.getFloat(addr + _OFF_SLOPE)); // ecur + _SLOPE
// update crossing ( x-coordinate + last bit = orientation (0 or 1)):
+ if (USE_CORRECT_RND) {
+ // ceil(curx - 0.5f) : TODO: push - 0.5 in edge
+ cross = (FloatMath.ceil(f_curx) << 1)
+ | _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+ } else {
cross = (((int) f_curx) << 1)
| _unsafe.getInt(addr + _OFF_YMAX_OR) & 0x1;
+ }
+ }
if (doStats) {
RendererContext.stats.stat_rdr_crossings_updates
@@ -1250,21 +1399,34 @@
/* TODO: improve accuracy using correct float rounding to int
ie use ceil(x - 0.5f) */
+ final int _boundsMinY = boundsMinY;
+ final int _boundsMaxY = boundsMaxY;
+
// bounds as inclusive intervals
- final int spminX = Math.max(FloatMath.ceil(edgeMinX), boundsMinX);
- final int spmaxX = Math.min(FloatMath.ceil(edgeMaxX), boundsMaxX - 1);
+ int spminX, spmaxX, spminY, spmaxY;
+ int maxY;
- final int _boundsMinY = boundsMinY;
- final int _boundsMaxYm1 = boundsMaxY - 1;
+ if (USE_CORRECT_RND) {
+ spminX = Math.max(FloatMath.ceil(edgeMinX - 0.5f), boundsMinX);
+ spmaxX = Math.min(FloatMath.ceil(edgeMaxX - 0.5f), boundsMaxX - 1);
- final int spminY = Math.max(FloatMath.ceil(edgeMinY), _boundsMinY);
- final int spmaxY;
- int maxY = FloatMath.ceil(edgeMaxY);
- if (maxY <= _boundsMaxYm1) {
+ spminY = Math.max(FloatMath.ceil(edgeMinY - 0.5f), _boundsMinY);
+
+ maxY = FloatMath.ceil(edgeMaxY - 0.5f);
+ } else {
+ spminX = Math.max(FloatMath.ceil(edgeMinX), boundsMinX);
+ spmaxX = Math.min(FloatMath.ceil(edgeMaxX), boundsMaxX - 1);
+
+ spminY = Math.max(FloatMath.ceil(edgeMinY), _boundsMinY);
+
+ maxY = FloatMath.ceil(edgeMaxY);
+ }
+
+ if (maxY <= _boundsMaxY - 1) {
spmaxY = maxY;
} else {
- spmaxY = _boundsMaxYm1;
- maxY = _boundsMaxYm1 + 1;
+ spmaxY = _boundsMaxY - 1;
+ maxY = _boundsMaxY;
}
buckets_minY = spminY - _boundsMinY;
buckets_maxY = maxY - _boundsMinY;
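
The crossing value built in the rendering loop of the patch packs the integer x coordinate and the edge orientation into a single int. A minimal sketch of that encoding, with hypothetical class and method names, is shown here. Note that in Java `&` binds tighter than `|`, so `(curx << 1) | v & 0x1` is read as `(curx << 1) | (v & 0x1)`; the sketch adds the parentheses explicitly.

```java
final class CrossingPacking {

    // x coordinate in the upper 31 bits, orientation in the last bit
    static int pack(final int curx, final int ymaxOr) {
        return (curx << 1) | (ymaxOr & 0x1);
    }

    // arithmetic shift keeps the sign of negative x coordinates
    static int x(final int cross) {
        return cross >> 1;
    }

    static int orientation(final int cross) {
        return cross & 0x1;
    }

    public static void main(String[] args) {
        final int cross = pack(5, 7);
        System.out.println("cross=" + cross
                + " x=" + x(cross)
                + " orientation=" + orientation(cross));
    }
}
```

Because x sits in the high bits, sorting the packed values also sorts the crossings by x, with the orientation bit acting only as a tie-breaker.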