[OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3
Laurent Bourgès
bourges.laurent at gmail.com
Mon Aug 10 21:15:35 UTC 2015
Jim,
Here is the new webrev including your proposals to use 32.31 fixed-point
maths and double-precision in addLine() as it is definitely better:
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s3.3/
I also fixed all findbugs warnings or added a comment / TODO to few of them.
Changes:
- ArrayCache: fixed concurrency issue with stats (added incXXX methods)
- Curve: return instead of break in switch cases, added missing default
cases, renamed field Ts to ts
- Dasher:
return instead of break in switch cases, added missing default cases
* added "TODO: compare float values using epsilon:": to be discussed*
- FloatMath: CHECK_OVERFLOW and CHECK_NAN enabled, removed ceil() and
floor() methods, added powerOfTwoD(int) copied from Math
- Helpers: return instead of break in switch cases, added missing default
cases
- MarlinCache: moved few constants to MarlinConst to fix circular dependency
- MarlinProperties: new class to get marlin system properties (code from
MarlinRenderingEngine)
- MarlinConst:
use the new MarlinProperties class to fix circular dependency
moved several constants used by several classes to fix circular
dependency (SUBPIXEL_LG_POSITIONS_X, SUBPIXEL_LG_POSITIONS_Y,
SUBPIXEL_POSITIONS_X, SUBPIXEL_POSITIONS_Y, NORM_SUBPIXELS, MAX_AA_ALPHA,
TILE_SIZE_LG, TILE_SIZE)
- MarlinRenderingEngine:
Improved getNormalizingPathIterator() to return the original path
iterator if normalization is disabled that simplifies the 4 usages
Improved thin stoke handling: getMinimumAAPenSize() returns
MIN_PEN_SIZE ~ 1/8f
added missing default cases in switch cases
moved code dealing with marlin system properties in MarlinProperties
- Renderer:
removed fractToInt()
added double POWER_2_TO_32 = FloatMath.powerOfTwoD(32) used to
compute scalb(double, 32) inlined for performance
removed several constants in MarlinConst
added comments about cubic / quad inc/dec bind lengths: QUAD_COUNT_LG
removed
fixed initial count = 1 in quadBreakIntoLinesAndAdd and maxDD
(missing abs calls)
improved addLine(): added comments, use double-precision for improved
accuracy and fixed point 32.32 format
- RendererContext: use ArrayCache incXXX methods
- RendererStats: fixed concurrency warning related to the RendererStats
singleton field
- Stroker: added missing default cases in switch cases
- TransformingPathConsumer2D: fixed field names (lower case as expected by
findbugs)
- Version: increased version to 0.7.0
My answers below:
2015-07-29 5:11 GMT+02:00 Jim Graham <james.graham at oracle.com>:
>
> NormalizingIterator.NearestPixel should probably be renamed
> NearestPixelQuarter to reflect its functionality (and
> RenderCtx.nPQPathIterator).
>
done.
> getNormalizing() - why not just return src for mode OFF? That way you
> don't even have to test in the calling function, though it does add an
> extra method call in the non-normalizing case, it simplifies the code - and
> hopefully the method is simple enough to be inlined.
>
done.
>
> Renderer.java, line 45: fractToInt() is never used.
>
removed.
>
> Renderer.java, line 203: typo in comment -> 96K?
>
fixed.
>
> Renderer.java, line 379: Instead, mention that y values are already
> shifted by -0.5 in tosubpixy()?
> (The "note" comment makes it appear that that is work that is left to be
> done, but it is already done)
>
I've taken your comments in your more recent email.
>
> Renderer.java, line 461: What happens if x1_cor was already at VPC? The
> error should be 0, but you end up setting it to ERR_STEP_MAX.
>
Fixed using your 32.31 approach using double-precision maths (slope) as it
seems slightly faster ~ 1%.
"
final double x1d = x1;
final double y1d = y1;
final double slope = (x2 - x1d) / (y2 - y1d);
...
final double x1_intercept = x1d + (firstcrossing - y1d) * slope;
final long x1_fixed_biased = ((long) scalb(x1_intercept, 32)) + 0x7fffffffL;
store(CURX, (int) (x1_fixed_biased >> 32));
store(ERRX, ((int) x1_fixed_biased) >>> 1);
....
final long slope_fixed = (long) scalb(slope, 32);
store(BUMPX, (int) (slope_fixed >> 32));
store(BUMPXERR, ((int) slope_fixed) >>> 1);
"
> Renderer.java, line 792: Do you need to copy ERR_STEP_MAX into a local?
> It is a statically initialized final int - that should already be a
> constant that gets copied into the code.
>
I always copy constants used in loops to avoid repeated constant lookups.
> Renderer.java, line 1350-ish: mention in here somewhere that edgeMinMaxY
> values have already been adjusted by -0.5 in tosubpixy(), but edgeMinMaxX
> values have not yet been adjusted? (It took me a moment to realize that.)
>
done.
>
> CeilAndFloorTests - whoa, hex floating point literals? That's a new one
> for me... ;)
>
> I'm guessing that p22 has 1 bit of fractional mantissa and p23 has no
> fractional mantissa digits?
>
Yes; I got it from an existing test :
/jdk/test/java/lang/Math/CeilAndFloorTests.java
but adapted to float constants.
Sorry for my late reply, but it took me some time to make all changes,
test, benchmark and tune the code.
Cheers,
Laurent
PS: Here is the latest Marlin results (OpenJDK9 but not up-to-date) that I
hope we will confirm soon with J2DBench MT:
Test Threads Ops Med
Pct95 Avg StdDev Min Max TotalOps [ms/op]
dc_boulder_2013-13-30-06-13-17.ser 1 115 90.712
91.734 90.871 0.426 90.253 92.586 115
dc_boulder_2013-13-30-06-13-17.ser 2 230 91.540
91.736 91.544 0.131 91.211 92.278 230
dc_boulder_2013-13-30-06-13-17.ser 4 460 92.652
92.807 92.647 0.108 92.234 93.074 460
dc_boulder_2013-13-30-06-13-20.ser 1 217 47.605
48.383 47.726 0.313 47.394 48.468 217
dc_boulder_2013-13-30-06-13-20.ser 2 434 49.083
49.813 48.928 0.497 48.080 50.125 434
dc_boulder_2013-13-30-06-13-20.ser 4 868 49.726
50.255 49.631 0.524 48.741 55.242 868
dc_shp_alllayers_2013-00-30-07-00-43.ser 1 251 39.892
40.081 39.879 0.131 39.543 40.186 251
dc_shp_alllayers_2013-00-30-07-00-43.ser 2 502 40.873
41.588 40.899 0.604 39.935 41.773 502
dc_shp_alllayers_2013-00-30-07-00-43.ser 4 1004 41.578
42.974 41.732 0.792 40.440 43.934 1004
dc_shp_alllayers_2013-00-30-07-00-47.ser 1 25 776.726
776.985 776.729 0.177 776.365 777.169 25
dc_shp_alllayers_2013-00-30-07-00-47.ser 2 50 784.572
785.425 784.423 0.836 781.791 785.673 50
dc_shp_alllayers_2013-00-30-07-00-47.ser 4 100 784.838
789.145 785.661 2.856 783.250 799.128 100
dc_spearfish_2013-11-30-06-11-15.ser 1 819 12.785
12.867 12.799 0.044 12.756 13.272 819
dc_spearfish_2013-11-30-06-11-15.ser 2 1638 12.868
12.950 12.866 0.054 12.805 13.323 1638
dc_spearfish_2013-11-30-06-11-15.ser 4 3276 12.915
13.008 12.921 0.065 12.855 13.600 3276
dc_spearfish_2013-11-30-06-11-19.ser 1 1593 6.574
6.826 6.613 0.088 6.557 7.008 1593
dc_spearfish_2013-11-30-06-11-19.ser 2 3186 6.619
6.856 6.631 0.086 6.545 7.021 3186
dc_spearfish_2013-11-30-06-11-19.ser 4 6372 6.629
6.881 6.657 0.094 6.573 8.548 6372
dc_topp:states_2013-11-30-06-11-06.ser 1 845 12.418
12.483 12.426 0.029 12.329 12.600 845
dc_topp:states_2013-11-30-06-11-06.ser 2 1690 12.464
12.504 12.461 0.027 12.343 12.615 1690
dc_topp:states_2013-11-30-06-11-06.ser 4 3380 12.460
12.547 12.465 0.084 12.346 16.590 3380
dc_topp:states_2013-11-30-06-11-07.ser 1 1382 7.574
7.636 7.578 0.025 7.484 7.963 1382
dc_topp:states_2013-11-30-06-11-07.ser 2 2764 7.592
7.620 7.591 0.026 7.447 7.725 2764
dc_topp:states_2013-11-30-06-11-07.ser 4 5528 7.601
7.635 7.601 0.029 7.475 8.238 5528
test_z_625k.ser 1 63 163.317
163.761 163.372 0.198 163.149 164.151 63
test_z_625k.ser 2 126 165.225
165.391 165.241 0.092 165.028 165.704 126
test_z_625k.ser 4 252 166.399
166.602 166.417 0.140 166.076 167.473 252
Scores:
Tests 27 9 9 9
Threads 4 1 2 4
Pct95 130.241 128.973 130.432 131.317
Performance and scalability is very good and always faster than ductus:
Test Threads Ops Med
Pct95 Avg StdDev Min Max TotalOps [ms/op]
dc_boulder_2013-13-30-06-13-17.ser 1 93 111.932
112.526 111.971 0.315 111.268 112.869 93
dc_boulder_2013-13-30-06-13-17.ser 2 186 163.722
167.732 164.146 2.335 149.394 181.485 186
dc_boulder_2013-13-30-06-13-17.ser 4 372 306.995
319.968 311.124 29.269 244.478 547.289 372
dc_boulder_2013-13-30-06-13-20.ser 1 188 56.128
56.771 56.194 0.294 55.684 57.111 188
dc_boulder_2013-13-30-06-13-20.ser 2 376 94.648
95.344 94.677 0.777 93.045 103.992 376
dc_boulder_2013-13-30-06-13-20.ser 4 752 181.442
187.893 182.491 13.768 115.947 392.881 752
dc_shp_alllayers_2013-00-30-07-00-43.ser 1 219 48.030
48.336 48.003 0.220 47.441 48.697 219
dc_shp_alllayers_2013-00-30-07-00-43.ser 2 438 72.272
74.389 72.496 1.023 70.410 78.780 438
dc_shp_alllayers_2013-00-30-07-00-43.ser 4 876 136.748
144.031 136.785 4.872 120.915 169.034 876
dc_shp_alllayers_2013-00-30-07-00-47.ser 1 25 1032.341
1041.777 1032.706 6.047 1019.945 1044.672 25
dc_shp_alllayers_2013-00-30-07-00-47.ser 2 50 1552.967
1596.590 1550.010 31.675 1484.548 1604.336 50
dc_shp_alllayers_2013-00-30-07-00-47.ser 4 100 3714.267
4140.517 3765.731 239.272 2611.855 4231.175 100
dc_spearfish_2013-11-30-06-11-15.ser 1 617 16.671
17.103 16.691 0.210 16.351 17.269 617
dc_spearfish_2013-11-30-06-11-15.ser 2 1234 24.840
25.503 24.862 1.226 23.007 42.378 1234
dc_spearfish_2013-11-30-06-11-15.ser 4 2468 47.588
50.606 49.165 11.989 16.473 170.363 2468
dc_spearfish_2013-11-30-06-11-19.ser 1 1388 7.685
7.876 7.685 0.132 7.448 7.975 1388
dc_spearfish_2013-11-30-06-11-19.ser 2 2776 11.604
11.970 11.587 0.660 10.692 27.675 2776
dc_spearfish_2013-11-30-06-11-19.ser 4 5552 21.594
24.148 21.926 2.862 7.449 78.296 5552
dc_topp:states_2013-11-30-06-11-06.ser 1 625 16.883
16.969 16.888 0.042 16.733 17.023 625
dc_topp:states_2013-11-30-06-11-06.ser 2 1250 21.450
21.939 21.451 0.457 16.804 28.310 1250
dc_topp:states_2013-11-30-06-11-06.ser 4 2500 36.366
38.736 36.475 2.401 20.357 82.360 2500
dc_topp:states_2013-11-30-06-11-07.ser 1 942 11.117
11.186 11.113 0.057 10.947 11.626 942
dc_topp:states_2013-11-30-06-11-07.ser 2 1884 13.785
14.154 13.807 0.565 11.128 29.540 1884
dc_topp:states_2013-11-30-06-11-07.ser 4 3768 21.545
22.764 21.691 3.340 11.165 121.082 3768
test_z_625k.ser 1 50 211.271
214.208 211.679 1.548 208.421 215.475 50
test_z_625k.ser 2 100 345.324
356.823 347.145 6.430 335.767 369.519 100
test_z_625k.ser 4 200 680.916
948.729 711.352 83.520 616.722 978.978 200
Scores:
Tests 27 9 9 9
Threads 4 1 2 4
Pct95 361.800 169.639 262.716 653.043
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/graphics-rasterizer-dev/attachments/20150810/7610fa26/attachment-0001.html>
More information about the graphics-rasterizer-dev
mailing list