[OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3
Laurent Bourgès
bourges.laurent at gmail.com
Fri Jul 3 20:51:47 UTC 2015
Jim,
Here is an updated webrev:
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s3.1/
Changes:
- enabled CHECK_NAN and CHECK_OVERFLOW to be correct for now
- renamed faster alternatives as int ceil_int(float) and float
floor_int(float) that are faster in the integer domain
- restored ceil_f / floor_f (float) methods that are strictly correct as
(float) StrictMath.ceil/floor(double)
- made FloatMath class and its methods public to be available for tests and
maybe more general use in graphics / java2d ...
It is still faster than the previous FloatMath, and Marlin is a bit faster too:
see the results at the end!
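To make "strictly correct as (float) StrictMath.ceil/floor(double)" concrete, here is a minimal sketch of how such an equivalence could be checked. The class CeilFloorSketch and the methods ceilF / floorF are hypothetical stand-ins written for this illustration; they are not the FloatMath code from the webrev.

```java
final class CeilFloorSketch {
    // 2^23: every float with a magnitude >= this value is already an integer.
    private static final float TWO_POW_23 = 8388608f;

    /** Hypothetical float-only ceil; NOT the webrev's FloatMath.ceil_f. */
    static float ceilF(float a) {
        // NaN fails the comparison, so NaN, infinities and large values pass through.
        if (!(Math.abs(a) < TWO_POW_23)) {
            return a;
        }
        float t = (float) ((int) a);   // truncate toward zero
        if (t < a) {
            t += 1f;                   // positive input with a fractional part
        }
        return Math.copySign(t, a);    // keeps -0.0 for inputs in [-0.0, -1.0)
    }

    /** Hypothetical float-only floor; NOT the webrev's FloatMath.floor_f. */
    static float floorF(float a) {
        if (!(Math.abs(a) < TWO_POW_23)) {
            return a;
        }
        float t = (float) ((int) a);   // truncate toward zero
        if (t > a) {
            t -= 1f;                   // negative input with a fractional part
        }
        return Math.copySign(t, a);
    }

    /** Bitwise comparison, so that -0.0 and 0.0 are told apart. */
    static boolean sameBits(float x, float y) {
        return Float.floatToIntBits(x) == Float.floatToIntBits(y);
    }

    /** Counts inputs where the sketch differs from (float) StrictMath.ceil/floor. */
    static int countMismatches(float[] values) {
        int mismatches = 0;
        for (float v : values) {
            if (!sameBits(ceilF(v), (float) StrictMath.ceil(v))) mismatches++;
            if (!sameBits(floorF(v), (float) StrictMath.floor(v))) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        float[] samples = {-131.5f, -1.9f, -0.9f, -1.0E-23f, -0.0f, 0.0f, 0.9f, 17.2f,
            2.13422758E9f, Float.NaN, Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY};
        System.out.println("mismatches = " + countMismatches(samples));
    }
}
```

Comparing bit patterns rather than using == matters here because -0.0f == 0.0f is true in Java, which would hide sign-of-zero differences.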
Here are a few comments on Joe's proposal:
> >> I could propose my implementations of float ceil/floor (float) that are
> >> exactly giving the same results than (float)StrictMath.ceil/floor
> (double).
> >> According to my benchmarks, it is 25% faster.
>
>
> >
> > I don't think we need to limit ourselves to either StrictMath or Math.
> We simply need something predictable that has properties which work for our
> needs.
> >
> I was just proposing the 2 methods float ceil/floor (float) (derived from
> StrictMath) to be included in the core libs if it is useful for general use
> (25% faster).
>
Joe, are you interested in the ceil_f / floor_f variants (25% faster than
StrictMath)?
>> So, you can *almost* get away with
>>
>> int ceil_returning_int(float f) {
>>     if (f > 0.0)
>>         return - ((int)(-f));
>>     else
>>         return (int) f;
>> }
>>
>> int floor_returning_int(float f) {
>>     if (f < 0.0)
>>         return - ((int)(-f));
>>     else
>>         return (int) f;
>> }
>>
>> I tried Joe's proposal but it does not work:
>> Round to zero is not equivalent to ceil or floor!
>
>
> In what way do Joe's techniques fail? Integer casts should be a truncate
> operation (is that what you refer to as "round to zero"?) and should be the
> same as floor() for non-negative numbers and -((int)(-v)) should be the
> same as floor for negative numbers...
>>
I tried it and it does not work:
ceil(1.2) = 2
But (int)(-1.2) = -1 (round to zero), so -((int)(-1.2)) = 1.
So the result is 1 and not 2!
That's why my variant adds / subtracts 1!
But it makes infinity / NaN handling more painful and a bit costly.
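The failure can be reproduced with a few lines of Java. This is only a sketch: ceilByTruncation mirrors the quoted snippet, while ceilInt and floorInt are hypothetical names for an "adjust by 1 after truncation" variant and are not the methods from the webrev. It assumes inputs within the int range; NaN, infinities and out-of-range values are deliberately not handled, which is exactly the part that costs extra checks.

```java
final class TruncVsCeil {
    /** Truncation-only attempt from the quoted snippet: wrong for positive fractions. */
    static int ceilByTruncation(float f) {
        return (f > 0.0f) ? -((int) (-f)) : (int) f;
    }

    /**
     * Hypothetical variant: truncate, then add 1 if truncation rounded down.
     * Assumes f lies within the int range (no NaN / infinity / overflow checks).
     */
    static int ceilInt(float f) {
        int i = (int) f;              // rounds toward zero
        return (f > i) ? i + 1 : i;
    }

    /**
     * Hypothetical variant: truncate, then subtract 1 if truncation rounded up.
     * Same range assumption as ceilInt.
     */
    static int floorInt(float f) {
        int i = (int) f;              // rounds toward zero
        return (f < i) ? i - 1 : i;
    }

    public static void main(String[] args) {
        float f = 1.2f;
        System.out.println("truncation only: " + ceilByTruncation(f));
        System.out.println("with adjustment: " + ceilInt(f));
        System.out.println("Math.ceil:       " + (int) Math.ceil(f));
    }
}
```

Negating before and after the cast changes nothing because the cast rounds toward zero for both signs; only the explicit +1 / -1 adjustment moves the result to the next integer.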
Jim, next I will test:
1/ using a proper and consistent ceil(coord - 0.5), as you did in openpisces (FX)
2/ using a fixed-point approach (longer work) so that only integer math is used in
the Marlin rendering loop (crossings)
=> faster (no float to int conversions?) but also more scalable on
hyper-threading CPUs?
=> the edge array will then only contain int[] and Unsafe usage is no longer
necessary
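As a rough illustration of both ideas combined, the sketch below steps an edge in fixed point and computes ceil(x - 0.5) with integer operations only. Everything here is an assumption made for the example: the 16.16 format, the class FixedPointSketch and its method names are hypothetical, and this is not how Marlin currently stores or walks its edges.

```java
final class FixedPointSketch {
    static final int FRAC_BITS = 16;          // assumed 16.16 fixed-point format
    static final int ONE = 1 << FRAC_BITS;
    static final int HALF = ONE >> 1;

    /** Converts once, outside the inner loop; assumes v * ONE fits in an int. */
    static int toFixed(float v) {
        return Math.round(v * ONE);
    }

    /**
     * ceil(x - 0.5) on a fixed-point value, integer ops only.
     * ceil((fx - HALF) / ONE) == floor((fx - HALF + ONE - 1) / ONE),
     * and the arithmetic shift floors for negative values too.
     */
    static int ceilMinusHalf(int fx) {
        return (fx + HALF - 1) >> FRAC_BITS;
    }

    /** Pixel columns crossed by an edge starting at x0 and advancing dx per scanline. */
    static int[] pixelCrossings(float x0, float dxPerScanline, int scanlines) {
        int fx = toFixed(x0);
        int fdx = toFixed(dxPerScanline);
        int[] out = new int[scanlines];
        for (int y = 0; y < scanlines; y++) {
            out[y] = ceilMinusHalf(fx);       // no float to int conversion in the loop
            fx += fdx;                        // integer-only edge stepping
        }
        return out;
    }

    public static void main(String[] args) {
        // x = 0.25, 1.75, 3.25, 4.75 on four successive scanlines
        System.out.println(java.util.Arrays.toString(pixelCrossings(0.25f, 1.5f, 4)));
    }
}
```

The trade-off of such a format is limited range and precision: with 16 fractional bits, coordinates must stay below 32768 in magnitude, and slopes are quantized to 1/65536.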
Cheers,
Laurent
PS: Here are some benchmark results made on values only in the integer
domain:
>> JVM START: 1.8.0_60-ea [Java HotSpot(TM) 64-Bit Server VM 25.60-b18]
floats = [-2.13422758E9, -1.37992608E8, -134758.4, -131.5, -17.2,
-1.9, -0.9, -1.0E-4, -1.0E-8, -1.0E-23, -100.0, -3.0, -1.0, -0.0, 0.0, 0.0,
1.0, 3.0, 100.0, 131.5, 17.2, 1.9, 0.9, 1.0E-4, 1.0E-8, 1.0E-23,
2.13422758E9, 1.37992608E8, 134758.4]
strictMathCeil_f = [-2.13422758E9, -1.37992608E8, -134758.0, -131.0, -17.0,
-1.0, -0.0, -0.0, -0.0, -0.0, -100.0, -3.0, -1.0, -0.0, 0.0, 0.0, 1.0, 3.0,
100.0, 132.0, 18.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.13422758E9, 1.37992608E8,
134759.0]
floatMathCeil = [-2134227584, -137992608, -134758, -131, -17, -1, 0, 0,
0, 0, -100, -3, -1, 0, 0, 0, 1, 3, 100, 132, 18, 2, 1, 1, 1, 1, 2134227584,
137992608, 134759]
FloatMathCeil_f = [-2.13422758E9, -1.37992608E8, -134758.0, -131.0, -17.0,
-1.0, -0.0, -0.0, -0.0, -0.0, -100.0, -3.0, -1.0, 0.0, 0.0, 0.0, 1.0, 3.0,
100.0, 132.0, 18.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.13422758E9, 1.37992608E8,
134759.0]
strictMathFloor_f = [-2.13422758E9, -1.37992608E8, -134759.0, -132.0,
-18.0, -2.0, -1.0, -1.0, -1.0, -1.0, -100.0, -3.0, -1.0, -0.0, 0.0, 0.0,
1.0, 3.0, 100.0, 131.0, 17.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.13422758E9,
1.37992608E8, 134758.0]
floatMathFloor = [-2.13422758E9, -1.37992608E8, -134759.0, -132.0,
-18.0, -2.0, -1.0, -1.0, -1.0, -1.0, -100.0, -3.0, -1.0, 0.0, 0.0, 0.0,
1.0, 3.0, 100.0, 131.0, 17.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.13422758E9,
1.37992608E8, 134758.0]
floatMathFloor_f = [-2.13422758E9, -1.37992608E8, -134759.0, -132.0,
-18.0, -2.0, -1.0, -1.0, -1.0, -1.0, -100.0, -3.0, -1.0, 0.0, 0.0, 0.0,
1.0, 3.0, 100.0, 131.0, 17.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.13422758E9,
1.37992608E8, 134758.0]
--- Benchmarks ---
# Calib: run duration: 5 000 ms
4 threads, Tavg = 3,03 ns/op (σ = 0,02 ns/op), Total ops
= 6616663415 [ 3,07 (1633532581), 3,02 (1656839291), 3,01
(1662195896), 3,01 (1664095647)]
#
#-------------------------------------------------------------
*# StrictMathCeil_f*: run duration: 5 000 ms
*float = (float) StrictMath.ceil(f)*
1 threads, Tavg = 112,46 ns/op (σ = 0,00 ns/op), Total ops
= 44462614 [ 112,46 (44462614)]
2 threads, Tavg = 112,53 ns/op (σ = 0,20 ns/op), Total ops
= 88864503 [ 112,74 (44351706), 112,33 (44512797)]
3 threads, Tavg = 112,75 ns/op (σ = 0,31 ns/op), Total ops
= 133042189 [ 112,67 (44379882), 112,42 (44478562), 113,17
(44183745)]
* 4 threads, Tavg = 113,61 ns/op (σ = 1,18 ns/op),* Total
ops = 176004512 [ 115,63 (43242922), 113,27 (44144214), 112,59
(44409190), 113,01 (44208186)]
#
#-------------------------------------------------------------
*# FloatMathCeil_f:* run duration: 5 000 ms
*float = FloatMath.ceil_f(f)*
1 threads, Tavg = 85,42 ns/op (σ = 0,00 ns/op), Total ops
= 58534818 [ 85,42 (58534818)]
2 threads, Tavg = 85,56 ns/op (σ = 0,18 ns/op), Total ops
= 116880361 [ 85,74 (58318655), 85,38 (58561706)]
3 threads, Tavg = 85,49 ns/op (σ = 0,11 ns/op), Total ops
= 175469910 [ 85,64 (58386401), 85,42 (58535723), 85,40
(58547786)]
* 4 threads, Tavg = 86,10 ns/op (σ = 0,86 ns/op), *Total
ops = 232739544 [ 87,59 (57200792), 85,61 (58519538), 85,47
(58617544), 85,79 (58401670)]
#
#-------------------------------------------------------------
*# FloatMathCeil:* run duration: 5 000 ms
*int = FloatMath.ceil(f)*
1 threads, Tavg = 56,72 ns/op (σ = 0,00 ns/op), Total ops
= 88153017 [ 56,72 (88153017)]
2 threads, Tavg = 56,90 ns/op (σ = 0,16 ns/op), Total ops
= 175737994 [ 57,06 (87626873), 56,75 (88111121)]
3 threads, Tavg = 56,82 ns/op (σ = 0,15 ns/op), Total ops
= 264003134 [ 57,02 (87684429), 56,76 (88087214), 56,67
(88231491)]
* 4 threads, Tavg = 57,16 ns/op (σ = 0,57 ns/op),* Total
ops = 350060098 [ 58,12 (86072473), 56,74 (88161260), 56,68
(88251450), 57,12 (87574915)]
#
#-------------------------------------------------------------
*# StrictMathFloor_f:* run duration: 5 000 ms
*float = (float) StrictMath.floor(f)*
1 threads, Tavg = 108,69 ns/op (σ = 0,00 ns/op), Total ops
= 46005419 [ 108,69 (46005419)]
2 threads, Tavg = 108,87 ns/op (σ = 0,25 ns/op), Total ops
= 91856264 [ 109,11 (45824174), 108,62 (46032090)]
3 threads, Tavg = 108,66 ns/op (σ = 0,01 ns/op), Total ops
= 138046291 [ 108,65 (46019660), 108,68 (46008068), 108,65
(46018563)]
* 4 threads, Tavg = 109,99 ns/op (σ = 1,00 ns/op),* Total
ops = 182162538 [ 111,63 (44870853), 109,77 (45631259), 108,90
(45994047), 109,69 (45666379)]
#
#-------------------------------------------------------------
*# FloatMathFloor_f: *run duration: 5 000 ms
*float = FloatMath.floor_f(f)*
1 threads, Tavg = 79,60 ns/op (σ = 0,00 ns/op), Total ops
= 62816917 [ 79,60 (62816917)]
2 threads, Tavg = 79,44 ns/op (σ = 0,15 ns/op), Total ops
= 125890579 [ 79,58 (62827873), 79,29 (63062706)]
3 threads, Tavg = 79,38 ns/op (σ = 0,15 ns/op), Total ops
= 188968096 [ 79,59 (62823628), 79,23 (63107367), 79,32
(63037101)]
* 4 threads, Tavg = 79,88 ns/op (σ = 0,83 ns/op),* Total
ops = 250828233 [ 81,31 (61604026), 79,60 (62930953), 79,32
(63149634), 79,33 (63143620)]
#
#-------------------------------------------------------------
*# FloatMathFloor:* run duration: 5 000 ms
*float = FloatMath.floor(f)*
1 threads, Tavg = 70,20 ns/op (σ = 0,00 ns/op), Total ops
= 71226367 [ 70,20 (71226367)]
2 threads, Tavg = 70,35 ns/op (σ = 0,16 ns/op), Total ops
= 142141053 [ 70,51 (70910131), 70,20 (71230922)]
3 threads, Tavg = 70,26 ns/op (σ = 0,08 ns/op), Total ops
= 213504247 [ 70,20 (71225449), 70,38 (71046834), 70,19
(71231964)]
* 4 threads, Tavg = 70,67 ns/op (σ = 0,60 ns/op), *Total
ops = 283376128 [ 70,24 (71279973), 70,58 (70931050), 70,20
(71320272), 71,68 (69844833)]
#