[OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3

Tue Aug 25 21:28:04 UTC 2015

Jim,

2015-08-25 0:08 GMT+02:00 Jim Graham <james.graham at oracle.com>:

> Hi Laurent,
>
> I'm feeling much better this week (and can even start typing with my bad
> hand in short spurts now), so I'll hopefully be more on the ball now.
>

Great ! Hope your hand is not too bad.

>
> Short version - looks good to go, consider the following input at your own
> discretion and go ahead and push.
>

Excellent.

Here are my comments:

>
> Reviewing s3.4:
>
> On 8/17/15 2:34 PM, Laurent Bourgès wrote:
>
>> Changes:
>> - Renderer: remove unused D_ERR_STEP_MAX and 0x1 replaced by 1 for
>> readability
>>
>
> Renderer, line 319 - it makes sense to me that the scale changes would be
> based off of the same metric.  Basically the constraint is keeping the
> speed (dxy) between INC and DEC bounds.  I'll admit, though, that I haven't
> been through the actual derivations of the math here, but I thought I'd
> give my 2 cents on your comment there.
>

I left a comment as changing the metric will require adjusting the bounds +
tests (postponed later).

According to my understandings, the error between the linear segment and
the curve is <= 8 x | acceleration(t) | (Graphics Gem I) so I would prefer
using the ddx|y. Besides, I would use th squared norm = ddx*ddx + ddy*ddy
instead of abs(ddx).
I think the speed dx|y describes the curve slope so it does not correspond
to the approximation error, but it seems you already use such constraints
in another renderer (ProcessPath.c).

Another point: filled curves (handled by Renderer directly) are not
monotonic so this error approximation is then not valid !
Maybe we should split the curve as the Stroker does (at cups, inflexion
points ...)

> Renderer, line 479 - I often put extra parens in for cases like this
> because it isn't common knowledge whether cast or add is higher in
> precedence.  I generally assume no better knowledge than the basic PEMDAS
> rules that we learn in grade school. But that's just my style (I also don't
> have the OOO table memorized, but maybe that's just my senility ;).
>

Done.

> 0x1 vs 1 - I usually use 0x1 when I'm using numbers for their bit
> patterns, so | and & typically use 0x constants in my style philosophy.  In
> the case of "1" it's a degenerate case that doesn't seem to matter, but the
> added 0x tells me to think in terms of bits, not cardinal values.
>

Fixed

>
> Renderer, note for future investigation - we store the direction flag in
> Y_MAX, but then we need to do manipulation to process Y_MAX and we also
> need more manipulation to transfer that bit to crossing values.  What if
> instead we stored it in the lowest bit of curx?  We'd then have 31.31 fixed
> point, we'd have to process slope so that its whole part was doubled (to
> skip affecting the LSbit of curx) and we'd have to do the carry using
> "((error >> 31) << 1)", but we'd get the transfer of the direction bit to
> the crossing for free and all in all, the total operations may be reduced
> (at the cost of 1 bit of X coordinate range).
>

I seems a very good idea to evaluate soon: it will save several shift
operations (+ 1 Unsafe access).
I would also make error, bump_x, bump_error as 31.31 FP (LSB=0) so it would
be consistent and may simplify even more computations.

Hope it is good now, I plan to work soon on the Array Cache & dispose code.
>>
>
> Looks great!  Consider the style issues I mentioned above if you wish and
> otherwise it is good to go.  I don't need to see any further webrevs on
> those issues.
>

Thanks. I started working on RLE / pixel coverage encoding but still expect
to improve Array Cache next.

>
> One more suggestion for optimization below can be considered in subsequent
> work:
>
>     For CHECK_NAN, how fast is Math.isNaN?  Is it worth first comparing
>>     the intpart to the value that NaN converts to before calling isNaN?
>>     The majority case, by a wide margin, is that the value is not NaN.
>>
>>
>> That's the aim: I test first the most probable case (a <= intpart) for
>> ceil() and then check possible overflow and finally check for NaN values
>> to avoid adding / substracting 1 !
>> I made benchmarks and this implementation optimized for the integer
>> domain works very well: 2x faster.
>>
>
> I guess I wasn't clear.  I understand why the 3 parts of the test are in
> the order that they are in, but I was referring only to the "CHECK_NAN &&
> Float.isNaN(a)" clause.  Is that worth adding one more test for "CHECK_NAN
> && intpart == NAN_TO_INT_VAL && Float.isNaN(a)" in that one clause?
>
> For the ceil_int case, a very common case is that a is non-negative and
> non-integer.  In that case, "a <= intpart" will fail and you will do the
> CHECK_NAN test before you can return "intpart+1".
>
> For the floor_int case, you would only get to the CHECK_NAN case on
> negative-non-integers which we don't expect a lot of, but in general, the
> added test might make it slightly faster.
>

I quickly tried but sorry it provides no gains: Float.isNaN() is just
implemented as (a != a) so it is faster with only 1 condition (a != a) than
2 (intval == 0 && (a != a)).

    But the more important question may be around how much benefit these
>>     bring compared to the additional testing overhead of so many
>>     combinations ?
>>
>>
>> I advocate it is not easy to test all combinations but it seems not a
>> reasonable testing objective.
>>
>> As I said, it is mainly tuning or quality settings but only few are
>> important (subpixels X / Y, tile size, initial pixel size).
>>
>
> Another thing to consider is that customers would have to do lots of
> testing of performance of tweaking these values and so they would discover
> the problems and report them to us before they become reliant on them.
> However, down the road they may accept version+1 of Marlin into their
> existing environment without fully testing and we may introduce a bug that
> might take them by surprise.
>

Let's discuss / review that production aspect when Marlin will pass tests &
benchmarks. I think several tuning settings may become useless at that time.

Laurent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/graphics-rasterizer-dev/attachments/20150825/3008a871/attachment.html>