[OpenJDK Rasterizer] Fwd: Re: Fwd: RFR: Marlin renderer #3
Jim Graham
james.graham at oracle.com
Thu Jul 30 01:54:56 UTC 2015
Here are some comments that might explain all of this:
// The index of the pixel that holds the next HPC is at ceil(trueY - 0.5)
// Since y1 and y2 are biased by -0.5 in tosubpixy(), this is simply
ceil(y1) or ceil(y2)
final int firstcrossing = Math.max(Math.ceil_int(...)...);
...
// The x value must be bumped up to its position at the next HPC we will
evaluate.
// "firstcrossing" is the (sub)pixel number where the next crossing occurs
// thus, the actual coordinate of the next HPC is "firstcrossing + 0.5"
// so the Y distance we cover is "firstcrossing + 0.5 - trueY".
// Note that since y1 (and y2) are already biased by -0.5 in
tosubpixy(), we have
// y1 = trueY - 0.5
// trueY = y1 + 0.5
// firstcrossing + 0.5 - trueY = firstcrossing + 0.5 - (y1 + 0.5)
// = firstcrossing - y1
// The x coordinate at that HPC is then:
// x1_intercept = x1 + (firstcrossing - y1) * slope
// The next VPC is then given by:
// VPC index = ceil(x1_intercept - 0.5), or alternately
// VPC index = floor(x1_intercept - 0.5 + 1 - epsilon)
// epsilon is hard to pin down in floating point, but easy in fixed
point, so if
// we convert to fixed point then these operations get easier:
// long x1_fixed = x1_intercept * 2^32; (fixed point 32.32 format)
// curx = next VPC = fixed_floor(x1_fixed - 2^31 + 2^32 - 1)
// = fixed_floor(x1_fixed + 2^31 - 1)
// = fixed_floor(x1_fixed + 0x7fffffff)
// and error = fixed_fract(x1_fixed + 0x7fffffff)
final ?float? x1_intercept = x1 + (firstcrossing - y1) * slope;
final long x1_fixed_biased = ((long) scalb(x1_intercept, 32)) + 0x7fffffffL;
store(CURX, (int) (x1_fixed_biased >> 32));
store(ERRX, ((int) x1_fixed_biased) >>> 1);
...jim
On 7/29/15 6:28 PM, Jim Graham wrote:
> And, one further simplification.
>
> x1_intercept = x1 + (firstcrossing - y1) * slope;
> long x1_fixed = ((long) scalb(x1_intercept, 32)) + 0x7fffffffL;
> store(CURX, (int) (x1_fixed >> 32));
> store(ERRX, ((int) x1_fixed) >>> 1);
>
> This combines the "- 0.5f" with the "+ 1 - epsilon" into a single
> addition to the fixed point long, then we simply split into whole and
> fract parts to get our curx and error terms...
>
> ...jim
>
> On 7/28/15 9:54 PM, Jim Graham wrote:
>> My "1 minor functional issue" in my last email was actually in error...
>>
>> On 7/28/15 8:11 PM, Jim Graham wrote:
>>> Renderer.java, line 461: What happens if x1_cor was already at VPC? The
>>> error should be 0, but you end up setting it to ERR_STEP_MAX.
>>
>> Ignore this. I see now. At VPC, any fractional movement forward should
>> bump you by another pixel so the error really is ERR_STEP_MAX.
>>
>> Basically error is "how far we've progressed since the previous VPC
>> crossing" and at a VPC crossing you've crossed about as far as you can
>> go before you are going to default over to the next pixel crossing.
>>
>> Instead I'd change the comment on 461 to something like:
>> // Crossings bump by a whole pixel just after (to the right of) a
>> VPC.
>> // Error is how far since we last bumped the crossing.
>> And/or:
>> // Error is ERR_STEP_MAX right on a VPC where we are about to bump
>> 1 more pixel
>> // and 0 just after VPC where we have just recently bumped
>>
>> It's a minor nit, but your calculation is still slightly off in that we
>> would not necessarily compute a value of 0 right after the VPC because
>> (x1_cor - istartx) would be slightly greater than 0.0f. The calculation
>> you are using is correctly placing ERR_STEP_MAX at "on a VPC", but it is
>> also placing 0.0 at "on the previous VPC" when the correct placement for
>> 0.0 is just after the previous VPC. The error is thus almost always
>> slightly larger than it should be (but by a very very tiny amount as in
>> 1/ERR_STEP_MAX).
>>
>> It should be:
>>
>> err(VPC(n-1)) = ERR_STEP_MAX
>> err(VPC(n-1)+epsilon) = 0
>> ...
>> err(VPC(n)) = ERR_STEP_MAX
>>
>> but your calculation is:
>>
>> err(VPC(n-1)) = 0
>> err(VPC(n-1)+epsilon) = 1
>> err(VPC(n)) = ERR_STEP_MAX
>>
>> This calculation might be ever so slightly more accurate, I'm not sure
>> if it is any faster:
>>
>> long x1_fixed = (long) Math.scalb(x1_cor, 32);
>> // x1_fixed is now 32.32 representation
>> x1_fixed -= 1;
>> // x1_fixed is now reduced by "epsilon"
>> int x1_whole = ((int) (x1_fixed >> 32));
>> int x1_fract = (((int) (x1_fixed)) >>> 1);
>> // x1_whole/fract are now a split 32.31 representation (of x1_fixed -
>> epsilon)
>> x1_whole += 1;
>> // x1_whole is now ceil(x1_cor)
>>
>> Or, more compactly:
>>
>> long x1_fixed_m_eps = ((long) Math.scalb(x1_cor, 32)) - 1;
>> store(CURX, ((int) (x1_fixed_m_eps >> 32)) + 1);
>> store(ERRX, ((int) (x1_fixed_m_eps)) >>> 1);
>>
>> It performs ceil() in fixed point (32.31+1) math as floor(v + 1 -
>> epsilon) and ends up with the error/fract term being exactly
>> ERR_STEP_MAX on a whole number and the same err/fract term to be 0
>> exactly at the next possible fixed point step after a whole number. Note
>> that since x1_cor is only single-precision it is unlikely to ever see a
>> fract value of "0" since the float doesn't have enough precision to
>> represent a number that is "1/ERR_STEP_MAX" greater than an integer
>> unless the integer happens to be 0. But if it could represent that
>> number, it would be scaled exactly to an error value of 0. It would be
>> slightly more accurate if the x1_cor value was calculated in double (to
>> keep as many bits of fractional value as possible). A double mantissa
>> could maintain 31 bits of sub-pixel precision for coordinates in the
>> range of +/- a million or so, though it isn't clear how much value that
>> would have since all of the calculations up to this point have been done
>> in single precision.
>>
>> So, I present those equations more in case they might be faster than
>> because they may be "more accurate". Note that there is no call to
>> ceil_int() here at all, only a cast to long...
>>
>> ...jim
More information about the graphics-rasterizer-dev
mailing list