[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements

Tue Jul 2 02:17:47 UTC 2013

I'd have to look, but I think we could raise our precision in the X 
direction without costing too much in performance since it just affects 
the scale of our sub-pixel contributions per pixel.  But, raising it in 
the Y direction is where the performance starts to hurt us because it 
means more differencing calculations per pixel-line.
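
To make the X-resolution point concrete, here is a rough standalone sketch of the per-row accumulation, modeled on the loop quoted further down. The class name, the method names and the lgX parameter are invented for illustration; this is not the actual Pisces code. Changing lgX from 3 to 8 only rescales the values written into the delta array, while the number of writes per span stays the same:

```java
final class AlphaDeltaSketch {
    /**
     * Adds the sub-pixel span [x0, x1) of one sub-pixel row to a delta array.
     * lgX is log2 of the X sub-pixel resolution (hypothetical parameter).
     * The array needs at least two slots beyond the last touched pixel.
     */
    static void addSpan(int[] alpha, int x0, int x1, int lgX) {
        final int subX = 1 << lgX;
        final int mask = subX - 1;
        int pix = x0 >> lgX;
        int pixLast = (x1 - 1) >> lgX;
        if (pix == pixLast) {
            // Start and end in the same pixel
            int n = x1 - x0;
            alpha[pix] += n;
            alpha[pix + 1] -= n;
        } else {
            int tmp = x0 & mask;
            alpha[pix] += subX - tmp;
            alpha[pix + 1] += tmp;
            int pixMax = x1 >> lgX;
            tmp = x1 & mask;
            alpha[pixMax] -= subX - tmp;
            alpha[pixMax + 1] -= tmp;
        }
    }

    /** Prefix-sums the deltas into per-pixel coverage, in sub-pixel units. */
    static int[] resolve(int[] alpha, int numPixels) {
        int[] coverage = new int[numPixels];
        int sum = 0;
        for (int i = 0; i < numPixels; i++) {
            sum += alpha[i];
            coverage[i] = sum;
        }
        return coverage;
    }

    public static void main(String[] args) {
        int[] lowRes = new int[6];
        addSpan(lowRes, 5, 20, 3);      // 8 sub-pixels per pixel
        int[] highRes = new int[6];
        addSpan(highRes, 160, 640, 8);  // same span at 256 sub-pixels per pixel
        System.out.println(java.util.Arrays.toString(resolve(lowRes, 4)));
        System.out.println(java.util.Arrays.toString(resolve(highRes, 4)));
    }
}
```

With lgX = 3 the span [5, 20) resolves to coverages 3, 8, 4 (out of 8); with lgX = 8 the equivalent span [160, 640) resolves to 96, 256, 128 (out of 256), i.e. the same ratios.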

In your examples were you raising both the X and Y sub-pixel resolutions 
to 256, or just the X resolution?  If you raised both to 256 then I 
would think that you would end up with a better result...?  I don't 
think the X resolution will help much for "nearly horizontal" lines.

I agree that it would be nice if we had a more precise calculation of 
sub-pixel coverages that did not rely on multi-sampling, and I believe 
that a decent algorithm to calculate the trapezoidal coverage of a given 
pixel would give us much better answers without any sample-count 
performance drain.
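
One way such a calculation could look, as a minimal sketch: assume the edge is a straight line within one pixel row, entering the row at x = xTop and leaving it at x = xBottom (in pixel units). The area of a pixel column lying to the left of the edge then has a closed form, and the coverage of a span between two edges is the difference of two such areas. All names here are hypothetical and this is not taken from Pisces:

```java
final class TrapezoidCoverageSketch {
    /** Antiderivative of clamp(u, 0, 1), with g(0) = 0. */
    static double g(double u) {
        if (u <= 0.0) return 0.0;
        if (u >= 1.0) return u - 0.5;
        return 0.5 * u * u;
    }

    /**
     * Fraction of the pixel [px, px + 1) in a one-pixel-high row that lies
     * to the left of a straight edge going from xTop to xBottom.
     */
    static double areaLeftOfEdge(double xTop, double xBottom, int px) {
        double lo = Math.min(xTop, xBottom) - px;
        double hi = Math.max(xTop, xBottom) - px;
        double w = hi - lo;
        if (w < 1e-12) {
            // (Nearly) vertical edge: plain clamp of its x offset
            return Math.min(1.0, Math.max(0.0, lo));
        }
        return (g(hi) - g(lo)) / w;
    }

    /** Coverage of pixel px by the region between a left and a right edge. */
    static double spanCoverage(double leftTop, double leftBottom,
                               double rightTop, double rightBottom, int px) {
        return areaLeftOfEdge(rightTop, rightBottom, px)
             - areaLeftOfEdge(leftTop, leftBottom, px);
    }

    public static void main(String[] args) {
        for (int px = 0; px < 4; px++) {
            System.out.println(px + ": " + areaLeftOfEdge(0.0, 4.0, px));
        }
    }
}
```

For an edge running from x = 0 to x = 4 across one pixel row, the areas left of the edge in pixels 0 to 3 come out as 0.875, 0.625, 0.375 and 0.125, which sum to the triangle area of 2. No sub-sampling is involved, so the cost does not grow with the sample count.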

I am skeptical about using fixed point for desktop situations.  The FPU
section of the processor often sits idle while the integer unit is busy
performing calculations for the logic side of the process.  So,
converting FP instructions to integer instructions just increases that
bottleneck rather than sharing the load.  On the other hand, if you are
constantly converting between FP and integer then you have a different
problem.  We sort of run into that latter problem because Java arrays
are homogeneous, which requires us to store many integer values in our
single floating point array, but I think you had a patch to stop doing
that which would help there.

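Also, 24.8 is a poor choice for fixed point.  With only 8 fractional
bits, each increment can be off by up to 1/256 of a pixel, so a 256
pixel long line could be off by a full pixel by the end of tracing its
points.  If you are doing 8-sub-sample AA then you take 8 steps per
pixel and would be off by a pixel after only 32 pixels.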
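
The arithmetic can be checked with a small experiment. The sketch below assumes the per-step slope is truncated to the given number of fractional bits and then accumulated once per step (one step per sub-sample row); the class and method names are made up for illustration:

```java
final class FixedPointDriftSketch {
    /**
     * Absolute error, in pixels, between the exact position slope * steps
     * and the position reached by accumulating a truncated fixed-point slope.
     */
    static double drift(double slope, int steps, int fracBits) {
        final double scale = (double) (1L << fracBits);
        // Assumption: the slope is truncated (floor), not rounded
        final long fixedSlope = (long) Math.floor(slope * scale);
        long x = 0L;
        for (int i = 0; i < steps; i++) {
            x += fixedSlope;
        }
        return Math.abs(slope * steps - x / scale);
    }

    public static void main(String[] args) {
        double slope = 0.0039; // just under 1/256 pixel per step
        System.out.println("24.8 : " + drift(slope, 256, 8));
        System.out.println("32.32: " + drift(slope, 256, 32));
    }
}
```

With a slope just under 1/256 pixel per step, the 24.8 increment truncates to zero and the error after 256 steps (32 pixels at 8 sub-samples) is nearly a whole pixel, while 32 fractional bits keep it far below a sub-pixel.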

When I've done fixed point in the past, I've usually used longs and done
32.32 processing.  That would require 4 billion pixels before you are
off by a pixel.  I think the conversion to "int" is also fairly fast
because it simply involves using the upper 32-bit word of the result
without much need for shifting...
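
A minimal sketch of what 32.32 stepping with longs could look like (hypothetical names, not code from any existing renderer). The integer pixel coordinate is just the upper 32 bits of the long; in Java source that is written as an arithmetic shift and a cast:

```java
final class Fixed3232Sketch {
    static final int FRAC_BITS = 32;
    static final double ONE = 4294967296.0; // 2^32

    /** Converts a coordinate to 32.32 fixed point (truncating toward -inf). */
    static long toFixed(double v) {
        return (long) Math.floor(v * ONE);
    }

    /** Integer part (floor) of a 32.32 value: the upper 32-bit word. */
    static int intPart(long fixed) {
        return (int) (fixed >> FRAC_BITS);
    }

    /** Integer x positions of x0 + i * slope for i = 0 .. steps - 1. */
    static int[] traceX(double x0, double slope, int steps) {
        int[] xs = new int[steps];
        long x = toFixed(x0);
        final long dx = toFixed(slope);
        for (int i = 0; i < steps; i++) {
            xs[i] = intPart(x);
            x += dx; // pure integer add, no FP conversion inside the loop
        }
        return xs;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(traceX(0.5, 0.25, 6)));
    }
}
```

The only FP-to-integer conversions happen once per edge, in toFixed; the inner loop stays entirely in integer arithmetic.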

			...jim

On 6/17/13 8:18 AM, Laurent Bourgès wrote:
> Jim,
>
> I think I found the source of the 'poor' quality of line rendering:
> the alpha coverage is only computed for the 2 subpixels (x0, x1) at the
> current x-coordinate of an edge, i.e. it does not take into account the
> span of a line having a very flat slope:
>
>                  for (i = 0, sum = 0, prev = bboxx0; i < numCrossings; i++) {
>                      curxo = _crossings[i];
>                      curx = curxo >> 1;
>
>                      // LBO: TODO: explain alpha computation: Jim, please ?
> ...
>                      if ((sum & mask) != 0) {
>                          x0 = (prev > bboxx0) ? prev : bboxx0; // Math.max(prev, bboxx0);
>                          x1 = (curx < bboxx1) ? curx : bboxx1; // Math.min(curx, bboxx1);
>
>                          if (x0 < x1) {
>                              x0 -= bboxx0; // turn x0, x1 from coords to indices
>                              x1 -= bboxx0; // in the alpha array.
>
>                              pix_x = x0 >> _SUBPIXEL_LG_POSITIONS_X;
>                              pix_xmaxm1 = (x1 - 1) >> _SUBPIXEL_LG_POSITIONS_X;
>
>                              if (pix_x == pix_xmaxm1) {
>                                  // Start and end in same pixel
>                                  tmp = (x1 - x0); // number of subpixels
>                                  _alpha[pix_x] += tmp;
>                                  _alpha[pix_x + 1] -= tmp;
>                              } else {
>                                  tmp = (x0 & _SUBPIXEL_MASK_X);
>                                  _alpha[pix_x] += _SUBPIXEL_POSITIONS_X - tmp;
>                                  _alpha[pix_x + 1] += tmp;
>
>                                  pix_xmax = x1 >> _SUBPIXEL_LG_POSITIONS_X;
>                                  tmp = (x1 & _SUBPIXEL_MASK_X);
>                                  _alpha[pix_xmax] -= _SUBPIXEL_POSITIONS_X - tmp;
>                                  _alpha[pix_xmax + 1] -= tmp;
>                              }
>                          }
>                      }
>
>                      // to turn {0, 1} into {-1, 1}, multiply by 2 and subtract 1.
>                      // int crorientation = ((curxo & 0x1) << 1) - 1;
>                      sum += ((curxo & 0x1) << 1) - 1; // crorientation;
>                      prev = curx;
>                  }
>              }
>
> Here is a line test using GeneralPath(Line2D.Float) to force Pisces
> instead of the FillParallelogram renderer:
> - Pisces (8x8):
> http://jmmc.fr/~bourgesl/share/java2d-pisces/linetest/LineTest_3.png
> - Pisces (256x256):
> http://jmmc.fr/~bourgesl/share/java2d-pisces/linetest/LineTest_8.png
>
> The artifacts come from the fact that the line spans several subpixels
> while neither the slope nor the span width is used at all!
>
> I think it is possible to compute a better coverage for all alpha pixels
> involved in a span (trapezoid): for each edge at scanline y, it only needs
> curx and the previous curx (to know how many subpixels the span crosses)
>
> http://upload.wikimedia.org/wikipedia/commons/3/38/PolygonFillTrapezoidExample.png
>
> Comments are welcome ...
>
> Two more comments:
>> - coordinate conversions: float or integer computations (DDA) related to
>> subpixel coordinates: ceil(), floor() ...
>>        Pisces uses 3x3 subpixels but it provides poor quality: many
>> research papers are using 4x4 (1/16 error) or 8x8 (1/64 error) subpixel
>> masks to increase the coverage precision (ratio of the pixel covered by the
>> polygon)
>>        Moreover, Pisces does not take into account the distance / error
>> between the mathematical edge position and the pixel grid.
>>        Ideally the subpixel mask should be 16x16 => 1/256 coverage error
>> but it will lead to higher processing time.
>>
>
> I misunderstood the code: Pisces uses an 8x8 subpixel grid (1 << 3), so
> every coordinate has only 1/8 precision (low), far from 1/256 (as in AGG),
> which is the ultimate precision. Many rasterizers use 24.8 fixed point
> (24 bits for the integer part of a coordinate, 8 bits for 1/256 precision)
> together with a DDA (32-bit integer computations).
>
> I will soon try to use a 24.8 fixed-point DDA to compute the x-coordinates
> of edge segments.
>
> Laurent
>