[OpenJDK 2D-Dev] X11 uniform scaled wide lines and dashed lines; STROKE_CONTROL in Pisces

Jim Graham james.graham at oracle.com
Wed Nov 3 01:00:56 UTC 2010


Hi Denis,

I had a bit of luck with the following next() method:

         private int next() {
             // TODO: make a function that converts from y value to bucket idx?
             int bucket = nextY - boundsMinY;
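             // Append every edge that starts on this scanline (from its bucket)
             // to the active list (edgePtrs), growing the array as needed.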
             for (int ecur = edgeBuckets[bucket]; ecur != NULL; ecur = (int)edges[ecur+NEXT]) {
                 edgePtrs = LilaHelpers.widenArray(edgePtrs, edgeCount, 1);
                 edgePtrs[edgeCount++] = ecur;
                 // REMIND: Adjust start Y if necessary
             }
             int crossingCount = edgeCount;
             crossings = LilaHelpers.widenArray(crossings, 0, crossingCount);
             nextY++;
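             // For each active edge, record a crossing (x in the high bits,
             // orientation in the low bit), advance its x by its slope, and
             // keep crossings/edgePtrs sorted with an insertion sort.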
             for (int i = 0; i < edgeCount; i++) {
                 int ecur = edgePtrs[i];
                 float curx = edges[ecur+CURX];
                 int cross = ((int) curx) << 1;
                 edges[ecur+CURX] = curx + edges[ecur+SLOPE];
                 if (edges[ecur+OR] > 0) {
                     cross |= 1;
                 }
                 int j = i;
                 while (--j >= 0) {
                     int jcross = crossings[j];
                     if (jcross <= cross) {
                         break;
                     }
                     crossings[j+1] = jcross;
                     edgePtrs[j+1] = edgePtrs[j];
                 }
                 crossings[j+1] = cross;
                 edgePtrs[j+1] = ecur;
             }
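             // Retire edges that end at or before the new current scanline;
             // compact the survivors in place.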
             int newCount = 0;
             for (int i = 0; i < edgeCount; i++) {
                 int ecur = edgePtrs[i];
                 if (edges[ecur+YMAX] > nextY) {
                     edgePtrs[newCount++] = ecur;
                 }
             }
             edgeCount = newCount;
             return crossingCount;
         }

This allowed me to:

- delete a lot of the bucket helper functions.
- get rid of the back pointers.
- pare an edge down to 5 values (YMAX, CURX, OR, SLOPE, and NEXT); see the sketch just below.
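
For concreteness, one possible encoding of those five slots (the particular offsets and the NULL marker below are just an assumption for illustration - nothing above pins them down) would be:

     // Illustrative layout only - the offsets and the NULL value are assumptions.
     private static final int YMAX  = 0;  // y at which the edge stops producing crossings
     private static final int CURX  = 1;  // x at the edge's current scanline
     private static final int OR    = 2;  // orientation flag (> 0 for one winding direction)
     private static final int SLOPE = 3;  // dx/dy
     private static final int NEXT  = 4;  // index of the next edge in the same bucket
     private static final int SIZEOF_EDGE = 5;
     private static final int NULL  = -1; // end-of-chain marker for the bucket lists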

I also used the following addLine() method:

     private void addLine(float x1, float y1, float x2, float y2, int or) {
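         // First integer scanline crossed by this edge, clamped to boundsMinY.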
         final int firstCrossing = (int)Math.max(Math.ceil(y1), boundsMinY);
         if (firstCrossing >= boundsMaxY) {
             return;
         }
         final int ptr = numEdges * SIZEOF_EDGE;
         final float slope = (x2 - x1) / (y2 - y1);
         edges = LilaHelpers.widenArray(edges, ptr, SIZEOF_EDGE);
         numEdges++;
         edges[ptr+OR] = or;
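         // x coordinate of the edge at that first crossing scanline.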
         edges[ptr+CURX] = x1 + (firstCrossing - y1) * slope;
         edges[ptr+SLOPE] = slope;
         edges[ptr+YMAX] = y2;
         final int bucketIdx = firstCrossing - boundsMinY;
         addEdgeToBucket(ptr, bucketIdx);
     }
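
Neither LilaHelpers.widenArray nor addEdgeToBucket is spelled out above, so for completeness here is roughly the shape I have in mind for each - a grow-on-demand copy and a prepend onto the bucket's singly linked chain. Treat these as sketches of the contract rather than the exact helpers:

     // Assumed contract: return an array able to hold at least
     // cursize + numToAdd elements, preserving the first cursize entries.
     static float[] widenArray(final float[] in, final int cursize, final int numToAdd) {
         if (in.length >= cursize + numToAdd) {
             return in;
         }
         return java.util.Arrays.copyOf(in, 2 * (cursize + numToAdd));
     }

     // Same idea for the int[] overload used for edgePtrs and crossings.
     static int[] widenArray(final int[] in, final int cursize, final int numToAdd) {
         if (in.length >= cursize + numToAdd) {
             return in;
         }
         return java.util.Arrays.copyOf(in, 2 * (cursize + numToAdd));
     }

     // Prepend the new edge to the singly linked chain for its scanline bucket;
     // edgeBuckets[bucketIdx] holds the index of the first edge in that bucket
     // (NULL when the bucket is empty).
     private void addEdgeToBucket(final int ptr, final int bucketIdx) {
         edges[ptr+NEXT] = edgeBuckets[bucketIdx];
         edgeBuckets[bucketIdx] = ptr;
     }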

How does that fare for you?

			...jim

On 11/2/2010 4:10 PM, Denis Lila wrote:
> Hi Jim.
>
> I implemented a middle ground between what I sent yesterday and
> what we have now: using the array of linked lists only to replace
> the quicksort (I think this was your original suggestion).
>
> Unfortunately, this didn't work out (at least according to the
> benchmarks). Curves were no different than what we used to have,
> while the performance on lines (especially horizontal ones) decreased
> significantly.
>
> It's not obvious to me why this happened, so I think now I will put
> this type of optimization aside and convert to JNI, where profiling
> will be easier (for me - I haven't been able to make OProfile work
> for java yet).
>
> Regards,
> Denis.
>
> ----- "Jim Graham"<james.graham at oracle.com>  wrote:
>
>> Hi Denis,
>>
>> A generic suggestion - make all of your classes final - that gives the
>> compiler the maximum flexibility to inline any methods you write.
>>
>> With respect to the algorithm choices:
>>
>> I think the key is that the X sorting rarely has any work to do.  The
>> first test of "does this edge need to be swapped with the next lower
>> edge" is probably 99.999% guaranteed to be false.  Thus, trying to
>> optimize anything else to simplify swapping is likely a step in the
>> wrong direction.
>>
>> The casting may be hurting a bit more, and it is hurting on every
>> access to an edge.
>>
>> I'm guessing the best model to use would be to write the code to assume
>> no swapping is necessary at all.  Get that code as simple and as fast as
>> can be.  Then add as little perturbation of that code to manage swapping
>> when it is necessary, and that will likely be the optimal
>> implementation.  I think you could probably even do some benchmarking on
>> a path that is carefully tested to process lots of edges without any X
>> sorting and get that as fast as you can without any swap code, and then
>> add the swap code that impacts the performance of that operation as
>> little as possible.  The key is that the swap code has as little impact
>> on the performance of the case that never needs any swapping at all
>> first and foremost - then make swapping faster within that
>> constraint...
>>
>> 			...jim
>>
>> On 11/1/10 3:13 PM, Denis Lila wrote:
>>> Hi Jim.
>>>
>>> I implemented your bucket sort idea. I'm not just using the buckets
>>> to remove the y-sort. I use them in the iteration through the scanlines
>>> too. What happens is that on any iteration, the active list is the
>>> doubly linked list buckets[nextY-boundsMinY]. I did this because I thought
>>> less memory would need to be moved around compared to when we just kept
>>> the active list "pointers" in an array. For example, with doubly linked
>>> lists we can implement insertion sort with O(1) writes. With arrays we
>>> have to use O(n) writes. This also allows us to get rid of the edgePtrs
>>> array.
>>>
>>> I ran some benchmarks, and unfortunately I was wrong, somewhere. All the tests
>>> are at least 10% slower than the insertion sort version of what we have now.
>>> I can't immediately see why this is. The only thing I can think of is that
>>> there are a lot of float -> int casts, but then again, I don't know how
>>> slow this operation is. It would be good if it's because of the casts, because
>>> it would no longer be an issue when we convert to native.
>>>
>>> I also made an unrelated change: I now find the orientation and update x0,y0
>>> much earlier than we used to. The way I was doing it before was silly.
>>>
>>> Regards,
>>> Denis.
>>>
>>> ----- "Jim Graham"<james.graham at oracle.com>   wrote:
>>>
>>>> Hi Denis,
>>>>
>>>> Good news!
>>>>
>>>> On 10/28/2010 3:27 PM, Denis Lila wrote:
>>>>>> If we moved to a Curve class or some other way to
>>>>>> consolidate the 3 lists (may be easier in native code), this might win
>>>>>> in more cases...
>>>>>
>>>>> Does that mean you no longer think we should flatten every curve as soon
>>>>> as we get it?
>>>>
>>>> No, I was just discussing the feasibility of that one suggestion in the
>>>> context of all of the possibilities of whether or not we took the other
>>>> choices.  If you think that flattening will pay off then we don't have
>>>> to worry about the 3 lists.  It was just that when I hacked it in, I
>>>> still had your 3 lists and so the pros and cons of horizontal edge
>>>> sorting were relative to that version of the renderer...
>>>>
>>>> 			...jim


