[OpenJDK Rasterizer] RFR: Marlin renderer #2

Wed Jun 10 21:00:07 UTC 2015

Jim,

Here is the new webrev:
http://cr.openjdk.java.net/~lbourges/marlin/marlin-s2.2/

CollinearSimplifier:
>
>>
>>     Looks good.  I'm still curious why you didn't use an enum - is the
>>     code more efficient just dealing with basic numbers rather than enum
>>     instances?
>>
>>
>> I did not compare their performance but I prefer using switch(int) vs
>> switch(enum) as I would expect it faster. I could try and evaluate the
>> performance impact.
>>
>
> switch (enum) uses a lookup table.  All enum values are internally
> assigned an ordinal so it is identical to using a switch on integers (other
> than the fact that it has one method call to get the ordinal of the test
> value, so it's really identical to "switch (foo.getOrdinal())".
>

Fixed: I use again an enum (a bit slower ~ 3%).

>  I did understand but there is a subtle difference: 0 is an integer so it
>> is handle by the previous case:
>>
>
> Ah, I see it now.  Perhaps it would be worth adding a line to the break
> down comment in case anyone else didn't get it:
>
>     // 0 handled above as an integer
>     // sign: 1 for ...
>     // add : 0 for ...
>

Fixed.

>  During last week end, I implemented another ceil / floor alternative
>> implementations that are definitely faster and exact.
>>
>> To conclude, do you accept the new floor / ceil alternatives ?
>>
>
> Perhaps we should move them to a follow-on fix?  Just to get the array
> stuff in sooner rather than later...
>

Ok, I keep these improvements for  the next step.

>
>      Ah, I see it now.  On the other hand, now you end up with the clunky
>>     "Yminus1 + 1" later on.  Since this code is executed once per
>>     rendering cycle, I don't see any performance concerns here to
>>     require loading the fields into local variables.  Just using the raw
>>     fields for the 2 simple calculations may be clearer on the other hand.
>>
>>
>> I wanted to factorize the substraction and avoid repeating it 2 times. I
>> agree it is tricky.
>>
>
> Because a nanosecond on an operation that takes several milliseconds is
> worth making the code obscure?  ;)
>
> Also, factoring out the subtraction has a side affect of requiring you to
> insert a new "+1" that didn't use to be there.
>
> I appreciate the attention to detail on some of these calculations, but I
> think there are a lot of opportunities to simplify the algorithms to make
> even bigger impacts.  If a calculation was being factored out of a few
> hundred iterations then it might be worth some level of obscurity in the
> code, but saving 1 instruction per rendering sequence doesn't seem worth
> any amount of disruption to the setup code.
>

Fixed but I kept the local variable as it has a measurable impact (760ms vs
800ms) with a medium complex map (135 000 shapes).

Thanks, I am maybe too much focused on performance: even very minor code
change must be revalidated by few benchmark runs as it is very sensitive.
During these 2 years working on Marlin, I accumulated lots of minor gains
(few percents) that finally made an important gain (~20% to 30%)

I hope this second round is now OK to go in.

What do you like me to work on in the step #3 ?
- DDA ?
- new Array Cache hierarchy ?
- Stroker improvements to reduce collinear segments (caps, joins) ?

Any Other ideas ?

What about performance testing by the SQE team or anybody else ?

Regards,
Laurent
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/graphics-rasterizer-dev/attachments/20150610/0cf67970/attachment-0001.html>