[OpenJDK 2D-Dev] [9] request for review: 8087201: OGL: rendering of lcd text is slow

Sergey Bylokhov Sergey.Bylokhov at oracle.com
Thu Jul 9 18:24:11 UTC 2015


SwingMark also shows a twofold improvement on retina.

On 25.06.15 13:40, Andrew Brygin wrote:
> Hello Sergey,
>
> 24/06/15 22:45, Sergey Bylokhov wrote:
>> Hi, Andrew.
>> Thanks for this report. As far as I understand, on retina the lcd 
>> text is drawn faster after the fix than aa was before the fix, 
>> which means that we will not get any new regressions. So the fix 
>> looks fine.
>>
>> But on non-retina our results are still not so good; lcd text is slow: 
>> 485 (was 16.4) vs 16508 for aa, so there is still room for 
>> optimization.
> I agree that there is room for further optimizations.
>
> However, I do not think that it is possible to achieve the same
> level of performance for lcd rendering as for aa rendering,
> because of the more complex nature of lcd text rendering.
>
> Thanks,
> Andrew
>> global.dest=VolatileImg(Opaque),text.opts.font.fsize=6.0,text.opts.graphics.textaa=LCD_HRGB:
>> 9-8087201-v00: 485.2052560 (var=0.57%) (2955.82%)
>> **|*********************************************************
>> **|*********************************************************
>> **|*********************************************************
>> global.dest=VolatileImg(Opaque),text.opts.font.fsize=6.0,text.opts.graphics.textaa=On:
>> 9-8087201-v00: 16508.76580 (var=0.66%) (99.69%)
>> ************************************************************|
>> ************************************************************|
>> *********************************************************** |
>>
>>
>> On 19.06.15 15:54, Andrew Brygin wrote:
>>> Hello Sergey,
>>>
>>>  only one part of the fix affects the performance of the AA case: the
>>>  cache cell size.
>>>  In the case of retina, 13pt and 20pt glyphs do not fit the 16x16
>>>  cache cells, so with the larger cells these benchmarks show better
>>>  performance:
>>>  13pt: 40-80 times faster
>>>  20pt: 7-13 times faster
>>>
>>>  6pt shows the same results, because it fits the cache cell in either case.
>>>
>>>  Full benchmark results:
>>> http://cr.openjdk.java.net/~bae/8087201/9/ogl-lcd-aa.res
>>>
>>>  Regarding the suggestion to create a separate method for the fast
>>>  path check: please note that we do this check and calculate the
>>>  dstTextureID only once per glyph vector, but use the dstTextureID
>>>  as an indicator for every glyph. So such a change would certainly
>>>  affect performance.
>>>  Perhaps we can hide the 'dstTextureID == 0' condition behind some
>>>  sort of macro, like canReadDestinationDirectly() or something
>>>  similar.
>>>  Are you OK with this?
>>>
>>> Thanks,
>>> Andrew
>>>
>>> 19/06/15 13:57, Sergey Bylokhov wrote:
>>>> Hi, Andrew.
>>>> Can you additionally provide the bench data for aa (before/after 
>>>> the fix) vs the new lcd?
>>>>
>>>> Probably it will be more obvious if the code in OGLTextRenderer
>>>> 1007     if (OGLC_IS_CAP_PRESENT(oglc, CAPS_EXT_TEXBARRIER) &&
>>>> 1008         dstOps->textureTarget == GL_TEXTURE_2D)
>>>>
>>>> is moved to a separate method, so that the check for the 
>>>> possibility of a fast blit is explicit, instead of:
>>>> if (dstTextureID == 0) {
>>>>
>>>> Also, your review request contains useful information like 
>>>> fast/slow/read-after-write etc. I think this information would be 
>>>> useful as comments in the code.
>>>>
>>>> On 18.06.15 17:39, Andrew Brygin wrote:
>>>>> Hello,
>>>>>
>>>>>  could you please review a fix for 8087201?
>>>>>
>>>>>  The root of the problem is that we have to supply the content of
>>>>>  the destination surface to the lcd shader to compose the lcd glyph
>>>>>  correctly.
>>>>>  In order to do this, we have to copy a sub-image from the destination
>>>>>  buffer to an intermediate texture using the glCopyTexSubImage2D()
>>>>>  routine.
>>>>>  Unfortunately, this routine is quite slow on the majority of systems,
>>>>>  and it dramatically reduces the overall speed of lcd text rendering.
>>>>>
>>>>>  The main idea of the fix is to use a texture associated with the 
>>>>> destination
>>>>>  surface if it exists. In this case we have a chance to completely 
>>>>> abandon the
>>>>>  data copying. However, we have to avoid read-after-write in order 
>>>>> to get
>>>>>  correct results in this case. Fortunately, it can be achieved by 
>>>>> using the
>>>>>  GL_NV_texture_barrier extension:
>>>>>
>>>>> https://www.opengl.org/registry/specs/NV/texture_barrier.txt
>>>>>
>>>>> Besides this, the suggested fix introduces the following changes in
>>>>> the OGL text renderer:
>>>>>
>>>>> * Separate accelerated caches for LCD and AA glyphs
>>>>>    Currently we have a single cache which is initialized either for
>>>>>    LCD or for AA glyphs.
>>>>>    If an application mixes these types of font smoothing for some
>>>>>    reason, we get a significant performance degradation.
>>>>>    For example, if we use J2DBench in GUI mode, then the swing GUI
>>>>>    initializes the accelerated cache for AA, and subsequent rendering
>>>>>    of LCD text always uses the 'no-cache' code path.
>>>>>
>>>>> * Increase dimension of the glyph cache cell from 16x16 to 32x32.
>>>>>    This change gives a significant performance boost on systems with
>>>>>    retina (because of the average size of rendered glyphs).
>>>>>    However, on systems where the fast path with a destination
>>>>>    texture is not possible for any reason, this change may cause a
>>>>>    performance degradation because of more extensive usage of
>>>>>    glCopyTexSubImage2D.
>>>>>    So we may want a means to configure the cell dimension
>>>>>    depending on system capabilities.
>>>>>
>>>>> Performance results overview:
>>>>> * MBP with Intel Iris (retina, texture barrier is available):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/mbp-intel-iris.txt
>>>>>
>>>>> * iMac with AMD HD6750M (no retina, texture barrier is available):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/imac-amd-hd6750m.txt
>>>>>
>>>>> * MBP with OSX10.8, NV GF9600M (no retina, no texture barrier):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/mbp-10.8-NVGF9600M.txt
>>>>>
>>>>> Please take a look.
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>
>>>>
>>>
>>
>>
>> -- 
>> Best regards, Sergey.
>


-- 
Best regards, Sergey.
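
The macro idea from the quoted message of 19/06/15 could look roughly like the sketch below. This is only an illustration, not the code from the webrev: DestInfo, selectDstTextureID and CAN_READ_DESTINATION_DIRECTLY are hypothetical names, the two boolean fields stand in for the OGLC_IS_CAP_PRESENT(oglc, CAPS_EXT_TEXBARRIER) and dstOps->textureTarget == GL_TEXTURE_2D checks, and it assumes that a zero dstTextureID means the fast path is not available.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real OGLTextRenderer types differ. */
typedef unsigned int TextureId;

typedef struct {
    bool hasTextureBarrier;  /* stands in for the CAPS_EXT_TEXBARRIER check */
    bool destIsTexture2D;    /* stands in for textureTarget == GL_TEXTURE_2D */
    TextureId destTexture;   /* texture backing the destination surface */
} DestInfo;

/*
 * Done once per glyph vector: returns the destination texture when it can
 * be sampled directly, or 0 (assumed here to mean "copy path needed").
 */
static TextureId selectDstTextureID(const DestInfo *d)
{
    return (d->hasTextureBarrier && d->destIsTexture2D) ? d->destTexture : 0;
}

/*
 * Used per glyph: gives the zero check a readable name without adding
 * a function call to the inner loop.
 */
#define CAN_READ_DESTINATION_DIRECTLY(dstTextureID) ((dstTextureID) != 0)
```

With this shape the capability check still runs once per glyph vector, and the per-glyph code only tests the cached id through the macro.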

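The cache cell size effect described in the quoted messages comes down to a simple fit test, sketched below. The function name and the glyph pixel sizes in the usage note are hypothetical and were not measured; only the 16x16 and 32x32 cell dimensions come from the thread.

```c
#include <stdbool.h>

/*
 * A glyph can go into the accelerated cache only if its bitmap fits
 * into one square cache cell; otherwise a non-cached path is needed.
 */
static bool glyphFitsCell(int glyphWidth, int glyphHeight, int cellSize)
{
    return glyphWidth <= cellSize && glyphHeight <= cellSize;
}
```

For instance, a made-up 20x26 pixel glyph would fail glyphFitsCell(20, 26, 16) but pass glyphFitsCell(20, 26, 32), while a small 6x8 glyph passes with either cell size, which matches the pattern of 6pt results staying the same.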