[OpenJDK 2D-Dev] [9] request for review: 8087201: OGL: rendering of lcd text is slow
Sergey Bylokhov
Sergey.Bylokhov at oracle.com
Thu Jul 9 18:24:11 UTC 2015
SwingMark also shows a twofold improvement on retina.
On 25.06.15 13:40, Andrew Brygin wrote:
> Hello Sergey,
>
> 24/06/15 22:45, Sergey Bylokhov wrote:
>> Hi, Andrew.
>> Thanks for this report. As far as I understand, in the retina case
>> lcd text is drawn faster after the fix than aa text was before the
>> fix, which means that we will not get any new regressions. So the fix
>> looks fine.
>>
>> But on non-retina our results are still not so good; lcd text is slow:
>> 485 (was 16.4) vs. 16508, so the window for optimizations still
>> exists.
> I agree that there is room for further optimizations.
>
> However, I do not think that it is possible to achieve the same
> level of performance for lcd rendering as for aa rendering,
> because of the more complex nature of lcd text rendering.
>
> Thanks,
> Andrew
>> global.dest=VolatileImg(Opaque),text.opts.font.fsize=6.0,text.opts.graphics.textaa=LCD_HRGB:
>> 9-8087201-v00: 485.2052560 (var=0.57%) (2955.82%)
>> **|*********************************************************
>> **|*********************************************************
>> **|*********************************************************
>> global.dest=VolatileImg(Opaque),text.opts.font.fsize=6.0,text.opts.graphics.textaa=On:
>> 9-8087201-v00: 16508.76580 (var=0.66%) (99.69%)
>> ************************************************************|
>> ************************************************************|
>> *********************************************************** |
>>
>>
>> On 19.06.15 15:54, Andrew Brygin wrote:
>>> Hello Sergey,
>>>
>>> only one part of the fix affects the performance of the AA case: the
>>> cache cell size.
>>> In the retina case, 13pt and 20pt glyphs do not fit the 16x16
>>> cache cells,
>>> so these benchmarks show better performance:
>>> 13pt: 40-80 times faster
>>> 20pt: 7-13 times faster
>>>
>>> 6pt shows the same results, because it fits the cache in any case.
>>>
>>> Full benchmark results:
>>> http://cr.openjdk.java.net/~bae/8087201/9/ogl-lcd-aa.res
>>>
>>> Regarding the suggestion to create a separate method for the
>>> fast-path check: please note that we do this check and calculate
>>> the dstTextureID only once per whole glyph vector, but use the
>>> dstTextureID as an indicator for every glyph. So such a change will
>>> affect performance for sure.
>>> Probably we can hide the 'dstTextureID == 0' condition behind some
>>> sort of macro, like canReadDestinationDirectly() or something
>>> like this.
>>> Are you OK with this?
>>>
>>> Thanks,
>>> Andrew
>>>
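A minimal sketch of the macro idea discussed above, assuming that a non-zero dstTextureID means the destination texture can be read directly. The `Sketch*` types, the `SKETCH_*` names and the capability bit value are hypothetical stand-ins, not the definitions used in OGLTextRenderer; only the shape of the check follows the condition quoted in this thread.

```c
#include <assert.h>

/* Stand-ins for illustration only; the real definitions live in the
 * OGL pipeline headers and may differ. */
#define SKETCH_GL_TEXTURE_2D   0x0DE1      /* value of GL_TEXTURE_2D */
#define SKETCH_CAPS_TEXBARRIER (1u << 0)   /* hypothetical bit value */

typedef struct {
    unsigned int caps;           /* capability bits of the context */
} SketchContext;

typedef struct {
    unsigned int textureID;      /* texture backing the surface */
    unsigned int textureTarget;  /* e.g. SKETCH_GL_TEXTURE_2D */
} SketchSurface;

/* Evaluated once per glyph vector: returns the texture that the lcd
 * shader may read from, or 0 when a copy of the destination is needed. */
static unsigned int
sketch_select_dst_texture(const SketchContext *oglc,
                          const SketchSurface *dstOps)
{
    assert(oglc != 0 && dstOps != 0);
    if ((oglc->caps & SKETCH_CAPS_TEXBARRIER) != 0 &&
        dstOps->textureTarget == SKETCH_GL_TEXTURE_2D)
    {
        return dstOps->textureID;
    }
    return 0;
}

/* Evaluated for every glyph: a plain comparison, so naming the
 * condition adds no per-glyph cost. */
#define SKETCH_CAN_READ_DST_DIRECTLY(dstTextureID) ((dstTextureID) != 0)
```

With this split, the capability check stays outside the per-glyph loop, and the loop body only tests the macro.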
>>> 19/06/15 13:57, Sergey Bylokhov wrote:
>>>> Hi, Andrew.
>>>> Can you additionally provide the bench data about aa (before/after
>>>> the fix) vs. the new lcd?
>>>>
>>>> Probably it will be more obvious if the code in OGLTextRenderer
>>>> (lines 1007-1008)
>>>>     if (OGLC_IS_CAP_PRESENT(oglc, CAPS_EXT_TEXBARRIER) &&
>>>>         dstOps->textureTarget == GL_TEXTURE_2D)
>>>>
>>>> is moved to a separate method, and the check for the
>>>> possibility of a fast blit is made explicit instead of:
>>>>     if (dstTextureID == 0) {
>>>>
>>>> Also, your review request contains useful information like
>>>> fast/slow/read-after-write etc. I think this information can be
>>>> useful as comments in the code.
>>>>
>>>> On 18.06.15 17:39, Andrew Brygin wrote:
>>>>> Hello,
>>>>>
>>>>> could you please review a fix for 8087201?
>>>>>
>>>>> The root of the problem is that we have to supply the content of
>>>>> the destination surface to the lcd shader to compose the lcd glyph
>>>>> correctly.
>>>>> In order to do this, we have to copy a sub-image from the destination
>>>>> buffer to an intermediate texture using the glCopyTexSubImage2D()
>>>>> routine.
>>>>> Unfortunately, this routine is quite slow on the majority of systems,
>>>>> and it
>>>>> dramatically reduces the overall speed of lcd text rendering.
>>>>>
>>>>> The main idea of the fix is to use a texture associated with the
>>>>> destination
>>>>> surface if it exists. In this case we have a chance to completely
>>>>> abandon the
>>>>> data copying. However, we have to avoid read-after-write in order
>>>>> to get
>>>>> correct results in this case. Fortunately, it can be achieved by
>>>>> using the
>>>>> GL_NV_texture_barrier extension:
>>>>>
>>>>> https://www.opengl.org/registry/specs/NV/texture_barrier.txt
>>>>>
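The two paths described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the webrev: `stub_copy_dst_to_texture()` and `stub_texture_barrier()` are hypothetical stand-ins for glCopyTexSubImage2D() and the barrier call from GL_NV_texture_barrier, the point at which the barrier is issued is assumed, and the counters exist only to show how many copies each path performs.

```c
#include <assert.h>

/* Counters used only to make the cost of each path visible. */
static int copyCalls = 0;
static int barrierCalls = 0;

/* Hypothetical stand-in for glCopyTexSubImage2D(): the slow copy of
 * destination pixels into an intermediate texture. */
static void stub_copy_dst_to_texture(void) { copyCalls++; }

/* Hypothetical stand-in for the GL_NV_texture_barrier call. */
static void stub_texture_barrier(void) { barrierCalls++; }

/* Draws glyphCount lcd glyphs. dstTextureID is assumed to be the
 * texture associated with the destination surface, or 0 when no
 * such texture can be used. */
static void sketch_draw_lcd_glyphs(unsigned int dstTextureID, int glyphCount)
{
    int i;
    assert(glyphCount >= 0);
    for (i = 0; i < glyphCount; i++) {
        if (dstTextureID != 0) {
            /* Fast path: the shader samples the destination texture
             * directly; the barrier keeps earlier writes from being
             * read back stale (read-after-write). */
            stub_texture_barrier();
        } else {
            /* Slow path: copy the pixels under the glyph first. */
            stub_copy_dst_to_texture();
        }
        /* ... the lcd shader would blend the glyph here ... */
    }
}
```

In this sketch the fast path never touches the copy routine, which is the saving the fix is aiming for.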
>>>>> Besides this, the suggested fix introduces the following changes in
>>>>> the OGL text renderer:
>>>>>
>>>>> * Separate accelerated caches for LCD and AA glyphs.
>>>>>   We have a single cache which is initialized either for LCD or
>>>>>   for AA glyphs.
>>>>>   If an application mixes these types of font smoothing for some
>>>>>   reason, we get a significant performance degradation.
>>>>>   For example, if we use J2DBench in GUI mode, then the Swing GUI
>>>>>   initializes the accelerated cache for AA, and subsequent
>>>>>   rendering of LCD text always uses the 'no-cache' code path.
>>>>>
>>>>> * Increase the dimension of the glyph cache cell from 16x16 to 32x32.
>>>>>   This change gives a significant performance boost on systems with
>>>>>   retina (because of the average size of rendered glyphs).
>>>>>   However, on systems where the fast path with the destination
>>>>>   texture is not possible for any reason, this change may cause a
>>>>>   performance degradation because of more extensive usage of
>>>>>   glCopyTexSubImage2D.
>>>>>   So we may want a means to configure the cell dimension
>>>>>   depending on system capabilities.
>>>>>
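The two changes listed above can be illustrated with a small sketch. All names here (`SketchMode`, `SketchGlyphCache`, `sketch_get_cache`, `sketch_glyph_fits`) are hypothetical and the glyph sizes in the usage note are made up; the sketch only shows one cache per smoothing mode and the "does the glyph fit a cell" test that decides between the cached and the 'no-cache' path.

```c
#include <assert.h>

typedef enum {
    SKETCH_MODE_AA = 0,
    SKETCH_MODE_LCD = 1,
    SKETCH_MODE_COUNT
} SketchMode;

typedef struct {
    int initialized;
    int cellDim;    /* width and height of one cache cell, in pixels */
} SketchGlyphCache;

/* One cache per smoothing mode, so initializing the AA cache does not
 * push LCD text onto the 'no-cache' path (and vice versa). */
static SketchGlyphCache sketchCaches[SKETCH_MODE_COUNT];

/* Returns the cache for the given mode, initializing it on first use. */
static SketchGlyphCache *sketch_get_cache(SketchMode mode, int cellDim)
{
    SketchGlyphCache *cache;
    assert((int)mode >= 0 && (int)mode < SKETCH_MODE_COUNT);
    assert(cellDim > 0);
    cache = &sketchCaches[mode];
    if (!cache->initialized) {
        cache->initialized = 1;
        cache->cellDim = cellDim;
    }
    return cache;
}

/* Returns 1 if a glyph of the given pixel size fits into one cell;
 * otherwise the caller would have to render it without the cache. */
static int sketch_glyph_fits(const SketchGlyphCache *cache,
                             int width, int height)
{
    return width <= cache->cellDim && height <= cache->cellDim;
}
```

For instance, a hypothetical 20x24 pixel glyph does not fit a 16x16 cell but does fit a 32x32 one, while an 8x10 glyph fits both.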
>>>>> Performance results overview:
>>>>> * MBP with Intel Iris (retina, texture barrier is available):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/mbp-intel-iris.txt
>>>>>
>>>>> * iMac with AMD HD6750M (no retina, texture barrier is available):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/imac-amd-hd6750m.txt
>>>>>
>>>>> * MBP with OSX10.8, NV GF9600M (no retina, no texture barrier):
>>>>> http://cr.openjdk.java.net/~bae/8087201/9/mbp-10.8-NVGF9600M.txt
>>>>>
>>>>> Please take a look.
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>
>>>>
>>>
>>
>>
>> --
>> Best regards, Sergey.
>
--
Best regards, Sergey.