8054203: add regression tests for JDK vs ICU layout

Fri Aug 15 22:23:42 UTC 2014

On 08/15/2014 03:20 PM, Doug Felt wrote:
> Is there some official way to say 'I approve?' according to the
> official process?  Here at google we say LGTM.
>
> LGTM. :-)

I think we should debate LGTM vs SGTM in the context of Accessibility
and cultural standards. Just kidding.

I think that works. I'll push the code!

>
> I'm assuming the internal change involves also switching to HarfBuzz.
> I believe that will break this test in multiple ways, and I don't
> think it can be resolved by changing the implementation-- the test
> will have to change. Fuzzing the test has issues, as we discussed,
> though it can be useful as a start.  So, that's the motivation for my
> comments.
Good.
My first step is to just try switching out ICU for
ICU-linked-another-way. So that should pass the same test.
When HB breaks this test the results will be informative. At least it is
a test that you are using HardBuzz.

> If after some experimentation the other folks agree that it is not
> possible to align HarfBuzz's output with ICU's exactly, then we can
> discuss how close we need to get. My preference would be to allow for
> more differences than are theoretically possible if we can make the
> code simpler (e.g. use HB's output more directly, rather than pushing
> nin-rendering filluer glyphs into it) and more useful (make it easier
> to identify grapheme cluster boundaries for folks who want to fiddle
> directly with GlyphVectors rather than rely on TextLayout for
> everything).  But that's in the future, and contingent on accepting
> that some differences are unavoidable. Maybe they're not.
>
I think this test could give us a tool to start comparing those differences.

By the way, I've created a filter to look at tickets tagged 'harfbuzz'. 
It's here:
    https://bugs.openjdk.java.net/issues/?filter=21041

I'll file some more to cover things we've discussed. Please feel free to
fill in.

-s

>
> On Fri, Aug 15, 2014 at 3:10 PM, Steven R. Loomis <srl at icu-project.org
> <mailto:srl at icu-project.org>> wrote:
>
>     On 08/15/2014 02:32 PM, Doug Felt wrote:
>     > On Fri, Aug 15, 2014 at 1:33 PM, Steven R. Loomis
>     <steven.loomis at oracle.com <mailto:steven.loomis at oracle.com>>
>     > wrote:
>     >
>     >> On 08/15/2014 11:41 AM, Doug Felt wrote:
>     >>> Thanks for getting the ball rolling, Steven!
>     >> Welcome!
>     >>
>     >>> I'm not sure what the format is for code reviews, so I'll just
>     send
>     >>> this email with general comments and maybe someone can tell me the
>     >>> right way to do it.  These comments are a bit more high level
>     and most
>     >>> don't focus on this code in particular, anyway.
>     >>>
>     >>> 1) it would be nice if we used some more structured web-based
>     tool to
>     >>> review the code. But I don't know how hard it is to set one up.
>     >> There's always rietveld :)
>     >> I was just trying to follow the openjdk playbook.
>     >>
>     > Yeah, rietveld was what I thought of, though I couldn't remember
>     the name,
>     > and there might be other open-source alternatives available
>     these days.  I'm
>     > not sure if what android uses is open sourced, but that could be too
>     > involved
>     > for what we need. I've just spent so much time doing code
>     reviews with
>     > these kinds of tools, I'm kind of spoiled for anything else.
>     Not sure. Maybe others more current on OpenJDK processes can chime in
>     here. I don't know what the options are for openjdk.
>
>     > [snip some great discussion]
>     >>> 4) Harfbuzz uses FreeType to get kerning values, while ICU uses
>     >>> kerning values directly from the kerning table in the font. 
>     Freetype
>     >>> applies heuristics to adjust the kerning values for smaller point
>     >>> sizes (like, under 25 pt), and rounds the scaled kerning values to
>     >>> design units (I think, might be an option). This means ICU and
>     >>> HarfBuzz kern differently, and this changes the advances. This
>     makes
>     >>> it difficult to use images as a regression tool.
>     >>> I think it will be difficult to get full fidelity to the glyph
>     >>> positions. I expect, since most clients (on Linux) use FreeType
>     >>> kerning values directly, that we might be better off just
>     going with
>     >>> FreeType's kerning values. But we probably want to see what other
>     >>> platforms do.
>     >> Yes, and as you of all people should know :), one of the places
>     where
>     >> ICU/ICUJDK diverges is in the kerning table management.
>     >>
>     >> Perhaps it is a good case for turning kerning *off* for some
>     types of
>     >> tests, and using it with a lot of fuzzing when it is on?
>     >>
>     > My gut feeling is that fuzzing should be particular to what's
>     being measured
>     > and what we expect, rather than a generic 'loosening of the
>     acceptable
>     > bounds'.
>     > It's pretty easy to loosen the check to a point where it's not
>     really
>     > testing
>     > enough to catch real errors.  Something more tailored to the
>     expected output
>     > is more likely to catch a real issue.  On the other hand, if one
>     runs
>     > 1000's of
>     > tests than anything to limit what needs direct investigation helps.
>     Right. Changes in how calculation is done might lead to rounding
>     differences/etc.
>
>     Behdad, what have you been doing for comparisons w/ Uniscribe?
>
>
>     >
>     >
>     >>> 5) HarfBuzz does its computations in integer device units, with
>     >>> rounding to 16.16 or 24.8 or 26.6 values (though iculehb does
>     some in
>     >>> floating point). ICU makes more use of native float units. 
>     I've not
>     >>> been able to track down what exactly happens, but it does seem
>     that
>     >>> advances might differ between ICU and HB even if kerning is not
>     >>> applied. The main place I've seen suggestions of this is with
>     scaling
>     >>> based on common fractions (e.g. 1/10, etc.), native float
>     units can
>     >>> represent common fractions much better than fixed point
>     power-of-two
>     >>> units can, and small differences can accumulate over the
>     course of a
>     >>> line of text. Occasionally this trips over a pixel and glyph
>     images
>     >>> change.
>     >>>
>     >>> So I guess I think we need to first figure out what degree of
>     >>> compatibility is achievable, and what we want, and then design our
>     >>> regression/metrics tests around that.
>     >> OK.
>     >>
>     >> Maybe I should rephrase this particular ticket - it is for very
>     basic
>     >> compatibility, to first verify if  embedded-ICU vs external-ICU is
>     >> compatible, and then secondly to compare embedded-ICU with
>     >> external-ICU-really-HarfBuzz.
>     >>
>     > Sure, for different ways to access ICU this is of course fine,
>     and that's
>     > a perfectly good place to start.  I was just focusing on the
>     switch to HB.
>     I think you are looking at perhaps a more agressive internal
>     change than
>     I was trying to do at first.  Plus I am over optimistic about the
>     kinds
>     of changes needed.
>
>     OK.  So I'll open another ticket for follow on conformance testing.
>
>     Will you sign off on this test as approved for now then? :)
>
>     -s
>
>     --
>
>     IBMer but all opinions are mine.
>     https://www.ohloh.net/accounts/srl295 // fingerprint @
>     https://ssl.icu-project.org/trac/wiki/Srl
>
>
>
>
>
> -- 
>
> Doug Felt | 	 Software Engineer | 	 dougfelt at google.com
> <https://mail.google.com/mail/?view=cm&fs=1&tf=1&to=dougfelt@google.com> |
> 	 1-650-253-2089
>
>

-- 

IBMer but all opinions are mine.
https://www.ohloh.net/accounts/srl295 // fingerprint @ https://ssl.icu-project.org/trac/wiki/Srl