8054203: add regression tests for JDK vs ICU layout
Doug Felt
dougfelt at google.com
Fri Aug 15 18:41:43 UTC 2014
Thanks for getting the ball rolling, Steven!
I'm not sure what the format is for code reviews, so I'll just send this
email with general comments and maybe someone can tell me the right way to
do it. These comments are a bit more high level and most don't focus on
this code in particular, anyway.
1) It would be nice if we used a more structured web-based tool to
review the code. But I don't know how hard it is to set one up.
2) The tool captures the glyph vectors from TextLayout, but expects only
one. If the goal is to test TextLayout, then it probably needs to handle
any text and any TextLayout output. If the goal is to test HarfBuzz, then
constructing a GlyphVector (via layoutGlyphVector) is more direct.
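To make the second option concrete, here is a rough sketch of going through
Font.layoutGlyphVector directly and dumping what comes back. The class name,
the dump format, the logical SERIF font and the sample string are all my own
placeholders, not anything from the webrev.

```java
import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.awt.font.GlyphVector;
import java.awt.geom.Point2D;
import java.util.Locale;

// Hypothetical helper, not part of the code under review.
final class DirectGlyphVectorDump {

    // Pure formatting step, kept separate so it can be checked without a font.
    static String formatGlyph(int glyphCode, int charIndex, double x, double y) {
        return String.format(Locale.ROOT, "glyph=%d char=%d x=%.3f y=%.3f",
                glyphCode, charIndex, x, y);
    }

    // Runs layout on the whole string and returns the single GlyphVector.
    static GlyphVector layout(Font font, String text) {
        // Anti-aliased, fractional metrics; an assumption, pick what the test needs.
        FontRenderContext frc = new FontRenderContext(null, true, true);
        char[] chars = text.toCharArray();
        return font.layoutGlyphVector(frc, chars, 0, chars.length,
                Font.LAYOUT_LEFT_TO_RIGHT);
    }

    // One line per glyph: glyph code, source char index, position.
    static String describe(GlyphVector gv) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < gv.getNumGlyphs(); i++) {
            Point2D p = gv.getGlyphPosition(i);
            sb.append(formatGlyph(gv.getGlyphCode(i), gv.getGlyphCharIndex(i),
                    p.getX(), p.getY())).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Font font = new Font(Font.SERIF, Font.PLAIN, 24);
        System.out.print(describe(layout(font, "office")));
    }
}
```

The output depends on the font and the layout engine in use, which is the
point: the same dump taken from two builds is what would get compared.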
3) I suggest we decide up front to go with HarfBuzz's native glyph/ordering
output. ICU uses filler glyphs and tries to maintain a close relationship
between each glyph and the original character(s) it corresponds to. In
practice, this close relationship is not needed or used, and HarfBuzz does
not provide it. Instead, people are most interested in 'grapheme clusters'
which are groups of glyphs that 1) might be positioned along a path as a
group, and 2) might have tracking space added/subtracted between them (some
folks do this manually, though it would be better to do it through styles).
HarfBuzz provides this information more directly.
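As a sketch of what "clusters" means for a test: HarfBuzz tags each output
glyph with a cluster value, and consecutive glyphs sharing a value belong
together. Assuming those values have already been copied into an int array
(how they get out of HarfBuzz is not shown, and the class and method names are
made up), grouping them is simple:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups glyph indices by their cluster value.
final class ClusterRuns {

    // Returns {startGlyph, endGlyphExclusive} for each run of equal cluster values.
    static List<int[]> group(int[] clusters) {
        List<int[]> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= clusters.length; i++) {
            // Close the current run at the end of input or when the value changes.
            if (i == clusters.length || clusters[i] != clusters[start]) {
                runs.add(new int[] {start, i});
                start = i;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Made-up cluster values for five glyphs.
        for (int[] run : group(new int[] {0, 0, 2, 3, 3})) {
            System.out.println("glyphs " + run[0] + ".." + (run[1] - 1));
        }
    }
}
```

A test built on runs like these would compare cluster boundaries rather than
a per-glyph char index, so filler glyphs would not enter into it.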
This has consequences for regression/conformance tests that expect to match
the glyph output and glyph to char index output. Basically, they can't do
it. Even with the iculehb modifications that introduce filler glyphs to
convert HarfBuzz's output to an approximation of ICU's, the numbering and
positioning of the filler glyphs differs from ICU's. So the tests still
fail.
Rather than try to change HarfBuzz to adopt ICU's output, I think we should
prefer HarfBuzz's output and break exact compatibility w.r.t. filler glyphs
and glyph-to-char mapping.
4) HarfBuzz uses FreeType to get kerning values, while ICU uses kerning
values directly from the kerning table in the font. FreeType applies
heuristics to adjust the kerning values for smaller point sizes (say,
under 25 pt), and rounds the scaled kerning values to design units (I
think; this might be an option). This means ICU and HarfBuzz kern differently,
and this changes the advances. This makes it difficult to use images as a
regression tool.
I think it will be difficult to get full fidelity to the glyph positions. I
expect, since most clients (on Linux) use FreeType kerning values directly,
that we might be better off just going with FreeType's kerning values. But
we probably want to see what other platforms do.
5) HarfBuzz does its computations in integer device units, with rounding to
16.16 or 24.8 or 26.6 values (though iculehb does some in floating point).
ICU makes more use of native float units. I've not been able to track down
what exactly happens, but it does seem that advances might differ between
ICU and HB even if kerning is not applied. The main place I've seen
suggestions of this is with scaling based on common fractions (e.g. 1/10).
Native float units can represent such fractions much better than
fixed-point power-of-two units can, and small differences can accumulate
over the course of a line of text. Occasionally this trips over a pixel and
glyph images change.
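To illustrate the accumulation (and only that; this is not a trace of what
HarfBuzz or ICU actually do), here is a toy comparison of a pen position
advanced in 26.6 fixed point versus floating point. The advance of 6.7 px and
the count of 100 glyphs are arbitrary numbers I picked.

```java
// Toy model of rounding drift; names and numbers are illustrative assumptions.
final class FixedPointDrift {
    static final int FRACTION_BITS = 6;           // the ".6" in 26.6
    static final int ONE = 1 << FRACTION_BITS;    // 64 units per pixel

    // Rounds a pixel value to the nearest 1/64.
    static long toFixed(double pixels) {
        return Math.round(pixels * ONE);
    }

    // Pen position after glyphCount equal advances, each rounded to 26.6 first.
    static double fixedLineWidth(double advance, int glyphCount) {
        long step = toFixed(advance);
        long pen = 0;
        for (int i = 0; i < glyphCount; i++) {
            pen += step;
        }
        return pen / (double) ONE;
    }

    // Same pen position accumulated in floating point with no per-glyph rounding.
    static double floatLineWidth(double advance, int glyphCount) {
        double pen = 0;
        for (int i = 0; i < glyphCount; i++) {
            pen += advance;
        }
        return pen;
    }

    public static void main(String[] args) {
        System.out.println("fixed: " + fixedLineWidth(6.7, 100));
        System.out.println("float: " + floatLineWidth(6.7, 100));
    }
}
```

With these numbers each advance rounds to 429/64, so the fixed-point pen
ends about 0.31 px to the right of the floating-point one after 100 glyphs.
That is enough to move a glyph across a pixel boundary.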
So I think we need to first figure out what degree of compatibility
is achievable, and what we want, and then design our regression/metrics
tests around that.
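One possible shape for such a test, purely as an assumption on my part: once
we settle on an acceptable tolerance, compare glyph x positions numerically
instead of comparing images. The class name and the tolerance values below are
placeholders.

```java
// Hypothetical comparison helper; the tolerance is whatever we agree on.
final class PositionTolerance {

    // Returns the index of the first position that differs by more than
    // tolerance, the shorter length if the arrays differ in length, or -1
    // if everything matches.
    static int firstMismatch(float[] expected, float[] actual, float tolerance) {
        if (expected.length != actual.length) {
            return Math.min(expected.length, actual.length);
        }
        for (int i = 0; i < expected.length; i++) {
            if (Math.abs(expected[i] - actual[i]) > tolerance) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        float[] icu = {0f, 10f, 20f};        // made-up positions
        float[] hb  = {0f, 10.02f, 20.6f};   // made-up positions
        System.out.println(firstMismatch(icu, hb, 0.05f));
    }
}
```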
On Thu, Aug 14, 2014 at 6:36 PM, Steven R. Loomis <steven.loomis at oracle.com>
wrote:
> I have posted some code for review here:
>
> http://cr.openjdk.java.net/~srl/8054203/webrev.00/
>
> (testing out the process)
> Steven
>
> On 08/01/2014 10:38 PM, Steven R. Loomis wrote:
> > https://bugs.openjdk.java.net/browse/JDK-8054203
> >
> > (Phil, others - I didn't see a HarfBuzz component, so I hope this is
> > right- I created a label "harfbuzz")
> > subcomponent 2d, label harfbuzz
> >
> > Anyways, I have code for this already that I used when doing other fixes
> > to the layout code.
> > I'd like to take this one.
> >
> > Note some interesting things:
> >
> > * the generator for the data lives in ICU right now. TBD document how
> > to create it
> >
> > * It's font dependent. We should create it not just against
> > "interesting" fonts that devs won't have access to, but also against
> > fonts that either ship with JDK and/or are easily available (Google Noto
> > comes to mind).
> >
> >
> >
>
>
--
Doug Felt | Software Engineer | dougfelt at google.com | 1-650-253-2089
More information about the harfbuzz-dev mailing list