RFR: 8270265: LineBreakMeasurer calculates incorrect line breaks with zero-width characters [v3]

Daniel Gredler dgredler at openjdk.org
Fri Feb 21 21:13:16 UTC 2025


> When a string contains zero-width characters, `LineBreakMeasurer` calculates line breaks incorrectly.
> 
> The root cause appears to be that `LineBreakMeasurer` eventually calls into `StandardGlyphVector.getGlyphInfo()`, which derives the glyph advances from the glyph IDs. However, HarfBuzz's default treatment of zero-width characters is to provide the glyph ID of the space character (`U+0020`) combined with an artificial zero advance (not the font's space glyph advance). Unaware of HarfBuzz's sleight of hand, `StandardGlyphVector.getGlyphInfo()` retrieves the actual advances of the space glyph (since that was the glyph ID returned) and provides these back up the call chain to `LineBreakMeasurer` et al.
> 
> I think the correct fix is to use `hb_buffer_set_invisible_glyph` to register `0xFFFF` as the invisible glyph ID with HarfBuzz (matching `CharToGlyphMapper.INVISIBLE_GLYPH_ID`).
> 
> I haven't seen any unwanted side effects, but there is some risk, since this changes the global HarfBuzz configuration.
> 
> For more information on HarfBuzz's behavior in this area, see: https://harfbuzz.github.io/setting-buffer-properties.html
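
The mismatch described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: the class `ZeroWidthAdvanceSketch`, its glyph IDs (other than `0xFFFF`) and its advances are hypothetical, and it calls neither HarfBuzz nor `StandardGlyphVector`. It only shows why re-deriving advances from glyph IDs goes wrong when zero-width characters are reported with the space glyph ID, and why a dedicated invisible glyph ID avoids the problem.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types or metrics come from the JDK or HarfBuzz sources. */
final class ZeroWidthAdvanceSketch {

    static final int SPACE_GLYPH_ID = 3;          // hypothetical ID of the U+0020 glyph
    static final int LETTER_GLYPH_ID = 100;       // hypothetical ID shared by all other glyphs
    static final int INVISIBLE_GLYPH_ID = 0xFFFF; // value named in the description above
    static final float SPACE_ADVANCE = 5.0f;      // hypothetical font metrics
    static final float LETTER_ADVANCE = 10.0f;

    /** One shaper output entry: a glyph ID plus the advance the shaper assigned to it. */
    static final class ShapedGlyph {
        final int glyphId;
        final float advance;

        ShapedGlyph(int glyphId, float advance) {
            this.glyphId = glyphId;
            this.advance = advance;
        }
    }

    /** Small, non-exhaustive set of zero-width characters used for the illustration. */
    static boolean isZeroWidth(char c) {
        return c == '\u200B' || c == '\u200C' || c == '\u200D';
    }

    /** Simulated shaper: zero-width characters get the given glyph ID and a zero advance. */
    static List<ShapedGlyph> shape(String text, int zeroWidthGlyphId) {
        List<ShapedGlyph> glyphs = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isZeroWidth(c)) {
                glyphs.add(new ShapedGlyph(zeroWidthGlyphId, 0f));
            } else if (c == ' ') {
                glyphs.add(new ShapedGlyph(SPACE_GLYPH_ID, SPACE_ADVANCE));
            } else {
                glyphs.add(new ShapedGlyph(LETTER_GLYPH_ID, LETTER_ADVANCE));
            }
        }
        return glyphs;
    }

    /** Width using the advances reported by the simulated shaper. */
    static float widthFromShaper(List<ShapedGlyph> glyphs) {
        float width = 0f;
        for (ShapedGlyph g : glyphs) {
            width += g.advance;
        }
        return width;
    }

    /** Width re-derived from glyph IDs alone, ignoring the shaper's advances. */
    static float widthFromGlyphIds(List<ShapedGlyph> glyphs) {
        float width = 0f;
        for (ShapedGlyph g : glyphs) {
            if (g.glyphId == INVISIBLE_GLYPH_ID) {
                continue; // the invisible glyph contributes no advance
            }
            width += (g.glyphId == SPACE_GLYPH_ID) ? SPACE_ADVANCE : LETTER_ADVANCE;
        }
        return width;
    }

    public static void main(String[] args) {
        String text = "ab\u200B\u200Bcd";
        List<ShapedGlyph> byDefault = shape(text, SPACE_GLYPH_ID);
        List<ShapedGlyph> withFix = shape(text, INVISIBLE_GLYPH_ID);
        System.out.println("shaper width:            " + widthFromShaper(byDefault));
        System.out.println("glyph-ID width, default: " + widthFromGlyphIds(byDefault));
        System.out.println("glyph-ID width, 0xFFFF:  " + widthFromGlyphIds(withFix));
    }
}
```

In this model the two zero-width characters each add one spurious space advance when the width is re-derived from the space glyph ID, so a line-breaking calculation built on that width would see a wider line than the shaper produced. With the invisible glyph ID, both calculations agree.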

Daniel Gredler has updated the pull request incrementally with one additional commit since the last revision:

  Update copyright year

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/23603/files
  - new: https://git.openjdk.org/jdk/pull/23603/files/16143307..b9b707ae

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=23603&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23603&range=01-02

  Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/23603.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23603/head:pull/23603

PR: https://git.openjdk.org/jdk/pull/23603

