RFR: JDK-8314215 Trailing Spaces before Line Breaks Affect the Center Alignment of Text

John Hendrikx jhendrikx at openjdk.org
Mon Oct 30 19:59:43 UTC 2023


On Mon, 30 Oct 2023 19:00:08 GMT, Andy Goryachev <angorya at openjdk.org> wrote:

>> There are a number of tickets open related to text rendering:
>> 
>> https://bugs.openjdk.org/browse/JDK-8314215
>> 
>> https://bugs.openjdk.org/browse/JDK-8145496
>> 
>> https://bugs.openjdk.org/browse/JDK-8129014
>> 
>> They have in common that wrapped text takes the trailing spaces on each wrapped line into account when calculating where to wrap.  This looks okay for text that is left aligned (as the spaces will be trailing the lines and generally aren't a problem), but looks weird with CENTER and RIGHT alignments.  Even with LEFT alignment there are artifacts of this behavior, where a line like `AAA  BBB  CCC` (note the **double** spaces) gets split up into `AAA  `, `BBB  ` and `CCC`, but if the available space reduces further, it will wrap **too** early because the space is taken into account (i.e. `AAA` may still have fit just fine, but `AAA  ` doesn't, so the engine wraps it to `AA` + `A  ` or something).
>> 
>> The fix for this is twofold: first, the individual lines of text should not include any trailing spaces in their widths; second, the code that is taking the trailing space into account when wrapping should ignore all trailing spaces (currently it is ignoring all but one trailing space).  With these two fixes, the layout in LEFT/CENTER/RIGHT alignments all look great, and there is no more early wrapping due to a space being taken into account while the actual text still would have fit (this is annoying in tight layouts, where a line can be wrapped early even though it looks like it would have fit).
>> 
>> If it were that simple, we'd be done, but there may be another issue here that needs solving: wrapped aligned TextAreas.
>> 
>> TextArea doesn't directly support text alignment (via a setTextAlignment method like Label), but you can change it via CSS.
>> 
>> For Left alignment + wrapping, TextArea will ignore any spaces typed before a line that was wrapped.  In other words, you can type spaces as much as you want, and they won't show up and the cursor won't move.  The spaces are all getting appended to the previous line.  When you cursor through these spaces, the cursor can be rendered out of the control's bounds.  To illustrate, if you have the text `AAA                 BBB CCC`, and the text gets wrapped to `AAA`, `BBB`, `CCC`, typing spaces before `BBB` will not show up.  If you cursor back, the cursor may be outside the control bounds because so many spaces are trailing `AAA`.
>> 
>> The above behavior has NOT changed, is pretty standard for wrapped text controls,...
>
> From https://www.unicode.org/reports/tr14-4/
> 
> 
> 5.6 Break opportunity after characters (A)
> Breaking Spaces
> SPACE (SP) - U+0020
> 
> The space characters are explicit break opportunities, but spaces at the end of a line are not measured for fit. If there is a sequence of space characters, and breaking after any of the space characters would result in the same visible line, the line breaking position after the last space character in the sequence is the locally most optimal one. In other words, since the last character measured for fit is BEFORE the space character, any number of space characters are kept together invisibly on the previous line and the first non-space character starts the next line.
> 
> It is sometimes convenient to use SP, but not the other breaking spaces to override context based behavior of other characters under the "anywhere, except where prohibited" style of line breaking (context analysis style 2).
> 
> EN QUAD - U+2000
> EM QUAD - U+2001
> EN SPACE - U+2002
> EM SPACE - U+2003
> THREE-PER-EM SPACE - U+2004
> FOUR-PER-EM SPACE - U+2005
> SIX-PER-EM SPACE - U+2006
> PUNCTUATION SPACE - U+2008
> THIN SPACE - U+2009
> HAIR SPACE - U+200A
> 
> The preceding list of characters all have a specific width, but behave otherwise as breaking spaces.
> 
> ZERO WIDTH SPACE (ZWSP) - U+200B
> 
> This character does not have width. It is used in a style 2 context analysis to provide additional (invisible) break opportunities.
> 
> IDEOGRAPHIC SPACE - U+3000
> 
> This character has the width of an ideograph but like ZWSP is fully subject to the style 2 context analysis.
> 
> 
> A quick check with (the latest?) MS Word 2208 on Windows shows that EN QUAD U+2000, at least, is treated as a regular character (i.e. it is always "displayed", even if right aligned).

@andy-goryachev-oracle 

> The space characters are explicit break opportunities, but spaces at the end of a line are not measured for fit. If there is a sequence of space characters, and breaking after any of the space characters would result in the same visible line, the line breaking position after the last space character in the sequence is the locally most optimal one. In other words, since the last character measured for fit is BEFORE the space character, any number of space characters are kept together invisibly on the previous line and the first non-space character starts the next line.

That's certainly a good description of how the breaking should be handled.
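
To make the quoted rule concrete, here is a minimal sketch of it. This is not the JavaFX text layout code: the class and method names are hypothetical, every character is assumed to have width 1, only U+0020 is treated as a breaking space, and words longer than the available width are not split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the JavaFX sources.
final class TrailingSpaceWrap {

    // Greedy wrap where the spaces following a word are never measured for fit
    // and always stay on the same line as that word.
    static List<String> wrap(String text, int maxWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        int i = 0;
        int n = text.length();
        while (i < n) {
            int wordStart = i;
            while (i < n && text.charAt(i) != ' ') {
                i++;
            }
            int wordEnd = i;
            // Consume the whole run of spaces after the word.
            while (i < n && text.charAt(i) == ' ') {
                i++;
            }
            int wordLength = wordEnd - wordStart;
            // Spaces already on the line are now interior, so they count;
            // the spaces after the current word do not.
            if (line.length() > 0 && line.length() + wordLength > maxWidth) {
                lines.add(line.toString());
                line.setLength(0);
            }
            line.append(text, wordStart, i);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    // Width used for alignment: the line without its trailing spaces.
    static int visibleWidth(String line) {
        int end = line.length();
        while (end > 0 && line.charAt(end - 1) == ' ') {
            end--;
        }
        return end;
    }
}
```

With this sketch, `wrap("AAA  BBB  CCC", 3)` keeps `AAA` intact on its own line together with its two trailing spaces instead of breaking inside the word, and `visibleWidth` gives the value that CENTER and RIGHT alignment would be computed from.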

-------------

PR Comment: https://git.openjdk.org/jfx/pull/1236#issuecomment-1785940681

