RFR: 8285308: Win: Japanese logical fonts are drawn with wrong size [v2]

Wed May 4 15:39:28 UTC 2022

On Tue, 3 May 2022 21:42:35 GMT, Phil Race <prr at openjdk.org> wrote:

>> Toshio Nakamura has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Moved the fix to WFontConfiguration
>
> It looks to me as if, when we specify a Latin font as the text component font, some Windows fallback behaviour insists
> on a minimum size for the Japanese fallback font.
> And the way to avoid that is to specify a locale (Japanese) font instead, which is what used to happen.
> 
> -------
> Naoto suggested :
> -sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol
> +sequence.allfonts.UTF-8.ja=japanese,alphabetic,dingbats,symbol
> 
> This didn't work for me because it isn't picking up that line anyway.
> 
> So what I see is that MS Mincho isn't even in the list of names it is considering !
> Because we are finding this :-
> sequence.allfonts=alphabetic/default,dingbats,symbol
> 
> I see Toshio says he saw the UTF-8 entry being used, but I don't see that.
> So I need to understand why not the UTF-8 entry - note that I have set my system locale to JA now.
> The consequence of this is that the fallback sequence is what provides Japanese and
> so it is from the Chinese MingLiu-ExtB font which I do have installed.
> 
> 
> Toshio is right that what matters here for the native text component is what is picked up in
> the logic around WFontConfiguration.getTextComponentFontName()
> 
> The helper method for getTextComponentFontName() is findFontWithCharset()
> 
> That has a bit of a questionable behaviour in that it returns the *last* font in the
> list that matches the charset it wants.
> So *hypothetically* if we had the charset as DEFAULT_CHARSET
> MS Mincho,DEFAULT_CHARSET
> Times New Roman,DEFAULT_CHARSET
> and  if we had
> Times New Roman,DEFAULT_CHARSET
> MS Mincho,SHIFTJIS_CHARSET
> 
> then in both cases we'd get Times and still have the problem.
> The latter case seems to actually happen - and so even though the font is there, we ignore it.
> Clearly what we want is the "locale" font, and we are using the encoding to identify any one
> that matches, but this breaks down in UTF-8.
> Toshio pointed out that code in WFontConfiguration initTables() basically says
> if we found a font tagged as "japanese" then its subsetCharMap entry is SHIFTJIS_CHARSET
> and this used to work because this also mapped windows-31j to SHIFTJIS_CHARSET.
> But what do you map UTF-8 to? The current code maps it to DEFAULT_CHARSET.
> There needs to be a different way of doing this for UTF-8 locales.
> 
> So this fix is a "band aid" on the problem that in the UTF-8 locale we don't seem to be picking
> up the entry we should.
> If Toshio confirms for SURE he's seeing the UTF-8 one picked up it would be a useful data point.
> I still need to debug why I am not getting it.
> 
> UPDATE: pilot error on my part - I set lang to jp, not ja.
> 
> So back to just the encoding case.
> 
> Regarding what Toshio pointed out that we can't have both
> sequence.allfonts.UTF-8.zh.CN=alphabetic,chinese-ms936,dingbats,symbol,chinese-ms936-extb
> sequence.allfonts.UTF-8.zh.CN=alphabetic,chinese-gb18030,dingbats,symbol,chinese-gb18030-extb
> I think that's just a fact. Once you choose UTF-8 you have to decide which of these you want.

Hi @prrace 

Yes, my system also picked up the "UTF-8.ja" line. "ja" can be specified through the locale settings, for example "-Duser.language=ja".
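
To illustrate why "jp" and "ja" behave differently, here is a minimal sketch. It assumes the lookup key has the form seen in the quoted fontconfig lines (sequence.allfonts.<encoding>.<language>[.<country>]); the class and the helper sequenceKey() are hypothetical and are not the actual JDK lookup code.

```java
import java.util.Locale;

class LocaleKeySketch {
    // Hypothetical helper: builds a key shaped like the fontconfig entries
    // quoted in this thread. The real lookup in the JDK may differ.
    static String sequenceKey(String encoding, Locale locale) {
        StringBuilder key = new StringBuilder("sequence.allfonts.").append(encoding);
        if (!locale.getLanguage().isEmpty()) {
            key.append('.').append(locale.getLanguage());
        }
        if (!locale.getCountry().isEmpty()) {
            key.append('.').append(locale.getCountry());
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // "ja" is the language code for Japanese; "JP" is the country code for Japan.
        System.out.println(sequenceKey("UTF-8", Locale.forLanguageTag("ja")));
        System.out.println(sequenceKey("UTF-8", Locale.forLanguageTag("zh-CN")));
        // A language of "jp" yields a key that no ".ja" entry can match.
        System.out.println(sequenceKey("UTF-8", Locale.forLanguageTag("jp")));
        // The language the running JVM actually uses (affected by -Duser.language).
        System.out.println(Locale.getDefault().getLanguage());
    }
}
```

Running this with and without "-Duser.language=ja" shows which language code the JVM would contribute to such a key.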

However, I was not able to reproduce the wrong size issue on a system whose primary language had been changed from English to Japanese. There may be differences between a native Japanese Windows installation and an English Windows installation whose primary language was changed to Japanese.

Unicode (UTF-8) is language independent, so we need to use locale data instead.
I created a trial patch that uses locale data. If you prefer this approach, I'll also adjust the fontconfig file and test in the environments I can prepare.

> sequence.allfonts=alphabetic/default,dingbats,symbol

"alphabetic/default" is assigned to "DEFAULT_CHARSET", but it's only used on this line.

> sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol

"alphabetic" is assigned to "ANSI_CHARSET". So, if we look for "DEFAULT_CHARSET", nothing matches, and the first font in the sequence is used. (WFontConfiguration.java, l.165)
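
The selection behaviour discussed in this thread can be sketched roughly as follows. This is a simplified stand-in, not the actual WFontConfiguration code: the class name, the method signature, and the String[][] representation of {font name, charset} pairs are assumptions made for illustration. It keeps the last font whose charset matches and falls back to the first font when nothing matches.

```java
class FontCharsetSketch {
    // Simplified, hypothetical model of the lookup described above.
    // Each entry is {fontName, charsetName}, in sequence order.
    static String findFontWithCharset(String[][] sequence, String wantedCharset) {
        if (sequence.length == 0) {
            return null;
        }
        String match = null;
        for (String[] entry : sequence) {
            if (entry[1].equals(wantedCharset)) {
                match = entry[0]; // later matches overwrite earlier ones
            }
        }
        // No match at all: fall back to the first font in the sequence.
        return match != null ? match : sequence[0][0];
    }

    public static void main(String[] args) {
        // The second hypothetical case from the quoted message.
        String[][] mixed = {
            {"Times New Roman", "DEFAULT_CHARSET"},
            {"MS Mincho", "SHIFTJIS_CHARSET"},
        };
        // The UTF-8.ja case: alphabetic is ANSI_CHARSET, japanese is SHIFTJIS_CHARSET.
        String[][] utf8Ja = {
            {"Times New Roman", "ANSI_CHARSET"},
            {"MS Mincho", "SHIFTJIS_CHARSET"},
        };
        System.out.println(findFontWithCharset(mixed, "DEFAULT_CHARSET"));   // Times New Roman
        System.out.println(findFontWithCharset(utf8Ja, "DEFAULT_CHARSET"));  // Times New Roman (fallback)
        System.out.println(findFontWithCharset(utf8Ja, "SHIFTJIS_CHARSET")); // MS Mincho
    }
}
```

In this model the Japanese font is only selected when the wanted charset is SHIFTJIS_CHARSET, which is why mapping UTF-8 to DEFAULT_CHARSET ends up choosing the Latin font.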

-------------

PR: https://git.openjdk.java.net/jdk/pull/8329