<AWT Dev> [OpenJDK 2D-Dev] PING: RFR: 8249215: JFrame::setVisible crashed with -Dfile.encoding=UTF-8

Tue Jul 28 00:34:17 UTC 2020

This did start when we updated the fontconfiguration file but I think 
there was nothing wrong with the update
and I found it could happen with the previous  version if we just remove 
"devanagari" from this line in the OLD version.

sequence.allfonts.UTF-8.ja=alphabetic,japanese,devanagari,dingbats,symbol

Removing that mimics what happened in the new version and is the first 
piece of the puzzle.

I don't know why devanagari is even there. Possibly it is because that 
line was derived from this one :-
sequence.allfonts.UTF-8.hi=alphabetic/1252,devanagari,dingbats,symbol
since hindi was the first UTF-8 locale that was supported and someone 
just edited it to create the JA entry.

But it indicates to me that this is quite fragile and could easily have 
crashed a long time ago if Devanagari were
not there as one of the "core fonts" for UTF-8.ja

Then in WFontConfiguration.initTables() a few things happen

first this

subsetCharsetMap.put("devanagari","DEFAULT_CHARSET");
subsetCharsetMap.put("japanese","SHIFTJIS_CHARSET");

[for devanagari JDK specifies the Mangal font.]

the subsetEncodinging map has this for Japanese
  subsetEncodingMap.put("japanese", "windows-31j");

then this for UTF-8 for textInputCharset
}else if ("UTF-8".equals(defaultEncoding)) {
     textInputCharset ="DEFAULT_CHARSET";

whereas for the old ms932/windows-31j code page we would have had this

}else if ("windows-31j".equals(defaultEncoding)) {
     textInputCharset ="SHIFTJIS_CHARSET";

it then calls makeAWTFontName("MS Gothic", "japanese");
which looks like this :

WFontConfiguration.makeAWTFontName(String platformFontName, String characterSubsetName) {
     String windowsCharset = subsetCharsetMap.get(characterSubsetName);
     if (windowsCharset ==null) {
         windowsCharset ="DEFAULT_CHARSET";
     }
     return platformFontName +"," + windowsCharset;
}

For "japanese", the result of
subsetCharsetMap.get(characterSubsetName);

will always be"SHIFTJIS_CHARSET"

So the method will return "MS Gothic,SHIFTJIS_CHARSET"
and this will get stored in the FontDescriptor

The other core entries for Japanese map to ANSI_CHARSET and SYMBOL_CHARSET.

When in the old fontconfig file  is called for "devanagari", it will return "Mangal,DEFAULT_CHARSET".

Without that, there is no DEFAULT_CHARSET mapped for any font in the core Japanese fonts.

This all becomes important when WFontConfiguration.getTextComponentFontName() is called from native code.

It has this line
String fontName = findFontWithCharset(fontDescriptors, textInputCharset);

from above we know that for UTF-8 :
     textInputCharset ="DEFAULT_CHARSET";

but as just noted above there are NO fonts tagged with that

So the look up fails. The code retries : -
if (fontName ==null) {
     fontName = findFontWithCharset(fontDescriptors,"DEFAULT_CHARSET");
}

but that was pointless since DEFAULT_CHARSET is what was already tried.

Now back to the windows-31j locale, there we had

     textInputCharset ="SHIFTJIS_CHARSET";

so that finds the match "MS Gothic,SHIFTJIS_CHARSET".

  getTextComponentFontName() has the comment "May return null." which is true, but not very helpful to the native caller, which bails out, leaving the
native font structs uninitialised and ready to cause a crash.

That's the kind of analysis I was hoping for !

Now, the question is, is what you propose the right fix for this ?
But I am not sure it can even work.

931 descriptors[i] = new FontDescriptor(awtFontName, enc, 
exclusionRanges, encoding.equals("default")); seems like it will never 
pass true in my testing. Then the whole fix falls apart. Can you show 
some evidence ? -phil

On 7/27/2020 3:50 PM, Yasumasa Suenaga wrote:
> Hi Phil,
>
> I confirmed WFontConfiguration::findFontWithCharset cannot find if 
> -Dfile.encoding=UTF-8 is passed.
> I guess one of the cause is the definitions in 
> make/data/fontconfig/windows.fontconfig.properties, but also 
> DEFAULT_CHARSET does not work at this point.
>
> If we do not pass -Dfile.encoding=UTF-8, `charset` in 
> WFontConfiguration::findFontWithCharset is set to "windows-31j" and it 
> can find out valid font when Windows is set to Japanese locale.
>
> I can share minidump for further investigation. What should I do / share?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/07/28 0:02, Philip Race wrote:
>> Hi,
>>
>> You're avoiding a crash but I don't yet know what *exactly* caused 
>> the crash.
>> Some Java code not handling DEFAULT_CHARSET is obviously not the 
>> exact cause.
>> This just starts it and something bad presumably happens later in 
>> native code.
>>
>> And I don't yet understand why (we think) this started happening when 
>> some
>> additional fonts were added to the file.
>>
>> Knowing exactly what is wrong will help decide if this is the right fix.
>>
>> -phil
>>
>> On 7/24/20, 5:59 AM, Yasumasa Suenaga wrote:
>>> Hi Jay,
>>>
>>> I share you hs_err log of this issue.
>>> `chcp` on my console shows "932" (MS932). It is Japanese locale.
>>>
>>> I can share you if you want to know.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/07/24 20:59, Jayathirth D V wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I tried after changing the locale to Japanese but I don’t see the 
>>>> issue.
>>>>
>>>> Also tried to reproduce the issue by enabling/disabling setting 
>>>> "Beta:Use Unicode UTF-8 for worldwide language support" in my 
>>>> locale setting.
>>>>
>>>> @Others : Can somebody else try to reproduce this issue?
>>>>
>>>> Thanks,
>>>> Jay
>>>>
>>>> -----Original Message-----
>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>> Sent: Thursday, July 23, 2020 5:41 PM
>>>> To: Jayathirth D v <jayathirth.d.v at oracle.com>
>>>> Cc: 2d-dev <2d-dev at openjdk.java.net>; awt-dev at openjdk.java.net
>>>> Subject: Re: [OpenJDK 2D-Dev] PING: RFR: 8249215: 
>>>> JFrame::setVisible crashed with -Dfile.encoding=UTF-8
>>>>
>>>> Hi Jay,
>>>>
>>>> On 2020/07/23 19:09, Jayathirth D v wrote:
>>>>> Hi,
>>>>>
>>>>> I tried reproducing the issue in my Windows 10 machine with UTF-8 
>>>>> encoding and test file mentioned in the bug, I don’t see any crash.
>>>>> Am I missing something?
>>>>
>>>> OS locale may be affecting.
>>>>
>>>> My laptop has been set Japanese (CP932 / Windows-31J), so 
>>>> WFontConfiguration attempt to find Japanese font by default.
>>>> However WFontConfiguration cannot find out the font of 
>>>> "DEFAULT_CHARSET" when -Dfile.encoding=UTF-8 is passed.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> Also I think this should be in awt-dev so adding the mailing list.
>>>>>
>>>>> Thanks,
>>>>> Jay
>>>>>
>>>>>> On 20-Jul-2020, at 12:59 PM, Yasumasa Suenaga 
>>>>>> <suenaga at oss.nttdata.com> wrote:
>>>>>>
>>>>>> PING: could you review it?
>>>>>>
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8249215
>>>>>>>     webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8249215/webrev.00/
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>> On 2020/07/11 17:39, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>> Please review this change:
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8249215
>>>>>>>     webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8249215/webrev.00/
>>>>>>> I tried to run Sample.java in JDK-8236161 with 
>>>>>>> -Dfile.encoding=UTF-8, but JVM crashed due to internal error on 
>>>>>>> fastdebug VM. I saw same call stack with JDK-8236161 in hs_err log.
>>>>>>> I investigated it, then I found out current implementation 
>>>>>>> cannot handle default charset.
>>>>>>> If charset is set to UTF-8, it would be handled as 
>>>>>>> "DEFAULT_CHARSET" in WFontConfiguration::initTables. However it 
>>>>>>> does not affect native font name, so we cannot find valid font.
>>>>>>> This change has passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8249215-20200711-0655-12566039)
>>>>>>> Thanks,
>>>>>>> Yasumasa
>>>>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/awt-dev/attachments/20200727/043a270a/attachment-0001.htm>