<AWT Dev> [OpenJDK 2D-Dev] PING: RFR: 8249215: JFrame::setVisible crashed with -Dfile.encoding=UTF-8

Philip Race philip.race at oracle.com
Wed Jul 29 00:49:40 UTC 2020



On 7/28/20, 5:35 PM, Yasumasa Suenaga wrote:
> Hi Phil,
>
> Thanks for explanation.
>
> findFontWithCharset() does not have comments, so I cannot evaluate my 
> proposal whether it breaks important behavior, but I almost agree with 
> your change.
>
> However...
>
>> sequence.allfonts.UTF-8.ja=japanese,dingbats,symbol
>> instead of the current
>> sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol
>>
>> then I am not sure your fix in-line below will find anything and 
>> we'll still crash.
>
> then isn't it dangerous "Arial,ANSI_CHARSET" is set to fontName forced?

Nope, because the problem is that we may to return anything from this 
method (ie
return null instead). This is obviously not null so is fine.

-phil.


>
>>>> if (fontName ==null) {
>>>>      if (fontDescriptors.length >0) {
>>>>        return fontDescriptors[0].getNativeName();
>>>>      }else {
>>>>          fontName ="Arial,ANSI_CHARSET";
>>>>     }
>>>> }
>
> The direct cause of the crash which I saw is 
> WComponentPeer::defaultFont was null.
> So I think we need to assertion to check WComponentPeer::defaultFont 
> is not null as a safeguard.
>
> How about this change?
>
> ```
> diff -r 1a722ad6e23d 
> src/java.desktop/windows/classes/sun/awt/windows/WComponentPeer.java
> --- 
> a/src/java.desktop/windows/classes/sun/awt/windows/WComponentPeer.java      
> Tue Jul 28 09:05:36 2020 +0200
> +++ 
> b/src/java.desktop/windows/classes/sun/awt/windows/WComponentPeer.java      
> Wed Jul 29 09:32:00 2020 +0900
> @@ -56,6 +56,7 @@
>  import java.awt.image.VolatileImage;
>  import java.awt.peer.ComponentPeer;
>  import java.awt.peer.ContainerPeer;
> +import java.util.Objects;
>
>  import sun.awt.AWTAccessor;
>  import sun.awt.PaintEventDispatcher;
> @@ -579,7 +580,12 @@
>      }
>
>      // fallback default font object
> -    static final Font defaultFont = new Font(Font.DIALOG, Font.PLAIN, 
> 12);
> +    static final Font defaultFont;
> +
> +    static {
> +        defaultFont = new Font(Font.DIALOG, Font.PLAIN, 12);
> +        Objects.requireNonNull(defaultFont, "default font must not be 
> null");
> +    }
>
>      @Override
>      public Graphics getGraphics() {
> diff -r 1a722ad6e23d 
> src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java
> --- 
> a/src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java  
> Tue Jul 28 09:05:36 2020 +0200
> +++ 
> b/src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java  
> Wed Jul 29 09:32:00 2020 +0900
> @@ -157,9 +157,12 @@
>      public String getTextComponentFontName(String familyName, int 
> style) {
>          FontDescriptor[] fontDescriptors = 
> getFontDescriptors(familyName, style);
>          String fontName = findFontWithCharset(fontDescriptors, 
> textInputCharset);
> -        if (fontName == null) {
> +        if ((fontName == null) && 
> !textInputCharset.equals("DEFAULT_CHARSET")) {
>              fontName = findFontWithCharset(fontDescriptors, 
> "DEFAULT_CHARSET");
>          }
> +        if ((fontName == null) && (fontDescriptors.length > 0)) {
> +            fontName = fontDescriptors[0].getNativeName();
> +        }
>          return fontName;
>      }
>
> ```
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/07/28 23:43, Philip Race wrote:
>> 1) You assume there is a font with ANSI_CHARSET in the list.
>> I thought I already tried to point out that if the entry looked like 
>> this
>>
>> sequence.allfonts.UTF-8.ja=japanese,dingbats,symbol
>> instead of the current
>> sequence.allfonts.UTF-8.ja=alphabetic,japanese,dingbats,symbol
>>
>> then I am not sure your fix in-line below will find anything and 
>> we'll still crash.
>>
>> 2) changing findFontWithCharset is still the wrong/more dangerous 
>> place to change this.
>> You may be uninintentionally changing behaviour.
>>
>> My proposal won't break anything. It just finds a fall back when 
>> otherwise we'd crash.
>>
>> I am not sure any of the CHARSET selections matter any more when 
>> using a unicode locale.
>> It is mainly about selecting the most appropriate font.
>>
>> And that isn't going to work with out more changes.
>>
>> At the very least we have to augment these
>>          subsetCharsetMap.put("japanese", "SHIFTJIS_CHARSET");
>>          subsetEncodingMap.put("japanese", "windows-31j");
>>
>> with *something like*
>>
>>          subsetCharsetMap.put("ja_utf8", "DEFAULT_CHARSET");
>>          subsetEncodingMap.put("japanese-utf8", "utf-8");
>>
>> and also update the fontconfig file to have
>> sequence.allfonts.UTF-8.ja=alphabetic,ja_utf8,dingbats,symbol
>>
>> and lots of fontconfig change like adding
>> sequence.serif.japanese-utf8=alphabetic,ja_utf8,dingbats,symbol
>>
>> I haven't thought that through to see if this is exactly right but it 
>> is what is really missing IMO.
>>
>> However having done all of that it still doesn't change that
>> 1) There is the possibility of  a crash if not done right
>> 2) It isn't clear that it *really matters*.
>>
>> And a rearchitecture of this file and the supporting code is beyond 
>> the scope of what we want to do today ...
>>
>> -phil
>>
>> On 7/28/20, 2:21 AM, Yasumasa Suenaga wrote:
>>> Hi Phil
>>>
>>> Thank you so much for further investigation!
>>> Your change works fine on my Windows which is set to Japanese locale.
>>>
>>> However I wonder the meanings of "DEFAULT_CHARSET". It does not 
>>> appear to be working, right?
>>>
>>> To my understand, alphabetic font should be used if 
>>> "DEFAULT_CHARSET" is chosen.
>>> (So I think your change may be chosen "Arial,ANSI_CHARSET")
>>>
>>> Thus I think we can fix as below:
>>>
>>> ```
>>> diff -r 1a722ad6e23d 
>>> src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java 
>>>
>>> --- 
>>> a/src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java 
>>> Tue Jul 28 09:05:36 2020 +0200
>>> +++ 
>>> b/src/java.desktop/windows/classes/sun/awt/windows/WFontConfiguration.java 
>>> Tue Jul 28 18:08:06 2020 +0900
>>> @@ -165,6 +165,9 @@
>>>
>>>      private String findFontWithCharset(FontDescriptor[] 
>>> fontDescriptors, String charset) {
>>>          String fontName = null;
>>> +        if (charset.equals("DEFAULT_CHARSET")) {
>>> +            charset = "ANSI_CHARSET";
>>> +        }
>>>          for (int i = 0; i < fontDescriptors.length; i++) {
>>>              String componentFontName = 
>>> fontDescriptors[i].getNativeName();
>>>              if (componentFontName.endsWith(charset)) {
>>> ```
>>>
>>> The following code is pointless as you said, so I agree with you to 
>>> remove it.
>>>
>>>>> if (fontName ==null) {
>>>>>      fontName = 
>>>>> findFontWithCharset(fontDescriptors,"DEFAULT_CHARSET");
>>>>> }
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/07/28 15:15, Philip Race wrote:
>>>> I do see some case when default is being returned.
>>>>
>>>> subsetEncodingMap.put("alphabetic", "default");
>>>>
>>>> which then needs an alphabetic font as part of the core script 
>>>> sequence.
>>>> Now looking at desc.isDefaultFont() && 
>>>> charset.equals("DEFAULT_CHARSET") Perhaps that could change the 
>>>> answer in some cases you don't intend. For these UTF 8 locales 
>>>> there is nothing in the fontconfig that identifies the right font. 
>>>> The "ja" in UTF-8.ja is not connected to "japanese" in the 
>>>> fontconfig file. Something like that may be the right fix but it 
>>>> would be a bigger change. I am not sure how much it matters either. 
>>>> There just needs to be a font. In the win9x days and when AWT was 
>>>> an "A" lib not using unicode maybe. Or maybe there's still some 
>>>> benefit to the right font for the language still being set as the 
>>>> text component font but it is not happening anyway in this case and 
>>>> your fix won't solve that. All roads lead to the latin/alphabetic 
>>>> font here. My thinking right now is to just make changes in 
>>>> getTextComponentFontNameso it always returns something but only 
>>>> after the current code fails.
>>>> So instead of your fix, just add this there :
>>>>
>>>> if (fontName ==null) {
>>>>      if (fontDescriptors.length >0) {
>>>>        return fontDescriptors[0].getNativeName();
>>>>      }else {
>>>>          fontName ="Arial,ANSI_CHARSET";
>>>>     }
>>>> }
>>>>
>>>> Not very satisfactory but then we can remove the comment about 
>>>> maybe returning NULL. -phil.
>>>> On 7/27/2020 5:34 PM, Philip Race wrote:
>>>>> This did start when we updated the fontconfiguration file but I 
>>>>> think there was nothing wrong with the update
>>>>> and I found it could happen with the previous  version if we just 
>>>>> remove "devanagari" from this line in the OLD version.
>>>>>
>>>>> sequence.allfonts.UTF-8.ja=alphabetic,japanese,devanagari,dingbats,symbol 
>>>>>
>>>>>
>>>>> Removing that mimics what happened in the new version and is the 
>>>>> first piece of the puzzle.
>>>>>
>>>>> I don't know why devanagari is even there. Possibly it is because 
>>>>> that line was derived from this one :-
>>>>> sequence.allfonts.UTF-8.hi=alphabetic/1252,devanagari,dingbats,symbol
>>>>> since hindi was the first UTF-8 locale that was supported and 
>>>>> someone just edited it to create the JA entry.
>>>>>
>>>>> But it indicates to me that this is quite fragile and could easily 
>>>>> have crashed a long time ago if Devanagari were
>>>>> not there as one of the "core fonts" for UTF-8.ja
>>>>>
>>>>> Then in WFontConfiguration.initTables() a few things happen
>>>>>
>>>>> first this
>>>>> subsetCharsetMap.put("devanagari","DEFAULT_CHARSET");
>>>>> subsetCharsetMap.put("japanese","SHIFTJIS_CHARSET");
>>>>>
>>>>> [for devanagari JDK specifies the Mangal font.]
>>>>>
>>>>> the subsetEncodinging map has this for Japanese
>>>>>   subsetEncodingMap.put("japanese", "windows-31j");
>>>>>
>>>>> then this for UTF-8 for textInputCharset
>>>>> }else if ("UTF-8".equals(defaultEncoding)) {
>>>>>      textInputCharset ="DEFAULT_CHARSET";
>>>>>
>>>>> whereas for the old ms932/windows-31j code page we would have had 
>>>>> this
>>>>>
>>>>> }else if ("windows-31j".equals(defaultEncoding)) {
>>>>>      textInputCharset ="SHIFTJIS_CHARSET";
>>>>>
>>>>> it then calls makeAWTFontName("MS Gothic", "japanese");
>>>>> which looks like this :
>>>>>
>>>>> WFontConfiguration.makeAWTFontName(String platformFontName, String 
>>>>> characterSubsetName) {
>>>>>      String windowsCharset = 
>>>>> subsetCharsetMap.get(characterSubsetName);
>>>>>      if (windowsCharset ==null) {
>>>>>          windowsCharset ="DEFAULT_CHARSET";
>>>>>      }
>>>>>      return platformFontName +"," + windowsCharset;
>>>>> }
>>>>>
>>>>> For "japanese", the result of
>>>>> subsetCharsetMap.get(characterSubsetName);
>>>>>
>>>>> will always be"SHIFTJIS_CHARSET"
>>>>>
>>>>> So the method will return "MS Gothic,SHIFTJIS_CHARSET"
>>>>> and this will get stored in the FontDescriptor
>>>>>
>>>>> The other core entries for Japanese map to ANSI_CHARSET and 
>>>>> SYMBOL_CHARSET.
>>>>>
>>>>> When in the old fontconfig file  is called for "devanagari", it 
>>>>> will return "Mangal,DEFAULT_CHARSET".
>>>>>
>>>>> Without that, there is no DEFAULT_CHARSET mapped for any font in 
>>>>> the core Japanese fonts.
>>>>>
>>>>> This all becomes important when 
>>>>> WFontConfiguration.getTextComponentFontName() is called from 
>>>>> native code.
>>>>>
>>>>> It has this line
>>>>> String fontName = findFontWithCharset(fontDescriptors, 
>>>>> textInputCharset);
>>>>>
>>>>> from above we know that for UTF-8 :
>>>>>      textInputCharset ="DEFAULT_CHARSET";
>>>>>
>>>>> but as just noted above there are NO fonts tagged with that
>>>>>
>>>>> So the look up fails. The code retries : -
>>>>> if (fontName ==null) {
>>>>>      fontName = 
>>>>> findFontWithCharset(fontDescriptors,"DEFAULT_CHARSET");
>>>>> }
>>>>>   but that was pointless since DEFAULT_CHARSET is what was already 
>>>>> tried.
>>>>>
>>>>> Now back to the windows-31j locale, there we had
>>>>>
>>>>>      textInputCharset ="SHIFTJIS_CHARSET";
>>>>>
>>>>> so that finds the match "MS Gothic,SHIFTJIS_CHARSET".
>>>>>
>>>>>   getTextComponentFontName() has the comment "May return null." 
>>>>> which is true, but not very helpful to the native caller, which 
>>>>> bails out, leaving the
>>>>> native font structs uninitialised and ready to cause a crash.
>>>>>
>>>>> That's the kind of analysis I was hoping for !
>>>>>
>>>>> Now, the question is, is what you propose the right fix for this ?
>>>>> But I am not sure it can even work.
>>>>>
>>>>> 931 descriptors[i] = new FontDescriptor(awtFontName, enc, 
>>>>> exclusionRanges, encoding.equals("default")); seems like it will 
>>>>> never pass true in my testing. Then the whole fix falls apart. Can 
>>>>> you show some evidence ? -phil
>>>>>
>>>>>
>>>>> On 7/27/2020 3:50 PM, Yasumasa Suenaga wrote:
>>>>>> Hi Phil,
>>>>>>
>>>>>> I confirmed WFontConfiguration::findFontWithCharset cannot find 
>>>>>> if -Dfile.encoding=UTF-8 is passed.
>>>>>> I guess one of the cause is the definitions in 
>>>>>> make/data/fontconfig/windows.fontconfig.properties, but also 
>>>>>> DEFAULT_CHARSET does not work at this point.
>>>>>>
>>>>>> If we do not pass -Dfile.encoding=UTF-8, `charset` in 
>>>>>> WFontConfiguration::findFontWithCharset is set to "windows-31j" 
>>>>>> and it can find out valid font when Windows is set to Japanese 
>>>>>> locale.
>>>>>>
>>>>>> I can share minidump for further investigation. What should I do 
>>>>>> / share?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/07/28 0:02, Philip Race wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> You're avoiding a crash but I don't yet know what *exactly* 
>>>>>>> caused the crash.
>>>>>>> Some Java code not handling DEFAULT_CHARSET is obviously not the 
>>>>>>> exact cause.
>>>>>>> This just starts it and something bad presumably happens later 
>>>>>>> in native code.
>>>>>>>
>>>>>>> And I don't yet understand why (we think) this started happening 
>>>>>>> when some
>>>>>>> additional fonts were added to the file.
>>>>>>>
>>>>>>> Knowing exactly what is wrong will help decide if this is the 
>>>>>>> right fix.
>>>>>>>
>>>>>>> -phil
>>>>>>>
>>>>>>> On 7/24/20, 5:59 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Jay,
>>>>>>>>
>>>>>>>> I share you hs_err log of this issue.
>>>>>>>> `chcp` on my console shows "932" (MS932). It is Japanese locale.
>>>>>>>>
>>>>>>>> I can share you if you want to know.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/07/24 20:59, Jayathirth D V wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I tried after changing the locale to Japanese but I don’t see 
>>>>>>>>> the issue.
>>>>>>>>>
>>>>>>>>> Also tried to reproduce the issue by enabling/disabling 
>>>>>>>>> setting "Beta:Use Unicode UTF-8 for worldwide language 
>>>>>>>>> support" in my locale setting.
>>>>>>>>>
>>>>>>>>> @Others : Can somebody else try to reproduce this issue?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jay
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>> Sent: Thursday, July 23, 2020 5:41 PM
>>>>>>>>> To: Jayathirth D v <jayathirth.d.v at oracle.com>
>>>>>>>>> Cc: 2d-dev <2d-dev at openjdk.java.net>; awt-dev at openjdk.java.net
>>>>>>>>> Subject: Re: [OpenJDK 2D-Dev] PING: RFR: 8249215: 
>>>>>>>>> JFrame::setVisible crashed with -Dfile.encoding=UTF-8
>>>>>>>>>
>>>>>>>>> Hi Jay,
>>>>>>>>>
>>>>>>>>> On 2020/07/23 19:09, Jayathirth D v wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I tried reproducing the issue in my Windows 10 machine with 
>>>>>>>>>> UTF-8 encoding and test file mentioned in the bug, I don’t 
>>>>>>>>>> see any crash.
>>>>>>>>>> Am I missing something?
>>>>>>>>>
>>>>>>>>> OS locale may be affecting.
>>>>>>>>>
>>>>>>>>> My laptop has been set Japanese (CP932 / Windows-31J), so 
>>>>>>>>> WFontConfiguration attempt to find Japanese font by default.
>>>>>>>>> However WFontConfiguration cannot find out the font of 
>>>>>>>>> "DEFAULT_CHARSET" when -Dfile.encoding=UTF-8 is passed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Also I think this should be in awt-dev so adding the mailing 
>>>>>>>>>> list.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jay
>>>>>>>>>>
>>>>>>>>>>> On 20-Jul-2020, at 12:59 PM, Yasumasa Suenaga 
>>>>>>>>>>> <suenaga at oss.nttdata.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> PING: could you review it?
>>>>>>>>>>>
>>>>>>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8249215
>>>>>>>>>>>>     webrev: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8249215/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> On 2020/07/11 17:39, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8249215
>>>>>>>>>>>>     webrev: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8249215/webrev.00/
>>>>>>>>>>>> I tried to run Sample.java in JDK-8236161 with 
>>>>>>>>>>>> -Dfile.encoding=UTF-8, but JVM crashed due to internal 
>>>>>>>>>>>> error on fastdebug VM. I saw same call stack with 
>>>>>>>>>>>> JDK-8236161 in hs_err log.
>>>>>>>>>>>> I investigated it, then I found out current implementation 
>>>>>>>>>>>> cannot handle default charset.
>>>>>>>>>>>> If charset is set to UTF-8, it would be handled as 
>>>>>>>>>>>> "DEFAULT_CHARSET" in WFontConfiguration::initTables. 
>>>>>>>>>>>> However it does not affect native font name, so we cannot 
>>>>>>>>>>>> find valid font.
>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8249215-20200711-0655-12566039)
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>
>>>>


More information about the awt-dev mailing list