RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5

Bhaktavatsal R Maram bhamaram at in.ibm.com
Fri Jun 1 08:03:00 UTC 2018


Hi Thomas,

Sorry, I was on vacation. I will submit webrev with jtreg testcase.

Thanks,
Bhaktavatsal
 

  

-----"Thomas Stüfe" <thomas.stuefe at gmail.com> wrote: -----
To: Bhaktavatsal R Maram <bhamaram at in.ibm.com>
From: "Thomas Stüfe" <thomas.stuefe at gmail.com>
Date: 05/23/2018 12:32AM
Cc: Ichiroh Takiguchi <takiguc at linux.vnet.ibm.com>, Volker Simonis <volker.simonis at gmail.com>, Java Core Libs <core-libs-dev at openjdk.java.net>
Subject: Re: RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5

Hi Bhaktavatsal,

the fix is fine. Reviewed. Thanks a lot for your work!

Could you please (its okay in a future patch) add a regression test
for IBM943 and IBM943C, in the form of a jtreg test? I would like to
test conversions for both code pages (at least for the crucial
tilde/backslash code points).

Additionally, that when started on AIX with Ja_JP.IBM-943, we
correctly default to IBM943C.

As an example, here is an old test I wrote years ago. You won't be
able to use it verbatim, so it is just a suggestion. Feel free to do
it differently.

http://cr.openjdk.java.net/~stuefe/other/IBM943MappingTest.java

If you have not written jtreg tests before:

http://openjdk.java.net/jtreg/

Also, under /test are many many examples.

Best Regards, Thomas

On Mon, May 21, 2018 at 9:47 AM, Bhaktavatsal R Maram
<bhamaram at in.ibm.com> wrote:
> Hi Thomas,
>
> Is this fix ready to merge?
>
> Thanks,
> Bhaktavatsal
>
>
>
>
> -----"Thomas Stüfe" <thomas.stuefe at gmail.com> wrote: -----
> To: Ichiroh Takiguchi <takiguc at linux.vnet.ibm.com>, Bhaktavatsal R Maram <bhamaram at in.ibm.com>
> From: "Thomas Stüfe" <thomas.stuefe at gmail.com>
> Date: 05/11/2018 11:49AM
> Cc: Volker Simonis <volker.simonis at gmail.com>, Java Core Libs <core-libs-dev at openjdk.java.net>
> Subject: Re: RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5
>
> Hi,
>
> I'll test and review next week. We also have some in-house tests which
> I'd like to run.
>
> You IBM folks should really apply for authorship so that this
> contribution process gets streamlined. After all, if something breaks
> in this code, you want to be able to fix it, yes? So even if you do
> not contribute much else, more patches may be forthcoming.
>
> Of course I hope these are not your last contributions :)
>
> Best, Thomas
>
>
>
> On Fri, May 11, 2018 at 7:57 AM, Ichiroh Takiguchi
> <takiguc at linux.vnet.ibm.com> wrote:
>> Hi.
>>
>> I tested this fix on AIX.
>>
>> I got following results.
>> $ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>> Ja_JP   x-IBM943C       IBM-943C        IBM-943C
>> $ LANG=Ja_JP.IBM-943 ~/jdk/bin/java PrintDefaultCharset
>> Ja_JP.IBM-943   x-IBM943C       IBM-943C        IBM-943C
>> $ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>> Zh_TW   x-IBM950        IBM-950 IBM-950
>> $ LANG=Zh_TW.big5 ~/jdk/bin/java PrintDefaultCharset
>> Zh_TW.big5      x-IBM950        IBM-950 IBM-950
>>
>> Also I reviewed source code, it's fine
>>
>> Since this testing requires locale installation for Ja_JP and Zh_TW,
>> so it's not easy to test it...
>> (At least, I think bos.loc.pc.Ja_JP and bos.loc.iso.Zh_TW filesets are
>> required)
>>
>>
>> On 2018-05-02 18:32, Volker Simonis wrote:
>>>
>>> Hi Bhaktavatsal Reddy,
>>>
>>> your change looks good. I can sponsor it.
>>>
>>> Just waiting for a second review...
>>>
>>> Thank you and best regards,
>>> Volker
>>>
>>>
>>> On Mon, Apr 30, 2018 at 11:29 AM, Bhaktavatsal R Maram
>>> <bhamaram at in.ibm.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> Please review the fix.
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8202329
>>>> webrev: http://cr.openjdk.java.net/~aleonard/8202329/webrev.00/
>>>>
>>>> Thanks,
>>>> Bhaktavatsal Reddy
>>>>
>>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote:
>>>> -----
>>>> To: Volker Simonis <volker.simonis at gmail.com>
>>>> From: "Bhaktavatsal R Maram"
>>>> Sent by: "core-libs-dev"
>>>> Date: 04/26/2018 09:31PM
>>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>
>>>> Hi Volker,
>>>>
>>>> Thank you. I will address your review comments and send webrev for
>>>> review.
>>>>
>>>> - Bhaktavatsal Reddy
>>>>
>>>>
>>>>
>>>> -----Volker Simonis <volker.simonis at gmail.com> wrote: -----
>>>> To: Bhaktavatsal R Maram <bhamaram at in.ibm.com>
>>>> From: Volker Simonis <volker.simonis at gmail.com>
>>>> Date: 04/26/2018 09:12PM
>>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>
>>>> Hi Bhaktavatsal Reddy,
>>>>
>>>> I've opened the following issue for this problem:
>>>>
>>>> 8202329: [AIX] Fix codepage mappings for IBM-943 and Big5
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8202329
>>>>
>>>> Looking at you fix, can you please replace the "#elif AIX" by "#ifdef
>>>> AIX" and the original "#else" by "#ifdef __solaris__". The original
>>>> else branch contains Solaris-only code anyway and it is an historical
>>>> omission that there are still a lot of places in the code where "not
>>>> Linux" implicitly means "Solaris", but that's often wrong.
>>>>
>>>> Regards,
>>>> Volker
>>>>
>>>>
>>>> On Thu, Apr 26, 2018 at 4:02 PM, Bhaktavatsal R Maram
>>>> <bhamaram at in.ibm.com> wrote:
>>>>>
>>>>> Oops! Looks like there is problem with attachment (might be because I
>>>>> attached .class file as well). I'm pasting the fix and test program here in
>>>>> mail.
>>>>>
>>>>> Test Program:
>>>>>
>>>>> import java.nio.charset.*;
>>>>> class PrintDefaultCharset {
>>>>>      public static void main(String[] args) {
>>>>>         System.out.println("LANG = "+System.getenv("LANG"));
>>>>>         System.out.println("Default charset =
>>>>> "+Charset.defaultCharset().name());
>>>>>         System.out.println("file.encoding =
>>>>> "+System.getProperty("file.encoding"));
>>>>>         System.out.println("sun.jnu.encoding =
>>>>> "+System.getProperty("sun.jnu.encoding"));
>>>>>      }
>>>>> }
>>>>>
>>>>>
>>>>> Fix:
>>>>>
>>>>> diff --git a/src/java.base/unix/native/libjava/java_props_md.c
>>>>> b/src/java.base/unix/native/libjava/java_props_md.c
>>>>> --- a/src/java.base/unix/native/libjava/java_props_md.c
>>>>> +++ b/src/java.base/unix/native/libjava/java_props_md.c
>>>>> @@ -1,5 +1,5 @@
>>>>>  /*
>>>>> - * Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights
>>>>> reserved.
>>>>> + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights
>>>>> reserved.
>>>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>>   *
>>>>>   * This code is free software; you can redistribute it and/or modify it
>>>>> @@ -297,6 +297,18 @@
>>>>>          if (strcmp(p, "EUC-JP") == 0) {
>>>>>              *std_encoding = "EUC-JP-LINUX";
>>>>>          }
>>>>> +#elif defined _AIX
>>>>> +        if (strcmp(p, "big5") == 0) {
>>>>> +            /* On AIX Traditional Chinese Big5 codeset is mapped to
>>>>> IBM-950 */
>>>>> +            *std_encoding = "IBM-950";
>>>>> +        } else if (strcmp(p, "IBM-943") == 0) {
>>>>> +            /*
>>>>> +             * On AIX, IBM-943 is mapped to IBM-943C in which symbol
>>>>> 'yen' and
>>>>> +             * 'overline' are replaced with 'backslash' and 'tilde'
>>>>> from ASCII
>>>>> +             * making first 96 code points same as ASCII.
>>>>> +             */
>>>>> +            *std_encoding = "IBM-943C";
>>>>> +        }
>>>>>  #else
>>>>>          if (strcmp(p,"eucJP") == 0) {
>>>>>              /* For Solaris use customized vendor defined character
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Bhaktavatsal Reddy
>>>>>
>>>>>
>>>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote:
>>>>> -----
>>>>> To: "Java Core Libs" <core-libs-dev at openjdk.java.net>
>>>>> From: "Bhaktavatsal R Maram"
>>>>> Sent by: "core-libs-dev"
>>>>> Date: 04/26/2018 07:26PM
>>>>> Subject: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>>
>>>>> Hi All,
>>>>>
>>>>> This issue is continuation to bug 8201540 (Extend the set of supported
>>>>> charsets in java.base on AIX) in which we have moved default charsets of
>>>>> most of the locales supported by Operating System to java.base module thus
>>>>> enabling OpenJDK on those locales for AIX platform.
>>>>>
>>>>> As part of that, charsets for locales Ja_JP (IBM-943) and Zh_TW (big5)
>>>>> also have been moved. However, corresponding charsets mapped in Java is not
>>>>> correct for them on AIX. Following are the details:
>>>>>
>>>>> 1. IBM-943 [1] for locale Ja_JP should be mapped to IBM-943C [2]
>>>>>
>>>>> Fundamental difference between IBM-943 and IBM-943C is that IBM-943C is
>>>>> ASCII compatible which means code points 'yen' and 'overline' of IBM-943 is
>>>>> replaced with 'backslash' and 'tilde' from ASCII character set.
>>>>>
>>>>>
>>>>> 2. Big5 for locale Zh_TW should be mapped to IBM-950 [3]
>>>>>
>>>>> I've attached simple test program to print the default charset along
>>>>> with fix for this issue. When run test program (PrintDefaultCharset) with
>>>>> IBM JDK 8 (on AIX) for locales Ja_JP & Zh_TW, following is output.
>>>>>
>>>>> -bash-4.4$ LANG=Ja_JP ~/JDKs/IBM/80/ON/sdk/jre/bin/java
>>>>> PrintDefaultCharset
>>>>> LANG = Ja_JP
>>>>> Default charset = x-IBM943C
>>>>> file.encoding = IBM-943C
>>>>> sun.jnu.encoding = IBM-943C
>>>>>
>>>>> -bash-4.4$ LANG=Zh_TW ~/JDKs/IBM/80/ON/sdk/jre/bin/java
>>>>> PrintDefaultCharset
>>>>> LANG = Zh_TW
>>>>> Default charset = x-IBM950
>>>>> file.encoding = IBM-950
>>>>> sun.jnu.encoding = IBM-950
>>>>>
>>>>>
>>>>> Same test run with openJDK 11 gives following output
>>>>>
>>>>> -bash-4.4$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>>>>> LANG = Ja_JP
>>>>> Default charset = x-IBM943
>>>>> file.encoding = IBM-943
>>>>> sun.jnu.encoding = IBM-943
>>>>>
>>>>> -bash-4.4$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>>>>> LANG = Zh_TW
>>>>> Default charset = Big5
>>>>> file.encoding = big5
>>>>> sun.jnu.encoding = big5
>>>>>
>>>>> I will get webrev hosted in
>>>>> http://cr.openjdk.java.net
>>>>> for this change and send it for review once JIRA bug is created.
>>>>>
>>>>> [1]
>>>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=JAVA
>>>>> [2]
>>>>> http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>>>>> [3]
>>>>> https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.nlsgdrf/big5.htm
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Bhaktavatsal Reddy
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>
>
>




More information about the core-libs-dev mailing list