RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5

Thomas Stüfe thomas.stuefe at gmail.com
Fri May 11 06:19:08 UTC 2018


Hi,

I'll test and review next week. We also have some in-house tests which
I'd like to run.

You IBM folks should really apply for authorship so that this
contribution process gets streamlined. After all, if something breaks
in this code, you want to be able to fix it, yes? So even if you do
not contribute much else, more patches may be forthcoming.

Of course I hope these are not your last contributions :)

Best, Thomas



On Fri, May 11, 2018 at 7:57 AM, Ichiroh Takiguchi
<takiguc at linux.vnet.ibm.com> wrote:
> Hi.
>
> I tested this fix on AIX.
>
> I got the following results.
> $ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
> Ja_JP   x-IBM943C       IBM-943C        IBM-943C
> $ LANG=Ja_JP.IBM-943 ~/jdk/bin/java PrintDefaultCharset
> Ja_JP.IBM-943   x-IBM943C       IBM-943C        IBM-943C
> $ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
> Zh_TW   x-IBM950        IBM-950 IBM-950
> $ LANG=Zh_TW.big5 ~/jdk/bin/java PrintDefaultCharset
> Zh_TW.big5      x-IBM950        IBM-950 IBM-950
>
> I also reviewed the source code; it looks fine.
>
> This testing requires the Ja_JP and Zh_TW locales to be installed,
> so it is not easy to run...
> (At least, I think the bos.loc.pc.Ja_JP and bos.loc.iso.Zh_TW filesets are
> required)
>
>
> On 2018-05-02 18:32, Volker Simonis wrote:
>>
>> Hi Bhaktavatsal Reddy,
>>
>> your change looks good. I can sponsor it.
>>
>> Just waiting for a second review...
>>
>> Thank you and best regards,
>> Volker
>>
>>
>> On Mon, Apr 30, 2018 at 11:29 AM, Bhaktavatsal R Maram
>> <bhamaram at in.ibm.com> wrote:
>>>
>>> Hi All,
>>>
>>> Please review the fix.
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8202329
>>> webrev: http://cr.openjdk.java.net/~aleonard/8202329/webrev.00/
>>>
>>> Thanks,
>>> Bhaktavatsal Reddy
>>>
>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote: -----
>>> To: Volker Simonis <volker.simonis at gmail.com>
>>> From: "Bhaktavatsal R Maram"
>>> Sent by: "core-libs-dev"
>>> Date: 04/26/2018 09:31PM
>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>
>>> Hi Volker,
>>>
>>> Thank you. I will address your review comments and send a webrev for
>>> review.
>>>
>>> - Bhaktavatsal Reddy
>>>
>>>
>>>
>>> -----Volker Simonis <volker.simonis at gmail.com> wrote: -----
>>> To: Bhaktavatsal R Maram <bhamaram at in.ibm.com>
>>> From: Volker Simonis <volker.simonis at gmail.com>
>>> Date: 04/26/2018 09:12PM
>>> Cc: Java Core Libs <core-libs-dev at openjdk.java.net>
>>> Subject: Re: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>
>>> Hi Bhaktavatsal Reddy,
>>>
>>> I've opened the following issue for this problem:
>>>
>>> 8202329: [AIX] Fix codepage mappings for IBM-943 and Big5
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8202329
>>>
>>> Looking at your fix, can you please replace the "#elif AIX" by "#ifdef
>>> AIX" and the original "#else" by "#ifdef __solaris__"? The original
>>> else branch contains Solaris-only code anyway. It is a historical
>>> leftover that many places in the code still treat "not Linux" as
>>> implicitly meaning "Solaris", which is often wrong.
>>>
>>> Regards,
>>> Volker
>>>
>>>
>>> On Thu, Apr 26, 2018 at 4:02 PM, Bhaktavatsal R Maram
>>> <bhamaram at in.ibm.com> wrote:
>>>>
>>>> Oops! Looks like there is a problem with the attachment (possibly because I
>>>> attached a .class file as well). I'm pasting the fix and test program here in
>>>> the mail.
>>>>
>>>> Test Program:
>>>>
>>>> import java.nio.charset.*;
>>>> class PrintDefaultCharset {
>>>>      public static void main(String[] args) {
>>>>         System.out.println("LANG = " + System.getenv("LANG"));
>>>>         System.out.println("Default charset = "
>>>>                 + Charset.defaultCharset().name());
>>>>         System.out.println("file.encoding = "
>>>>                 + System.getProperty("file.encoding"));
>>>>         System.out.println("sun.jnu.encoding = "
>>>>                 + System.getProperty("sun.jnu.encoding"));
>>>>      }
>>>> }
>>>>
>>>>
>>>> Fix:
>>>>
>>>> diff --git a/src/java.base/unix/native/libjava/java_props_md.c b/src/java.base/unix/native/libjava/java_props_md.c
>>>> --- a/src/java.base/unix/native/libjava/java_props_md.c
>>>> +++ b/src/java.base/unix/native/libjava/java_props_md.c
>>>> @@ -1,5 +1,5 @@
>>>>  /*
>>>> - * Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
>>>> + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>>>>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>>>   *
>>>>   * This code is free software; you can redistribute it and/or modify it
>>>> @@ -297,6 +297,18 @@
>>>>          if (strcmp(p, "EUC-JP") == 0) {
>>>>              *std_encoding = "EUC-JP-LINUX";
>>>>          }
>>>> +#elif defined _AIX
>>>> +        if (strcmp(p, "big5") == 0) {
>>>> +            /* On AIX Traditional Chinese Big5 codeset is mapped to IBM-950 */
>>>> +            *std_encoding = "IBM-950";
>>>> +        } else if (strcmp(p, "IBM-943") == 0) {
>>>> +            /*
>>>> +             * On AIX, IBM-943 is mapped to IBM-943C in which symbol 'yen' and
>>>> +             * 'overline' are replaced with 'backslash' and 'tilde' from ASCII
>>>> +             * making first 96 code points same as ASCII.
>>>> +             */
>>>> +            *std_encoding = "IBM-943C";
>>>> +        }
>>>>  #else
>>>>          if (strcmp(p,"eucJP") == 0) {
>>>>              /* For Solaris use customized vendor defined character
>>>>
>>>>
>>>> Thanks,
>>>> Bhaktavatsal Reddy
>>>>
>>>>
>>>> -----"core-libs-dev" <core-libs-dev-bounces at openjdk.java.net> wrote: -----
>>>> To: "Java Core Libs" <core-libs-dev at openjdk.java.net>
>>>> From: "Bhaktavatsal R Maram"
>>>> Sent by: "core-libs-dev"
>>>> Date: 04/26/2018 07:26PM
>>>> Subject: [AIX] Fix codepage mappings in Java for IBM-943 and Big5
>>>>
>>>> Hi All,
>>>>
>>>> This issue is a continuation of bug 8201540 (Extend the set of supported
>>>> charsets in java.base on AIX), in which we moved the default charsets of
>>>> most of the locales supported by the operating system into the java.base
>>>> module, thus enabling OpenJDK in those locales on the AIX platform.
>>>>
>>>> As part of that, the charsets for locales Ja_JP (IBM-943) and Zh_TW (big5)
>>>> were also moved. However, the Java charsets they are mapped to are not
>>>> correct on AIX. The details follow:
>>>>
>>>> 1. IBM-943 [1] for locale Ja_JP should be mapped to IBM-943C [2]
>>>>
>>>> The fundamental difference between IBM-943 and IBM-943C is that IBM-943C is
>>>> ASCII compatible: the code points for 'yen' and 'overline' in IBM-943 are
>>>> replaced with 'backslash' and 'tilde' from the ASCII character set.
>>>>
>>>>
>>>> 2. Big5 for locale Zh_TW should be mapped to IBM-950 [3]
>>>>
>>>> I've attached a simple test program that prints the default charset, along
>>>> with the fix for this issue. When the test program (PrintDefaultCharset) is
>>>> run with IBM JDK 8 (on AIX) for locales Ja_JP & Zh_TW, the output is as follows.
>>>>
>>>> -bash-4.4$ LANG=Ja_JP ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
>>>> LANG = Ja_JP
>>>> Default charset = x-IBM943C
>>>> file.encoding = IBM-943C
>>>> sun.jnu.encoding = IBM-943C
>>>>
>>>> -bash-4.4$ LANG=Zh_TW ~/JDKs/IBM/80/ON/sdk/jre/bin/java PrintDefaultCharset
>>>> LANG = Zh_TW
>>>> Default charset = x-IBM950
>>>> file.encoding = IBM-950
>>>> sun.jnu.encoding = IBM-950
>>>>
>>>>
>>>> The same test run with OpenJDK 11 gives the following output:
>>>>
>>>> -bash-4.4$ LANG=Ja_JP ~/jdk/bin/java PrintDefaultCharset
>>>> LANG = Ja_JP
>>>> Default charset = x-IBM943
>>>> file.encoding = IBM-943
>>>> sun.jnu.encoding = IBM-943
>>>>
>>>> -bash-4.4$ LANG=Zh_TW ~/jdk/bin/java PrintDefaultCharset
>>>> LANG = Zh_TW
>>>> Default charset = Big5
>>>> file.encoding = big5
>>>> sun.jnu.encoding = big5
>>>>
>>>> I will get a webrev hosted on http://cr.openjdk.java.net for this change
>>>> and send it for review once the JIRA bug is created.
>>>>
>>>> [1] http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=JAVA
>>>> [2] http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>>>> [3] https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.nlsgdrf/big5.htm
>>>>
>>>>
>>>> Thanks,
>>>> Bhaktavatsal Reddy
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>
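
The original report in this thread says that IBM-943 and IBM-943C differ in how they treat the 'yen'/'backslash' and 'overline'/'tilde' positions. A minimal sketch for checking that from Java is shown here. The class name Ibm943Compare and its helper methods are hypothetical, and the byte values 0x5C and 0x7E are an assumption based on where backslash and tilde sit in ASCII. The charset names x-IBM943 and x-IBM943C are taken from the test output in the thread; a given JDK build may not ship them, so the sketch checks for that first.

```java
import java.nio.charset.Charset;

class Ibm943Compare {
    // Decode a single byte with the named charset. Returns null when this
    // JDK build does not provide the charset.
    static String decodeByte(String charsetName, int b) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        return new String(new byte[] { (byte) b }, Charset.forName(charsetName));
    }

    // Render a decoded string as U+XXXX code points for display.
    static String codePoints(String s) {
        if (s == null) {
            return "(charset not available)";
        }
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumed byte positions: ASCII backslash (0x5C) and tilde (0x7E).
        int[] bytes = { 0x5C, 0x7E };
        String[] charsets = { "US-ASCII", "x-IBM943", "x-IBM943C" };
        for (int b : bytes) {
            for (String cs : charsets) {
                System.out.printf("0x%02X in %-10s -> %s%n",
                        b, cs, codePoints(decodeByte(cs, b)));
            }
        }
    }
}
```

If the description in the report holds, the x-IBM943C rows should match the US-ASCII rows, while the x-IBM943 rows should show different code points for the same two bytes.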

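The AIX test results quoted at the top of the thread show one tab-separated line per run, which is a different layout from the PrintDefaultCharset program posted with the original report. The program that produced those lines was not posted, so the following is only a guess at an equivalent: a sketch that prints LANG, the default charset, file.encoding and sun.jnu.encoding on a single line. The class name PrintDefaultCharsetLine and the format helper are hypothetical.

```java
import java.nio.charset.Charset;

class PrintDefaultCharsetLine {
    // Join the four values with tabs so each run produces exactly one line.
    static String format(String lang, String charset,
                         String fileEncoding, String jnuEncoding) {
        return String.join("\t", lang, charset, fileEncoding, jnuEncoding);
    }

    public static void main(String[] args) {
        // String.valueOf turns an unset variable or property into "null"
        // instead of failing inside String.join.
        System.out.println(format(
                String.valueOf(System.getenv("LANG")),
                Charset.defaultCharset().name(),
                String.valueOf(System.getProperty("file.encoding")),
                String.valueOf(System.getProperty("sun.jnu.encoding"))));
    }
}
```

One line per run makes it easier to compare several LANG settings side by side, for example by running the program in a shell loop over the locales of interest.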
