<i18n dev> Supporting charset GB18030-2005

Pushkar N Kulkarni pushkar.nk at in.ibm.com
Tue Nov 16 19:02:08 UTC 2021


Hi Alan,

Thanks. I appreciate your response.

Yes, I think GB18030 must continue to be GB18030-2000 for now. I was initially hoping to add a new character set named GB18030-2005. But I guess your suggestion of internally mapping one of the two versions (2000 or 2005) to "GB18030", based on the value of a new system property, with version 2000 as the default, could be a better approach.
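
To make the idea concrete, here is a rough sketch of how the name "GB18030" could be resolved to one of the two versions. It is only an illustration: the property name, the class name, and the two version-specific names are placeholders I made up, not existing JDK identifiers, and a real change would live inside the charset provider rather than in a standalone class.

```java
public class Gb18030VersionSelector {
    // Hypothetical property name, used only for this sketch.
    static final String VERSION_PROPERTY = "example.charset.gb18030.version";

    // Resolves a requested charset name to a version-specific name.
    // Only the generic name "GB18030" is remapped; other names pass through.
    static String resolve(String requestedName) {
        if (!"GB18030".equalsIgnoreCase(requestedName)) {
            return requestedName;
        }
        // Version 2000 stays the default when the property is absent
        // or holds any value other than "2005".
        String version = System.getProperty(VERSION_PROPERTY, "2000");
        return "2005".equals(version) ? "GB18030-2005" : "GB18030-2000";
    }

    public static void main(String[] args) {
        // Run with -Dexample.charset.gb18030.version=2005 to select the 2005 mapping.
        System.out.println(resolve("GB18030"));
    }
}
```

With this shape, existing users see no change unless they opt in, and the default could be flipped later once all supported platforms have moved to the 2005 base.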

Pushkar N Kulkarni,
Developer, IBM Runtimes

Simplicity is prerequisite for reliability - Edsger W. Dijkstra



-----"Alan Bateman" <Alan.Bateman at oracle.com> wrote: -----
To: "Pushkar N Kulkarni" <pushkar.nk at in.ibm.com>, i18n-dev at openjdk.java.net
From: "Alan Bateman" <Alan.Bateman at oracle.com>
Date: 11/16/2021 04:00PM
Cc: core-libs-dev at openjdk.java.net
Subject: [EXTERNAL] Re: <i18n dev> Supporting charset GB18030-2005

On 15/11/2021 17:53, Pushkar N Kulkarni wrote:
> Hi there,
>
>
> OpenJDK currently supports version 2000 of the GB18030 (https://en.wikipedia.org/wiki/GB_18030 ) character set viz. GB18030-2000. The character mappings corresponding to Unicode codepoints '\u1E3F' and '\uE7C7' were swapped in a new version of the character set named GB18030-2005. I learn that this corrected a mistake in version 2000.
>
> OpenJDK does not support version 2005 as yet. Can someone please help me with reasons for the same, if any?
>
> We do have users requesting for 2005 support. While Linux (RHEL 7/8) has moved to supporting GB18030-2005 via glibc, Windows 10 and AIX 7.2 still have GB18030-2000 base. That means OpenJDK cannot move to GB18030-2005 base as yet. However, we can support both the versions until all the supported platforms move to GB18030-2005 base. Would that be an acceptable proposition?
>
> If we can have an enhancement request opened, I'd be glad to contribute the GB18030-2005 charset implementation.

If I read this correctly, then your proposal is for GB18030 to continue
to be GB18030-2000 but you would introduce a new charset, GB18030-2005,
for the new version. Are you also proposing a system property or some
means to have GB18030 be GB18030-2005 until the time is right to make
it the default?

-Alan


