Adding new IBM extended charsets
Alan Bateman
Alan.Bateman at oracle.com
Sun Aug 5 18:38:45 UTC 2018
On 24/07/2018 09:56, Nasser Ebrahim wrote:
> Thank you Martin, Sherman and Alan for your valuable inputs.
>
> I have done some initial analysis on the ICU4J. There are some
> compatibility issues on the ICU4J charsets with JDK charsets but I am
> more concerned about its performance, as the JDK optimizations do not
> exist in that implementation. I think we need to work with the ICU4J
> community to resolve those issues before we remove those charsets from
> JDK.

If you can work with the ICU4J project on these issues then I think we
have a way forward. An additional issue with their downloads is that
they target JDK 6 and don't yet seem to have considered deployment as
modules on JDK 9 or newer. Their downloads can be used as automatic
modules, but that requires renaming the JAR files because of the unusual
naming they use to encode the version string. A simple
Automatic-Module-Name attribute in the JAR manifest would make it easy
for developers to deploy their charset provider on the module path, and
they could still target JDK 6.
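
To illustrate the manifest attribute mentioned above, here is a minimal
sketch that uses java.util.jar.Manifest to produce a manifest carrying an
Automatic-Module-Name entry. The module name "com.example.charset.provider"
is a made-up placeholder, not a name the ICU4J project has chosen, and the
class and method names are only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class AutomaticModuleNameSketch {

    // Builds the text of a JAR manifest that declares an automatic module name.
    // The moduleName argument is whatever stable name the library would pick.
    static String manifestWithModuleName(String moduleName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise write() emits no main attributes.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(new Attributes.Name("Automatic-Module-Name"), moduleName);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical module name, used only for this sketch.
        System.out.print(manifestWithModuleName("com.example.charset.provider"));
    }
}
```

With such an entry in META-INF/MANIFEST.MF the module name no longer
depends on the JAR file name, and releases older than JDK 9 simply ignore
the attribute.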

As for the way forward, I think we have to put infrastructure into the
build to make it easy to include or exclude specific charsets on
specific platforms. As things stand, and as you have found with your
updates to the stdcs-<platform> files, the charsets are generated to be
included in either java.base or jdk.charsets. We need another input to
the configuration that makes it possible to include or exclude charsets,
so that the mainstream platforms do not have to include the IBM
charsets. There are several details to work out, particularly around
aliases, but if we can get that done then we have a lot of flexibility.

My personal view is that we should work towards excluding the IBM
charsets from the mainstream platforms, starting with a cull of the
EBCDIC charsets. If the ICU4J project can get their issues sorted out in
a similar time frame then it makes for a simple migration story -- the
JDK includes the standard charsets and many additional charsets. If you
need others then download the ICU4J charset provider and deploy it on
your class path or module path.
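
From the application's side, that migration story could look roughly like
the sketch below: look a charset up by name and report where it comes
from, whether that is the JDK or a provider deployed on the class path or
module path. It uses only the standard java.nio.charset.Charset API (plus
java.lang.Module, so JDK 9 or newer); the class and method names are
invented for the example, and whether a given IBM charset is found
depends on the runtime and on which providers are deployed.

```java
import java.nio.charset.Charset;

public class CharsetLookupSketch {

    // Describes where the named charset comes from, or says that it is missing.
    static String describe(String name) {
        if (!Charset.isSupported(name)) {
            return name + " is not available; a charset provider may need to be deployed";
        }
        Charset cs = Charset.forName(name);
        // The module of the implementation class shows who supplies the charset:
        // java.base or jdk.charsets for JDK charsets, another named module for a
        // provider on the module path, or the unnamed module for the class path.
        Module module = cs.getClass().getModule();
        String where = module.isNamed()
                ? "module " + module.getName()
                : "class path (unnamed module)";
        return cs.name() + " provided by " + where;
    }

    public static void main(String[] args) {
        System.out.println(describe("UTF-8"));
        // IBM037 is one of the EBCDIC charsets; the result depends on the runtime.
        System.out.println(describe("IBM037"));
    }
}
```

Because the lookup goes through Charset.forName, application code does not
change when a charset moves from the JDK to an external provider.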
-Alan