Adding new IBM extended charsets

Alan Bateman Alan.Bateman at oracle.com
Sun Aug 5 18:38:45 UTC 2018


On 24/07/2018 09:56, Nasser Ebrahim wrote:
> Thank you Martin, Sherman and Alan for your valuable inputs.
>
> I have done some initial analysis on the ICU4J. There are some 
> compatibility issues on the ICU4J charsets with JDK charsets but I am 
> more concerned about its performance as the JDK optimizations do not 
> exist in that implementation. I think we need to work with the ICU4J 
> community to resolve those issues before we remove those charsets from 
> JDK.
If you can work with the ICU4J project on these issues then I think we 
have a way forward. An additional issue with their downloads is that 
they target JDK 6, and the project doesn't seem to have thought yet 
about deployment as modules on JDK 9 or newer. Their downloads can be 
used as automatic modules but that requires renaming their JAR files 
due to the unusual naming that they use to encode the version string. 
A simple Automatic-Module-Name attribute would make it easy for 
developers to deploy their charset provider on the module path, and 
the JAR files could still target JDK 6.
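
To illustrate the Automatic-Module-Name point, here is a minimal sketch. 
It is not taken from ICU4J: the JAR file name and the module name are 
made up, and the JAR is empty apart from its manifest. It writes a JAR 
whose manifest carries the attribute and then asks 
java.lang.module.ModuleFinder which name the module system assigns.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class AutomaticModuleNameSketch {

    /** Writes an otherwise empty JAR whose manifest declares Automatic-Module-Name. */
    static Path writeJar(String fileName, String moduleName) {
        try {
            Manifest manifest = new Manifest();
            Attributes main = manifest.getMainAttributes();
            // Manifest-Version must be set, otherwise the main attributes are not written.
            main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
            main.putValue("Automatic-Module-Name", moduleName);

            Path jar = Files.createTempDirectory("amn-sketch").resolve(fileName);
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out, manifest)) {
                // The manifest is the only entry needed for this demonstration.
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the module name that the module system assigns to the given JAR. */
    static String moduleNameOf(Path jar) {
        Set<ModuleReference> found = ModuleFinder.of(jar).findAll();
        return found.iterator().next().descriptor().name();
    }

    public static void main(String[] args) {
        // Hypothetical file name with a version encoded using an underscore.
        Path jar = writeJar("example-charsets-62_1.jar", "com.example.charsets");
        System.out.println(moduleNameOf(jar));
    }
}
```

With the attribute present, the name comes from the manifest rather 
than from the file name, so no renaming of the JAR file is needed. The 
attribute is just a manifest entry, so older releases ignore it.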

As regards the way forward, I think we have to put infrastructure into 
the build to make it easy to include or exclude specific charsets on 
specific platforms. As things stand, and as you have found with your 
updates to the stdcs-<platform> files, the charsets are generated to be 
included in either java.base or jdk.charsets. We need another input to 
the configuration that makes it possible to include or exclude 
charsets, so that the mainstream platforms do not have to include the 
IBM charsets. There are several details around this, particularly 
around aliases, but if we can get that done then we have a lot of 
flexibility. 
My personal view is that we should work towards excluding the IBM 
charsets from the mainstream platforms, starting with a cull of the 
EBCDIC charsets. If the ICU4J project can get their issues sorted out in 
a similar time frame then it makes for a simple migration story -- the 
JDK includes the standard charsets and many additional charsets. If you 
need others then download the ICU4J charset provider and deploy it on 
your class path or module path.
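
From the application side, that migration story could look roughly like 
the sketch below. It only uses the standard java.nio.charset API; the 
class and method names are made up for illustration, and IBM037 is 
just one example of an EBCDIC charset name. Whether the lookup succeeds 
depends on the runtime image and on which charset providers are 
deployed on the class path or module path.

```java
import java.nio.charset.Charset;
import java.nio.charset.spi.CharsetProvider;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.ServiceLoader;

class CharsetLookupSketch {

    /** Looks up a charset by name or alias; empty if nothing installed supports it. */
    static Optional<Charset> lookup(String name) {
        return Charset.isSupported(name)
                ? Optional.of(Charset.forName(name))
                : Optional.empty();
    }

    /** Class names of the charset providers located via the system class loader. */
    static List<String> installedProviders() {
        List<String> names = new ArrayList<>();
        ServiceLoader<CharsetProvider> loader =
                ServiceLoader.load(CharsetProvider.class, ClassLoader.getSystemClassLoader());
        for (CharsetProvider provider : loader) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Default to an EBCDIC charset name; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "IBM037";
        String result = lookup(name).map(Charset::name).orElse("not available");
        System.out.println(name + " -> " + result);
        System.out.println("providers: " + installedProviders());
    }
}
```

The application code does not change when a charset moves from the JDK 
to an external provider. Only the deployment changes, as long as the 
provider registers the same names and aliases.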

-Alan


More information about the core-libs-dev mailing list