Adding new IBM extended charsets

Alan Bateman Alan.Bateman at oracle.com
Thu Jul 19 10:14:15 UTC 2018


On 19/07/2018 08:27, Xueming Shen wrote:
> Hi Nasser,
>
> From OpenJDK's perspective, it would be preferable to direct developers
> to use the charset implementations provided by IBM, or by a reliable
> third party that has the appropriate knowledge, experience and resources
> to support/maintain those charsets, such as the icu4j charset project.
> I have been pulling the data from that huge icu-charset-data file and
> implementing/maintaining them based on my best knowledge, but I'm sure
> engineers from IBM or the icu project can do a much better job of
> implementing/maintaining/updating those charsets going forward.
>
> As a first step we can separate those IBM charsets from jdk.charsets
> into a separate package somewhere and configure them to be built into
> java.base and jdk.charsets, for the AIX platform only. Then we can
> further discuss the best way to handle/distribute those charsets that
> are not needed for the java.base module (for VM startup). As I said, it
> would be ideal if we could remove them from the openjdk repo/binaries
> completely and direct developers/users to the icu4j charset provider
> for those encodings, when needed. But given the possible compatibility
> concerns, we might want to phase this work out gradually in the next
> major release.
I agree. In terms of phasing, I don't think it would be too disruptive
if the EBCDIC charsets were dropped from jdk.charsets in JDK 12, at
least on the mainstream platforms. As we've established in this thread,
the ICU4J project does seem to publish its charset provider to Maven, so
there are alternatives for applications that really need these charsets.

Nasser - have you done any testing with the ICU4J charsets? I quickly
tried 62.1 and it seems to work fine on the class path. I didn't check
for any compatibility differences or compare the performance, but maybe
you have. It's a bit awkward to test this provider as an automatic
module due to the unusual naming of these JAR files. The ICU4J project
may not have looked at modules yet, but the ability to link the
icu4j.charsets module into a run-time image seems like something that
people may want to do in the future.
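
For anyone who wants to repeat the quick class path check, here is a
minimal sketch. It only uses java.nio.charset.Charset, so it runs with or
without the ICU4J JARs present. The class name CharsetProbe and the
default charset names are made up for illustration, and a round trip of
one short string says nothing about full compatibility or performance.
Note also that, as far as I understand the lookup order, the JDK's own
charsets are consulted before any provider found on the class path, so
the reported implementation class only points at the ICU4J provider when
the JDK itself doesn't supply that charset.

```java
import java.nio.charset.Charset;

// Hypothetical helper: reports which charsets the running JDK can resolve
// and which class implements each one.
class CharsetProbe {

    // Returns a one-line description of the charset, or a "not available"
    // message if no provider (built-in or on the class path) supplies it.
    public static String describe(String name) {
        if (!Charset.isSupported(name)) {
            return name + ": not available";
        }
        Charset cs = Charset.forName(name);
        return name + ": " + cs.name()
                + " implemented by " + cs.getClass().getName();
    }

    // Encodes the text with the named charset and decodes it again,
    // returning true if the result equals the input.
    public static boolean roundTrips(String name, String text) {
        Charset cs = Charset.forName(name);
        byte[] bytes = text.getBytes(cs);
        return text.equals(new String(bytes, cs));
    }

    public static void main(String[] args) {
        // IBM1047 and IBM037 are EBCDIC charsets; UTF-8 is a control case.
        String[] names = args.length > 0
                ? args
                : new String[] {"IBM1047", "IBM037", "UTF-8"};
        for (String name : names) {
            System.out.println(describe(name));
            if (Charset.isSupported(name)) {
                System.out.println("  round trip ok: "
                        + roundTrips(name, "HELLO 123"));
            }
        }
    }
}
```

To try it against ICU4J, put the ICU4J JAR and its charset JAR on the
class path along with the compiled class and pass the charset names of
interest as arguments.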

-Alan


More information about the core-libs-dev mailing list