<i18n dev> RFR: 8257733: Move module-specific data from make to respective module [v2]

Tue Dec 8 18:44:10 UTC 2020

On Tue, 8 Dec 2020 17:33:16 GMT, Alan Bateman <alanb at openjdk.org> wrote:

>> The mapping and nr tables, and the *-X.java.template files in make/data/charsetmapping are used to generate source code for the java.base and jdk.charsets modules. The stdcs-$OS files configure the package and module that each charset go into. If the tables used to generate the source files are moved to src/java.base then make/modules/jdk.charsets/Gensrc.gmk will probably need a new home too.
>
>> @AlanBateman The process of modularization was not fully completed with Project Jigsaw, and a few ugly warts remained. I was under the impression that these should be addressed in follow-up fixes, but this was unfortunately never done. Charsets and cldrconverter were both split between a core portion in java.base and the rest in jdk.charsets and jdk.localedata, respectively, but the split was never handled properly, but just "duct taped" in place.
> 
> This is a complicated area of the build, not really a Project Jigsaw issue. It's complicated because the source code for the charsets is generated at build time and the set of non-standard charsets included in java.base varies by platform, e.g. there's are several IBMxxx charsets in java.base when building on AIX that are not interesting to include in java.base on other platforms. This means we can't split up the mapping tables in make/data/charsetmapping and put them in different directories. If you are moving them into the src tree then src/java.base (as you have it) is best but will still have the ugly wart that some of these mapping tables will be used to generate code for the jdk.charsets module.

> @AlanBateman The process of modularization was not fully completed with Project Jigsaw, and a few ugly warts remained. I was under the impression that these should be addressed in follow-up fixes, but this was unfortunately never done. Charsets and cldrconverter were both split between a core portion in java.base and the rest in jdk.charsets and jdk.localedata, respectively, but the split was never handled properly, but just "duct taped" in place.
> 
> I chose to put the data files used for both java.base and the "additional" modules in java.base, based on the comment that Naoto made in https://mail.openjdk.java.net/pipermail/build-dev/2020-March/027044.html:
> 
> > As to charsetmapping and cldrconverter, I believe they can reside in
> > java.base, as jdk.charsets and jdk.localedata modules depend on it.
> 
> Of course it would be preferable to make a proper split, but that requires work done by the component teams to break the modules apart.
> 
> Specifically for make/modules/jdk.charsets/Gensrc.gmk; the code in that file is more or less duplicated in make/modules/java.base/gensrc/GensrcCharsetMapping.gmk, since the same data set is processed twice, once for java.base and once for jdk.charsets. I don't think that means that make/modules/jdk.charsets/Gensrc.gmk should move to any other place.

I still stand by what I wrote above. It's best to put data in java.base for charsets/localedata. Otherwise we would have to duplicate some in each modules source directory.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1611