[jdk8u] RFR: 8301119: Support for GB18030-2022 [v3]
Severin Gehwolf
sgehwolf at openjdk.org
Fri Jun 23 13:39:08 UTC 2023
On Fri, 23 Jun 2023 12:58:19 GMT, Andrew John Hughes <andrew at openjdk.org> wrote:
>> This is being proposed for inclusion in 8u382 during rampdown, so that the changes are in place for when GB18030-2022 enforcement begins in August. It modifies GB18030 to handle both the 2000 and the 2022 variant. The 2000 variant is available by setting `-Djdk.charset.GB18030=2000`.
>>
>> With the preceding test changes in place (#43 and #44), the changes needed for this are fairly minimal. The biggest divergence from 11u is in the character set providers. The changes in the `make` directory are not needed as 8u never moved to using a template for GB18030 in the first place (the 11u changes revert it back to being source-based). The change in the `SPI.java` generator tool moves into `ExtendedCharsets.java` in the class library, as the file is not auto-generated in 8u. The change to `StandardCharsets.java.template` lands in `AbstractCharsetProvider.java`.
>>
>> In 8u, the standard charsets are generated from a text file by a shell script, while the extended charsets are handled by a standard class. 11u moves GB18030 from extended to standard. I experimented with this in 8u, but it seemed more problematic than just keeping it in the extended set. The only reason I can see for moving it in 11u is it allows `IS_2000` to be package-private to `sun.nio.cs`, whereas we need to make it public in `sun.nio.cs.ext` so it can be accessed from `sun.nio.cs`.
>>
>> Using the 11u solution would mean either major rewrites to the shell script or bringing over the whole 11u change in how the standard charset provider is generated. Together with moving the character set to a different package, I think that is too risky and unnecessary for this change. The generation changes are necessary because the GB18030 character set needs to provide a different alias depending on whether it is the 2000 or 2022 variant. `genCharsetProvider.sh` would need the alterations we have added to `ExtendedCharsets.java` to handle this, but converted to awk.
>>
>> The only adjustment to `GB18030.java`, other than copyright headers, is to replace the use of `jdk.internal.misc.VM.initLevel` with that of `sun.misc.VM.isBooted`. 8u does not provide as fine-grained access to the initialisation status as 11u, and so may force the use of the 2022 standard until a later stage in the bootup (`BOOTED` is `initLevel() = 4` in 11u).
>>
>> With the tests, the adjustments are just due to differing bug IDs, the absence of `@modules` and the use of constructs (`var`) and library calls (`Set.of`) that don't exist in 8u....
>
> Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>
> Backport 5c4e744dabcf7785c35168db5d0458ccebfd41e6
These 4 tests consistently fail for me with a fastdebug build:
FAILED: java/nio/charset/Charset/RegisteredCharsets.java
FAILED: sun/nio/cs/mapping/CoderTest.java
FAILED: sun/nio/cs/mapping/TestConv.java
FAILED: sun/nio/cs/TestGB18030.java
`RegisteredCharsets.java` failure is:
java.nio.charset.UnsupportedCharsetException: gb18030-2022
at java.nio.charset.Charset.forName(Charset.java:531)
at RegisteredCharsets.aliasCheck(RegisteredCharsets.java:182)
at RegisteredCharsets.main(RegisteredCharsets.java:255)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
at java.lang.Thread.run(Thread.java:750)
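The alias lookup that fails above can be reproduced outside jtreg with a few lines. This is only a minimal sketch: the class name `AliasLookup` and the helper `canonicalNameOrNull` are hypothetical and not part of the test suite, and what the lookup prints depends entirely on the runtime it is run against.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

public class AliasLookup {
    /**
     * Returns the canonical charset name for the given name or alias,
     * or null if this runtime does not provide such a charset.
     */
    public static String canonicalNameOrNull(String alias) {
        try {
            return Charset.forName(alias).name();
        } catch (UnsupportedCharsetException | IllegalCharsetNameException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The aliases of interest for this PR; results vary by JDK build.
        String[] names = {"GB18030", "gb18030-2000", "gb18030-2022"};
        for (String name : names) {
            System.out.println(name + " -> " + canonicalNameOrNull(name));
        }
    }
}
```

A `null` for `gb18030-2022` corresponds to the `UnsupportedCharsetException` in the trace above.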
`CoderTest.java` failure is:
java.lang.Exception: Errors detected in 1 charset
at CoderTest.main(CoderTest.java:526)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
at java.lang.Thread.run(Thread.java:750)
specifically:
GB18030 (GB18030)
1 byte/char
decode
decode (direct)
encode
encode (direct)
2 bytes/char
decode
Error: a8bc --> U+e7c7, expected U+1e3f
Error: a6d9 --> U+e78d, expected U+fe10
Error: a6da --> U+e78e, expected U+fe12
Error: a6db --> U+e78f, expected U+fe11
Error: a6dc --> U+e790, expected U+fe13
Error: a6dd --> U+e791, expected U+fe14
Error: a6de --> U+e792, expected U+fe15
Error: a6df --> U+e793, expected U+fe16
Error: a6ec --> U+e794, expected U+fe17
Error: a6ed --> U+e795, expected U+fe18
Too many errors, giving up
decode (direct)
Error: a8bc --> U+e7c7, expected U+1e3f
Error: a6d9 --> U+e78d, expected U+fe10
Error: a6da --> U+e78e, expected U+fe12
Error: a6db --> U+e78f, expected U+fe11
Error: a6dc --> U+e790, expected U+fe13
Error: a6dd --> U+e791, expected U+fe14
Error: a6de --> U+e792, expected U+fe15
Error: a6df --> U+e793, expected U+fe16
Error: a6ec --> U+e794, expected U+fe17
Error: a6ed --> U+e795, expected U+fe18
Too many errors, giving up
encode
Error: U+1e3f --> 8135, expected a8bc
Error: U+2010 --> f437, expected a95c
Error: U+2013 --> a95c, expected a843
Error: U+2014 --> a843, expected a1aa
Error: U+2015 --> a1aa, expected a844
Error: U+2016 --> a844, expected a1ac
Error: U+2018 --> a1ac, expected a1ae
Error: U+2019 --> a1ae, expected a1af
Error: U+201c --> a1af, expected a1b0
Error: U+201d --> a1b0, expected a1b1
Too many errors, giving up
encode (direct)
Error: U+1e3f --> 8135, expected a8bc
Error: U+2010 --> f437, expected a95c
Error: U+2013 --> a95c, expected a843
Error: U+2014 --> a843, expected a1aa
Error: U+2015 --> a1aa, expected a844
Error: U+2016 --> a844, expected a1ac
Error: U+2018 --> a1ac, expected a1ae
Error: U+2019 --> a1ae, expected a1af
Error: U+201c --> a1af, expected a1b0
Error: U+201d --> a1b0, expected a1b1
Too many errors, giving up
3 bytes/char
decode
decode (direct)
encode
encode (direct)
4 bytes/char
decode
Error: 82359037 --> U+9fb4, expected U+e81e
Error: 82359038 --> U+9fb5, expected U+e826
Error: 82359039 --> U+9fb6, expected U+e82b
Error: 82359130 --> U+9fb7, expected U+e82c
Error: 82359131 --> U+9fb8, expected U+e832
Error: 82359132 --> U+9fb9, expected U+e843
Error: 82359133 --> U+9fba, expected U+e854
Error: 82359134 --> U+9fbb, expected U+e864
Error: 8135f437 --> U+1e3f, expected U+e7c7
Error: 84318236 --> U+fe10, expected U+e78d
Too many errors, giving up
decode (direct)
Error: 82359037 --> U+9fb4, expected U+e81e
Error: 82359038 --> U+9fb5, expected U+e826
Error: 82359039 --> U+9fb6, expected U+e82b
Error: 82359130 --> U+9fb7, expected U+e82c
Error: 82359131 --> U+9fb8, expected U+e832
Error: 82359132 --> U+9fb9, expected U+e843
Error: 82359133 --> U+9fba, expected U+e854
Error: 82359134 --> U+9fbb, expected U+e864
Error: 8135f437 --> U+1e3f, expected U+e7c7
Error: 84318236 --> U+fe10, expected U+e78d
Too many errors, giving up
encode
Error: U+e81e --> fe59fe61, expected 82359037
Error: U+e826 --> fe66fe67, expected 82359038
Error: U+e82b --> fe6dfe7e, expected 82359039
Error: U+e82c --> fe90fea0, expected 82359130
Error: U+e832 --> 82359135, expected 82359131
Error: U+e843 --> 82359136, expected 82359132
Error: U+e854 --> 82359137, expected 82359133
Error: U+e864 --> 82359138, expected 82359134
Error: U+9fbc --> 82359139, expected 82359135
Error: U+9fbd --> 82359230, expected 82359136
Too many errors, giving up
encode (direct)
Error: U+e81e --> fe59fe61, expected 82359037
Error: U+e826 --> fe66fe67, expected 82359038
Error: U+e82b --> fe6dfe7e, expected 82359039
Error: U+e82c --> fe90fea0, expected 82359130
Error: U+e832 --> 82359135, expected 82359131
Error: U+e843 --> 82359136, expected 82359132
Error: U+e854 --> 82359137, expected 82359133
Error: U+e864 --> 82359138, expected 82359134
Error: U+9fbc --> 82359139, expected 82359135
Error: U+9fbd --> 82359230, expected 82359136
Too many errors, giving up
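To check individual mappings from the log by hand, a small probe like the following may help. It is a sketch under assumptions: `DecodeProbe` and `decodeToCodePoints` are hypothetical names, and the two byte sequences are simply taken from the errors reported above. No expected output is given because it depends on which GB18030 variant the runtime selects.

```java
import java.nio.charset.Charset;

public class DecodeProbe {
    /**
     * Decodes the bytes with the named charset and returns the resulting
     * code points as a space-separated list such as "U+0041 U+20AC".
     * Malformed input is replaced rather than reported.
     */
    public static String decodeToCodePoints(String charsetName, byte[] bytes) {
        String s = new String(bytes, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("GB18030")) {
            System.out.println("GB18030 is not available in this runtime");
            return;
        }
        // Two-byte and four-byte sequences that appear in the CoderTest errors.
        byte[] twoByte = {(byte) 0xA8, (byte) 0xBC};
        byte[] fourByte = {(byte) 0x81, (byte) 0x35, (byte) 0xF4, (byte) 0x37};
        System.out.println("a8bc     --> " + decodeToCodePoints("GB18030", twoByte));
        System.out.println("8135f437 --> " + decodeToCodePoints("GB18030", fourByte));
    }
}
```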
`TestConv.java` failure is:
Checking GB18030_2000...
Not supported: GB18030_2000
[...]
Checking GB18030...
Warning 1: 0xA8BC -> \uE7C7 multi-mapping? \u1E3F
Warning 2: 0x82359037 -> \u9FB4 multi-mapping? \uE81E
Warning 3: 0x82359038 -> \u9FB5 multi-mapping? \uE826
Warning 4: 0x82359039 -> \u9FB6 multi-mapping? \uE82B
Warning 5: 0x82359130 -> \u9FB7 multi-mapping? \uE82C
Warning 6: 0x82359131 -> \u9FB8 multi-mapping? \uE832
Warning 7: 0x82359132 -> \u9FB9 multi-mapping? \uE843
Warning 8: 0x82359133 -> \u9FBA multi-mapping? \uE854
Warning 9: 0x82359134 -> \u9FBB multi-mapping? \uE864
Warning 10: 0xA6D9 -> \uE78D multi-mapping? \uFE10
Warning 11: 0xA6DA -> \uE78E multi-mapping? \uFE12
Warning 12: 0xA6DB -> \uE78F multi-mapping? \uFE11
Warning 13: 0xA6DC -> \uE790 multi-mapping? \uFE13
Warning 14: 0xA6DD -> \uE791 multi-mapping? \uFE14
Warning 15: 0xA6DE -> \uE792 multi-mapping? \uFE15
Warning 16: 0xA6DF -> \uE793 multi-mapping? \uFE16
Warning 17: 0xA6EC -> \uE794 multi-mapping? \uFE17
Warning 18: 0xA6ED -> \uE795 multi-mapping? \uFE18
Warning 19: 0xA6F3 -> \uE796 multi-mapping? \uFE19
Warning 20: 0x8135F437 -> \u1E3F multi-mapping? \uE7C7
Warning 21: 0xFE59 -> \uE81E multi-mapping? \u9FB4
Warning 22: 0xFE61 -> \uE826 multi-mapping? \u9FB5
Warning 23: 0xFE66 -> \uE82B multi-mapping? \u9FB6
Warning 24: 0xFE67 -> \uE82C multi-mapping? \u9FB7
Warning 25: 0xFE6D -> \uE832 multi-mapping? \u9FB8
Warning 26: 0xFE7E -> \uE843 multi-mapping? \u9FB9
Warning 27: 0xFE90 -> \uE854 multi-mapping? \u9FBA
Warning 28: 0xFEA0 -> \uE864 multi-mapping? \u9FBB
Warning 29: 0x84318236 -> \uFE10 multi-mapping? \uE78D
Warning 30: 0x84318237 -> \uFE11 multi-mapping? \uE78F
Warning 31: 0x84318238 -> \uFE12 multi-mapping? \uE78E
Warning 32: 0x84318239 -> \uFE13 multi-mapping? \uE790
Warning 33: 0x84318330 -> \uFE14 multi-mapping? \uE791
Warning 34: 0x84318331 -> \uFE15 multi-mapping? \uE792
Warning 35: 0x84318332 -> \uFE16 multi-mapping? \uE793
Warning 36: 0x84318333 -> \uFE17 multi-mapping? \uE794
Warning 37: 0x84318334 -> \uFE18 multi-mapping? \uE795
Warning 38: 0x84318335 -> \uFE19 multi-mapping? \uE796
----------System.err:(14/735)----------
java.lang.RuntimeException: 38 Warning(s).
at TestConv.check(TestConv.java:124)
at TestConv.main(TestConv.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
at java.lang.Thread.run(Thread.java:750)
JavaTest Message: Test threw exception: java.lang.RuntimeException: 38 Warning(s).
JavaTest Message: shutting down test
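The warnings above pair each byte sequence with two candidate code points. One rough way to see which mapping the runtime actually uses is an encode/decode round trip per code point. The sketch below is not how `TestConv.java` works internally; `RoundTripCheck` and `roundTrips` are hypothetical names, and the code points are just a sample taken from the warnings.

```java
import java.nio.charset.Charset;

public class RoundTripCheck {
    /**
     * Returns true if encoding the code point with the charset and decoding
     * the resulting bytes yields the original code point again.
     */
    public static boolean roundTrips(Charset cs, int codePoint) {
        String original = new String(Character.toChars(codePoint));
        byte[] encoded = original.getBytes(cs);
        String decoded = new String(encoded, cs);
        return original.equals(decoded);
    }

    public static void main(String[] args) {
        if (!Charset.isSupported("GB18030")) {
            System.out.println("GB18030 is not available in this runtime");
            return;
        }
        Charset gb = Charset.forName("GB18030");
        // Code points that show up on both sides of the warnings above.
        int[] samples = {0x1E3F, 0xE7C7, 0x9FB4, 0xE81E, 0xFE10, 0xE78D};
        for (int cp : samples) {
            System.out.println(String.format("U+%04X round-trips: %b", cp, roundTrips(gb, cp)));
        }
    }
}
```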
`TestGB18030.java` failure is:
checkAlias(): IS_2000: false, expected: [gb18030-2022], found: [gb18030-2000]
----------System.err:(14/754)----------
java.lang.RuntimeException: Result mismatch
at TestGB18030.checkAlias(TestGB18030.java:89)
at TestGB18030.main(TestGB18030.java:95)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
at java.lang.Thread.run(Thread.java:750)
JavaTest Message: Test threw exception: java.lang.RuntimeException: Result mismatch
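The alias mismatch can also be checked standalone. The sketch below assumes, based on the PR description and the `checkAlias()` output above, that the charset should report the alias `gb18030-2000` when `-Djdk.charset.GB18030=2000` is set and `gb18030-2022` otherwise. `AliasCheck` and `expectedAlias` are hypothetical names, not code from the patch.

```java
import java.nio.charset.Charset;
import java.util.Set;

public class AliasCheck {
    /**
     * Returns the alias expected for the given value of the
     * jdk.charset.GB18030 system property (assumption: only the
     * exact value "2000" selects the 2000 variant).
     */
    public static String expectedAlias(String propertyValue) {
        return "2000".equals(propertyValue) ? "gb18030-2000" : "gb18030-2022";
    }

    public static void main(String[] args) {
        String expected = expectedAlias(System.getProperty("jdk.charset.GB18030"));
        if (!Charset.isSupported("GB18030")) {
            System.out.println("GB18030 is not available in this runtime");
            return;
        }
        Set<String> aliases = Charset.forName("GB18030").aliases();
        System.out.println("expected alias: " + expected);
        System.out.println("found aliases:  " + aliases);
        System.out.println("match: " + aliases.contains(expected));
    }
}
```

Running it once without the property and once with `-Djdk.charset.GB18030=2000` should show whether the alias follows the property on a given build.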
-------------
PR Review: https://git.openjdk.org/jdk8u/pull/45#pullrequestreview-1495164844