[jdk8u] RFR: 8301119: Support for GB18030-2022 [v3]

Severin Gehwolf sgehwolf at openjdk.org
Fri Jun 23 13:39:08 UTC 2023


On Fri, 23 Jun 2023 12:58:19 GMT, Andrew John Hughes <andrew at openjdk.org> wrote:

>> This is being proposed for inclusion in 8u382 during rampdown, so that the changes are in place for when GB18030-2022 enforcement begins in August. It modifies GB18030 to handle both the 2000 and the 2022 variant. The 2000 variant is available by setting `-Djdk.charset.GB18030=2000`.
>> 
>> With the preceding test changes in place (#43 and #44), the changes needed for this are fairly minimal. The biggest divergence from 11u is in the character set providers. The changes in the `make` directory are not needed as 8u never moved to using a template for GB18030 in the first place (the 11u changes revert it back to being source-based). The change in the `SPI.java` generator tool moves into `ExtendedCharsets.java` in the class library, as the file is not auto-generated in 8u. The change to `StandardCharsets.java.template` lands in `AbstractCharsetProvider.java`.
>> 
>> In 8u, the standard charsets are generated from a text file by a shell script, while the extended charsets are handled by a standard class. 11u moves GB18030 from extended to standard. I experimented with this in 8u, but it seemed more problematic than just keeping it in the extended set. The only reason I can see for moving it in 11u is that it allows `IS_2000` to be package-private to `sun.nio.cs`, whereas in 8u we need to make it public in `sun.nio.cs.ext` so it can be accessed from `sun.nio.cs`.
>> 
>> Using the 11u solution would mean major rewrites to the shell script or bringing over the whole change in how the standard charset provider is generated from 11u, which I think, along with moving the package the character set is in, is too risky and unnecessary for this change. The generation changes are necessary because the GB18030 character set needs to provide a different alias, depending on whether it is the 2000 or 2022 variant. The `genCharsetProvider.sh` script would need the alterations we have added to `ExtendedCharsets.java` to handle this, but converted to awk.
>> 
>> The only adjustment to `GB18030.java`, other than copyright headers, is to replace the use of `jdk.internal.misc.VM.initLevel` with that of `sun.misc.VM.isBooted`. 8u does not provide as fine-grained access to the initialisation status as 11u, and so may force the use of the 2022 standard until a later stage in the bootup (`BOOTED` is `initLevel() = 4` in 11u).
>> 
>> With the tests, the adjustments are just due to differing bug IDs, the absence of `@modules` and the use of constructs (`var`) and library calls (`Set.of`) that don't exist in 8u....
>
> Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
> 
>   Backport 5c4e744dabcf7785c35168db5d0458ccebfd41e6

These 4 tests consistently fail for me with a fastdebug build:


FAILED: java/nio/charset/Charset/RegisteredCharsets.java
FAILED: sun/nio/cs/mapping/CoderTest.java
FAILED: sun/nio/cs/mapping/TestConv.java
FAILED: sun/nio/cs/TestGB18030.java


`RegisteredCharsets.java` failure is:

java.nio.charset.UnsupportedCharsetException: gb18030-2022
        at java.nio.charset.Charset.forName(Charset.java:531)
        at RegisteredCharsets.aliasCheck(RegisteredCharsets.java:182)
        at RegisteredCharsets.main(RegisteredCharsets.java:255)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.lang.Thread.run(Thread.java:750)

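For anyone who wants to reproduce the lookup outside jtreg, here is a minimal sketch of an alias probe. It is not the code in `RegisteredCharsets.java`; the class name `Gb18030AliasProbe` and the helper `resolve` are made up for illustration, and it only uses `Charset.forName`.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of the JDK test suite.
public class Gb18030AliasProbe {

    /** Returns the canonical name the alias resolves to, or null if the lookup fails. */
    static String resolve(String alias) {
        try {
            return Charset.forName(alias).name();
        } catch (UnsupportedCharsetException | IllegalCharsetNameException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Aliases taken from the test output in this review.
        for (String alias : new String[] {"GB18030", "gb18030-2022", "gb18030-2000"}) {
            String name = resolve(alias);
            System.out.println(alias + " -> " + (name == null ? "unsupported" : name));
        }
    }
}
```

On the build that produced the trace above, I would expect `gb18030-2022` to be reported as unsupported, since that is the lookup that threw.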

`CoderTest.java` failure is:


java.lang.Exception: Errors detected in 1 charset
        at CoderTest.main(CoderTest.java:526)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.lang.Thread.run(Thread.java:750)


specifically:


GB18030 (GB18030)
  1 byte/char
    decode
    decode (direct)
    encode
    encode (direct)
  2 bytes/char
    decode
      Error: a8bc --> U+e7c7, expected U+1e3f
      Error: a6d9 --> U+e78d, expected U+fe10
      Error: a6da --> U+e78e, expected U+fe12
      Error: a6db --> U+e78f, expected U+fe11
      Error: a6dc --> U+e790, expected U+fe13
      Error: a6dd --> U+e791, expected U+fe14
      Error: a6de --> U+e792, expected U+fe15
      Error: a6df --> U+e793, expected U+fe16
      Error: a6ec --> U+e794, expected U+fe17
      Error: a6ed --> U+e795, expected U+fe18
      Too many errors, giving up
    decode (direct)
      Error: a8bc --> U+e7c7, expected U+1e3f
      Error: a6d9 --> U+e78d, expected U+fe10
      Error: a6da --> U+e78e, expected U+fe12
      Error: a6db --> U+e78f, expected U+fe11
      Error: a6dc --> U+e790, expected U+fe13
      Error: a6dd --> U+e791, expected U+fe14
      Error: a6de --> U+e792, expected U+fe15
      Error: a6df --> U+e793, expected U+fe16
      Error: a6ec --> U+e794, expected U+fe17
      Error: a6ed --> U+e795, expected U+fe18
      Too many errors, giving up
    encode
      Error: U+1e3f --> 8135, expected a8bc
      Error: U+2010 --> f437, expected a95c
      Error: U+2013 --> a95c, expected a843
      Error: U+2014 --> a843, expected a1aa
      Error: U+2015 --> a1aa, expected a844
      Error: U+2016 --> a844, expected a1ac
      Error: U+2018 --> a1ac, expected a1ae
      Error: U+2019 --> a1ae, expected a1af
      Error: U+201c --> a1af, expected a1b0
      Error: U+201d --> a1b0, expected a1b1
      Too many errors, giving up
    encode (direct)
      Error: U+1e3f --> 8135, expected a8bc
      Error: U+2010 --> f437, expected a95c
      Error: U+2013 --> a95c, expected a843
      Error: U+2014 --> a843, expected a1aa
      Error: U+2015 --> a1aa, expected a844
      Error: U+2016 --> a844, expected a1ac
      Error: U+2018 --> a1ac, expected a1ae
      Error: U+2019 --> a1ae, expected a1af
      Error: U+201c --> a1af, expected a1b0
      Error: U+201d --> a1b0, expected a1b1
      Too many errors, giving up
  3 bytes/char
    decode
    decode (direct)
    encode
    encode (direct)
  4 bytes/char
    decode
      Error: 82359037 --> U+9fb4, expected U+e81e
      Error: 82359038 --> U+9fb5, expected U+e826
      Error: 82359039 --> U+9fb6, expected U+e82b
      Error: 82359130 --> U+9fb7, expected U+e82c
      Error: 82359131 --> U+9fb8, expected U+e832
      Error: 82359132 --> U+9fb9, expected U+e843
      Error: 82359133 --> U+9fba, expected U+e854
      Error: 82359134 --> U+9fbb, expected U+e864
      Error: 8135f437 --> U+1e3f, expected U+e7c7
      Error: 84318236 --> U+fe10, expected U+e78d
      Too many errors, giving up
    decode (direct)
      Error: 82359037 --> U+9fb4, expected U+e81e
      Error: 82359038 --> U+9fb5, expected U+e826
      Error: 82359039 --> U+9fb6, expected U+e82b
      Error: 82359130 --> U+9fb7, expected U+e82c
      Error: 82359131 --> U+9fb8, expected U+e832
      Error: 82359132 --> U+9fb9, expected U+e843
      Error: 82359133 --> U+9fba, expected U+e854
      Error: 82359134 --> U+9fbb, expected U+e864
      Error: 8135f437 --> U+1e3f, expected U+e7c7
      Error: 84318236 --> U+fe10, expected U+e78d
      Too many errors, giving up
    encode
      Error: U+e81e --> fe59fe61, expected 82359037
      Error: U+e826 --> fe66fe67, expected 82359038
      Error: U+e82b --> fe6dfe7e, expected 82359039
      Error: U+e82c --> fe90fea0, expected 82359130
      Error: U+e832 --> 82359135, expected 82359131
      Error: U+e843 --> 82359136, expected 82359132
      Error: U+e854 --> 82359137, expected 82359133
      Error: U+e864 --> 82359138, expected 82359134
      Error: U+9fbc --> 82359139, expected 82359135
      Error: U+9fbd --> 82359230, expected 82359136
      Too many errors, giving up
    encode (direct)
      Error: U+e81e --> fe59fe61, expected 82359037
      Error: U+e826 --> fe66fe67, expected 82359038
      Error: U+e82b --> fe6dfe7e, expected 82359039
      Error: U+e82c --> fe90fea0, expected 82359130
      Error: U+e832 --> 82359135, expected 82359131
      Error: U+e843 --> 82359136, expected 82359132
      Error: U+e854 --> 82359137, expected 82359133
      Error: U+e864 --> 82359138, expected 82359134
      Error: U+9fbc --> 82359139, expected 82359135
      Error: U+9fbd --> 82359230, expected 82359136
      Too many errors, giving up

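The decode errors can be spot-checked with a few lines of standalone code. This is only a sketch under the assumption that the `GB18030` charset is available in the runtime; the class name `Gb18030DecodeProbe` and the helper `decodeToCodePoint` are hypothetical and not taken from `CoderTest.java`. The byte sequences are the ones from the log above.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the JDK test suite.
public class Gb18030DecodeProbe {

    /** Decodes the given bytes as GB18030 and formats the first code point as U+XXXX. */
    static String decodeToCodePoint(int... bytes) {
        byte[] raw = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            raw[i] = (byte) bytes[i];
        }
        String decoded = new String(raw, Charset.forName("GB18030"));
        return String.format("U+%04X", decoded.codePointAt(0));
    }

    public static void main(String[] args) {
        // The test expected U+1E3F for a8bc and U+E7C7 for 8135f437.
        System.out.println("a8bc     -> " + decodeToCodePoint(0xA8, 0xBC));
        System.out.println("8135f437 -> " + decodeToCodePoint(0x81, 0x35, 0xF4, 0x37));
    }
}
```

If `a8bc` comes back as U+E7C7, the mapping in effect matches what the failing run produced rather than what the test expected.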

`TestConv.java` failure is:


Checking GB18030_2000...
Not supported: GB18030_2000
[...]
Checking GB18030...
Warning 1: 0xA8BC -> \uE7C7  multi-mapping? \u1E3F
Warning 2: 0x82359037 -> \u9FB4  multi-mapping? \uE81E
Warning 3: 0x82359038 -> \u9FB5  multi-mapping? \uE826
Warning 4: 0x82359039 -> \u9FB6  multi-mapping? \uE82B
Warning 5: 0x82359130 -> \u9FB7  multi-mapping? \uE82C
Warning 6: 0x82359131 -> \u9FB8  multi-mapping? \uE832
Warning 7: 0x82359132 -> \u9FB9  multi-mapping? \uE843
Warning 8: 0x82359133 -> \u9FBA  multi-mapping? \uE854
Warning 9: 0x82359134 -> \u9FBB  multi-mapping? \uE864
Warning 10: 0xA6D9 -> \uE78D  multi-mapping? \uFE10
Warning 11: 0xA6DA -> \uE78E  multi-mapping? \uFE12
Warning 12: 0xA6DB -> \uE78F  multi-mapping? \uFE11
Warning 13: 0xA6DC -> \uE790  multi-mapping? \uFE13
Warning 14: 0xA6DD -> \uE791  multi-mapping? \uFE14
Warning 15: 0xA6DE -> \uE792  multi-mapping? \uFE15
Warning 16: 0xA6DF -> \uE793  multi-mapping? \uFE16
Warning 17: 0xA6EC -> \uE794  multi-mapping? \uFE17
Warning 18: 0xA6ED -> \uE795  multi-mapping? \uFE18
Warning 19: 0xA6F3 -> \uE796  multi-mapping? \uFE19
Warning 20: 0x8135F437 -> \u1E3F  multi-mapping? \uE7C7
Warning 21: 0xFE59 -> \uE81E  multi-mapping? \u9FB4
Warning 22: 0xFE61 -> \uE826  multi-mapping? \u9FB5
Warning 23: 0xFE66 -> \uE82B  multi-mapping? \u9FB6
Warning 24: 0xFE67 -> \uE82C  multi-mapping? \u9FB7
Warning 25: 0xFE6D -> \uE832  multi-mapping? \u9FB8
Warning 26: 0xFE7E -> \uE843  multi-mapping? \u9FB9
Warning 27: 0xFE90 -> \uE854  multi-mapping? \u9FBA
Warning 28: 0xFEA0 -> \uE864  multi-mapping? \u9FBB
Warning 29: 0x84318236 -> \uFE10  multi-mapping? \uE78D
Warning 30: 0x84318237 -> \uFE11  multi-mapping? \uE78F
Warning 31: 0x84318238 -> \uFE12  multi-mapping? \uE78E
Warning 32: 0x84318239 -> \uFE13  multi-mapping? \uE790
Warning 33: 0x84318330 -> \uFE14  multi-mapping? \uE791
Warning 34: 0x84318331 -> \uFE15  multi-mapping? \uE792
Warning 35: 0x84318332 -> \uFE16  multi-mapping? \uE793
Warning 36: 0x84318333 -> \uFE17  multi-mapping? \uE794
Warning 37: 0x84318334 -> \uFE18  multi-mapping? \uE795
Warning 38: 0x84318335 -> \uFE19  multi-mapping? \uE796
----------System.err:(14/735)----------
java.lang.RuntimeException: 38 Warning(s).
        at TestConv.check(TestConv.java:124)
        at TestConv.main(TestConv.java:47)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.lang.Thread.run(Thread.java:750)

JavaTest Message: Test threw exception: java.lang.RuntimeException: 38 Warning(s).
JavaTest Message: shutting down test

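The warnings pair up a two-byte and a four-byte sequence for the same pair of code points (for example `0xA8BC` and `0x8135F437` for U+1E3F and U+E7C7), so the encode direction is worth checking as well. A rough sketch, again assuming `GB18030` is available; `Gb18030EncodeProbe` and `encodeToHex` are hypothetical names and this is not how `TestConv.java` itself works.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the JDK test suite.
public class Gb18030EncodeProbe {

    /** Encodes a single code point as GB18030 and returns the bytes as lowercase hex. */
    static String encodeToHex(int codePoint) {
        String text = new String(Character.toChars(codePoint));
        byte[] raw = text.getBytes(Charset.forName("GB18030"));
        StringBuilder hex = new StringBuilder();
        for (byte b : raw) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Code points taken from warnings 1 and 20 above.
        System.out.println("U+1E3F -> " + encodeToHex(0x1E3F));
        System.out.println("U+E7C7 -> " + encodeToHex(0xE7C7));
    }
}
```

Running it once with and once without `-Djdk.charset.GB18030=2000` on a patched build should show the two byte sequences swapping between the two code points, if the property is being honoured.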

`TestGB18030.java` failure is:


checkAlias(): IS_2000: false, expected: [gb18030-2022], found: [gb18030-2000]
----------System.err:(14/754)----------
java.lang.RuntimeException: Result mismatch
        at TestGB18030.checkAlias(TestGB18030.java:89)
        at TestGB18030.main(TestGB18030.java:95)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.lang.Thread.run(Thread.java:750)

JavaTest Message: Test threw exception: java.lang.RuntimeException: Result mismatch

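The alias mismatch can also be checked by hand. The sketch below is my reading of what `checkAlias()` compares, based only on its output line above and on the `-Djdk.charset.GB18030=2000` switch from the PR description; it is not the test's actual code, and `Gb18030AliasCheck` and `expectedAlias` are hypothetical names. It prints whether the alias set of `GB18030` contains the alias that the property value implies.

```java
import java.nio.charset.Charset;
import java.util.Set;

// Hypothetical helper, not the code in TestGB18030.java.
public class Gb18030AliasCheck {

    /** Alias assumed to be expected for a given value of the jdk.charset.GB18030 property. */
    static String expectedAlias(String propertyValue) {
        return "2000".equals(propertyValue) ? "gb18030-2000" : "gb18030-2022";
    }

    public static void main(String[] args) {
        String expected = expectedAlias(System.getProperty("jdk.charset.GB18030"));
        Set<String> found = Charset.forName("GB18030").aliases();
        System.out.println("expected: " + expected
                + ", found: " + found
                + ", match: " + found.contains(expected));
    }
}
```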
-------------

PR Review: https://git.openjdk.org/jdk8u/pull/45#pullrequestreview-1495164844

