<i18n dev> RFR: 8290488: IBM864 character encoding implementation bug
Ichiroh Takiguchi
itakiguchi at openjdk.org
Thu Jul 28 10:32:48 UTC 2022
On Thu, 28 Jul 2022 01:46:26 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> Hello @naotoj .
>> I'm not reviewer, but I'd like to test this change.
>> Could you wait for a moment ?
>> Thanks.
>
> @takiguc Sure. Appreciate it.
Many thanks @naotoj .
I checked the latest IBM-864 mapping table.
(I assume the current OpenJDK IBM864 mapping may be based on an older mapping table.)
https://raw.githubusercontent.com/unicode-org/icu/main/icu4c/source/data/mappings/ibm-864_X110-1999.ucm
The .ucm file format is described here:
https://unicode-org.github.io/icu/userguide/conversion/data.html#ucm-file-format
I compared the roundtrip mappings. The entries that differ are:
| IBM864.map | ibm-864_X110-1999.ucm |
| --- | --- |
| 0x1a U+001a | 0x1a U+001c |
| 0x1c U+001c | 0x1c U+007f |
| **0x25 U+066a** | **0x25 U+0025** |
| 0x7f U+007f | 0x7f U+001a |
| 0x9f U+fffd | 0x9f U+200b |
| 0xd7 U+fec1 | 0xd7 U+fec3 |
| 0xd8 U+fec5 | 0xd8 U+fec7 |
| 0xf1 U+0651 | 0xf1 U+fe7c |
**Note**: The 0x1a <-> U+001c / 0x1c <-> U+007f / 0x7f <-> U+001a entries are the control character rotation for DOS.
I think these three entries should be ignored.
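To see which column of the table a given JDK build follows, the bytes in question can be decoded one at a time. This is only a rough sketch: the class name `MappingProbe` and the helper `describe` are made up for illustration, and it assumes the `IBM864` charset is available in the runtime (it checks `Charset.isSupported` first). The output depends on the JDK version, so no expected result is shown.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the JDK or of this PR.
class MappingProbe {
    // Decode a single byte with the given charset and format it
    // in the same "0xNN U+NNNN" style used in the table.
    static String describe(Charset cs, int b) {
        String s = new String(new byte[] {(byte) b}, cs);
        return String.format("0x%02x U+%04x", b, (int) s.charAt(0));
    }

    public static void main(String[] args) {
        String name = "IBM864";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        Charset cs = Charset.forName(name);
        // The non-control entries that differ between the two tables.
        for (int b : new int[] {0x25, 0x9f, 0xd7, 0xd8, 0xf1}) {
            System.out.println(describe(cs, b));
        }
    }
}
```

An unmapped byte is reported as U+fffd, because `new String(byte[], Charset)` substitutes the replacement character.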
I think the roundtrip side should be changed as follows:
- Change the 0x25 entry to U+0025 in IBM864.map
- Add `0x25 U+066a` to IBM864.c2b
- Modify test/jdk/sun/nio/cs/mapping/Cp864.b2c to contain `0025 0025`
- Add `0025 066a` to test/jdk/sun/nio/cs/mapping/Cp864.c2b-irreversible
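The intended effect of these changes can be sketched with two small lookup tables. This is a toy model, not the generated JDK charset code: the class `Ibm864MappingSketch` and its methods are hypothetical, and it only contains the two 0x25 entries discussed here. It assumes that a roundtrip entry is used for both decoding and encoding, while a c2b entry is used for encoding only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the proposed 0x25 mapping; not JDK code.
class Ibm864MappingSketch {
    static final char PERCENT = 0x0025;        // U+0025 PERCENT SIGN
    static final char ARABIC_PERCENT = 0x066A; // U+066A ARABIC PERCENT SIGN
    static final char REPLACEMENT = 0xFFFD;    // returned for unmapped bytes

    // byte -> char, filled from roundtrip entries only (like IBM864.map)
    static final Map<Integer, Character> B2C = new HashMap<>();
    // char -> byte, filled from roundtrip and encode-only entries (like IBM864.c2b)
    static final Map<Character, Integer> C2B = new HashMap<>();

    static {
        addRoundtrip(0x25, PERCENT);         // 0x25 <-> U+0025
        addEncodeOnly(0x25, ARABIC_PERCENT); // U+066A -> 0x25, irreversible
    }

    static void addRoundtrip(int b, char c) {
        B2C.put(b, c);
        C2B.put(c, b);
    }

    static void addEncodeOnly(int b, char c) {
        C2B.put(c, b);
    }

    static char decode(int b) {
        return B2C.getOrDefault(b, REPLACEMENT);
    }

    // Returns -1 when the character has no mapping in this toy table.
    static int encode(char c) {
        return C2B.getOrDefault(c, -1);
    }

    public static void main(String[] args) {
        System.out.println(String.format("decode 0x25   -> U+%04x", (int) decode(0x25)));
        System.out.println(String.format("encode U+0025 -> 0x%02x", encode(PERCENT)));
        System.out.println(String.format("encode U+066a -> 0x%02x", encode(ARABIC_PERCENT)));
    }
}
```

In this model both U+0025 and U+066A encode to 0x25, but 0x25 always decodes to U+0025, which is what the `0025 0025` and `0025 066a` test entries above describe.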
This issue is just about U+0025, but if possible, please also include the `0x9f`, `0xd7`, `0xd8`, and `0xf1` entries.
-------------
PR: https://git.openjdk.org/jdk/pull/9661