<i18n dev> RFR: 8303018: Unicode Emoji Properties [v2]
Roger Riggs
rriggs at openjdk.org
Tue Mar 14 21:46:46 UTC 2023
On Tue, 14 Mar 2023 15:49:56 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> Proposing accessor methods to Emoji properties defined in [Unicode Technical Standard #51](https://unicode.org/reports/tr51/) in `java.lang.Character` class. This is per a request from the client group, as well as refining the currently existing ad-hoc emoji implementation in regex. A CSR has also been drafted, and I would appreciate reviews for it too.
>
> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>
> Fixed method descriptions
Looks good.
There are opportunities to modernize the code, but maybe separately.
make/jdk/src/classes/build/tools/generatecharacter/EmojiData.java line 99:
> 97: case "Emoji_Component" -> EMOJI_COMPONENT;
> 98: case "Extended_Pictographic" -> EXTENDED_PICTOGRAPHIC;
> 99: default -> throw new InternalError();
It would be useful to include the "type" as the exception argument. It give some idea as to the corruption or missing case.
make/jdk/src/classes/build/tools/generatecharacter/GenerateCharacter.java line 214:
> 212: maskEmojiModifierBase = 0x020000000000L,
> 213: maskEmojiComponent = 0x040000000000L,
> 214: maskExtendedPictographic = 0x080000000000L;
It would be good to leverage a common definition (perhaps a bit number) here and in EmojiData.java
and build the constants with <<< shifts.
make/jdk/src/classes/build/tools/generatecharacter/GenerateCharacter.java line 810:
> 808: if (x.equals("maskEmojiModifierBase")) return "0x" + hex4(maskEmojiModifierBase >> 32);
> 809: if (x.equals("maskEmojiComponent")) return "0x" + hex4(maskEmojiComponent >> 32);
> 810: if (x.equals("maskExtendedPictographic")) return "0x" + hex4(maskExtendedPictographic >> 32);
An upgrade would be to modify hex4(), hexNN() to use `HexFormat.of().toUpperCase().toHexDigits((short)xxx)`
The HexFormat is reusable and would avoid creating extra strings.
Perhaps also create a method that combines the repetitive shift and prefixing.
This if...then... sequence could be an expression switch (x) {...}.
-------------
Marked as reviewed by rriggs (Reviewer).
PR: https://git.openjdk.org/jdk/pull/13006
More information about the i18n-dev
mailing list