<i18n dev> RFR: 8291660: Grapheme support in BreakIterator [v4]
Naoto Sato
naoto at openjdk.org
Thu Sep 8 16:14:30 UTC 2022
On Wed, 7 Sep 2022 21:27:10 GMT, Stuart Marks <smarks at openjdk.org> wrote:
>> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Changed the paragraph to @implSpec
>
> src/java.base/share/classes/jdk/internal/util/regex/Grapheme.java line 47:
>
>> 45: */
>> 46: public static int nextBoundary(CharSequence src, int off, int limit) {
>> 47: Objects.checkFromToIndex(0, limit - off, src.length());
>
> Is this right? The old code's use of the `checkFromToIndex` method seems to be the right way to check that `off` and `limit` form a valid from-to range within `[0, src.length)`. The new code subtracts `off` from both arguments, but that arithmetic seems to allow for some errors. For example, depending on the value of `limit`, it might permit `off` to be a small negative number.
Thanks for the catch! Yes, this was left over from before I fixed a couple of JCK failures, and those fixes handle the edge cases correctly. Reverted the change.
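To illustrate the concern raised above, here is a minimal sketch comparing the shifted check quoted in the review with a direct check. The direct form `Objects.checkFromToIndex(off, limit, src.length())` is an assumption about what the old code looked like, and the class and helper names (`RangeCheckDemo`, `shiftedCheckAccepts`, `directCheckAccepts`) are hypothetical, used only for this demonstration.

```java
import java.util.Objects;

public class RangeCheckDemo {
    // The check as quoted in the review: both arguments shifted by `off`.
    static boolean shiftedCheckAccepts(CharSequence src, int off, int limit) {
        try {
            Objects.checkFromToIndex(0, limit - off, src.length());
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    // Assumed form of the old check: `off` and `limit` validated directly.
    static boolean directCheckAccepts(CharSequence src, int off, int limit) {
        try {
            Objects.checkFromToIndex(off, limit, src.length());
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        CharSequence src = "hello"; // length 5
        int off = -1;               // invalid: negative offset
        int limit = 3;

        // limit - off == 4, which is <= 5, so the shifted check passes.
        System.out.println("shifted check accepts: " + shiftedCheckAccepts(src, off, limit));
        // The direct check rejects the negative `off`.
        System.out.println("direct check accepts:  " + directCheckAccepts(src, off, limit));
    }
}
```

With a length-5 sequence, `off = -1` and `limit = 3`, the shifted check only sees the range `[0, 4)` and accepts it, while the direct check rejects the negative `off`.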
> src/java.base/share/classes/sun/util/locale/provider/BreakIteratorProviderImpl.java line 135:
>
>> 133: public BreakIterator getCharacterInstance(Locale locale) {
>> 134: return new GraphemeBreakIterator();
>> 135: }
>
> It looks like there is some kind of table. Since `CHARACTER_INDEX` is no longer used, does that mean there is now dead code for the CHARACTER break iterator class, and dead resources for CharacterData and CharacterDictionary? Should these be removed? Or maybe this is all in each locale or something and should be cleaned up later?
Right. Removed the now-dead code.
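For context, callers reach this code path through the public `java.text.BreakIterator.getCharacterInstance` factory. The following is a minimal usage sketch, not part of the PR; the class and method names (`CharacterBreakDemo`, `boundaries`) are hypothetical. The boundaries reported for complex sequences may differ between JDK releases, so only the plain ASCII case is annotated.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CharacterBreakDemo {
    // Collects every boundary offset the character break iterator reports for `text`.
    static List<Integer> boundaries(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            result.add(b);
        }
        return result;
    }

    public static void main(String[] args) {
        // Plain ASCII: one boundary before and after each char.
        System.out.println(boundaries("abc")); // [0, 1, 2, 3]
        // 'e' followed by a combining acute accent; output depends on the iterator's rules.
        System.out.println(boundaries("e\u0301"));
    }
}
```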
-------------
PR: https://git.openjdk.org/jdk/pull/9991