<i18n dev> RFR: 8041488: Locale-Dependent List Patterns [v12]

Naoto Sato naoto at openjdk.org
Tue Sep 5 19:53:44 UTC 2023


On Sun, 3 Sep 2023 05:05:49 GMT, Joe Wang <joehw at openjdk.org> wrote:

>> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Removing unnecessary commas
>
> src/java.base/share/classes/java/text/ListFormat.java line 46:
> 
>> 44:  * defined in Unicode Consortium's LDML specification for
>> 45:  * <a href="https://www.unicode.org/reports/tr35/tr35-general.html#ListPatterns">
>> 46:  * List Patterns</a>.
> 
> The main function, it seems to me, is to change the representation from one form to another. What would you think about the following:
> 
> The {@code ListFormat} class is a tool for converting a list of strings to a text representation and vice versa in a locale-sensitive way. It transforms strings to text in accordance with the List Patterns (link) as defined in Unicode Consortium's LDML specification. For example, it can be used to format a list of 3 weekdays, i.e. "Monday", "Wednesday", "Friday", as "Monday, Wednesday, and Friday" in an inclusive list pattern.

Thanks. Will modify the wording in the next revision. I think we should stick to the wording `format`/`parse` here.
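
For readers unfamiliar with LDML list patterns, here is a minimal sketch of how the "2", "start", "middle" and "end" patterns combine into the weekday example quoted above. It is not the ListFormat implementation: the class name, the helper names and the hard-coded English patterns are assumptions made for illustration, and it assumes {0} precedes {1} in every pattern.

```java
import java.util.List;

class ListPatternSketch {
    // English-like patterns, assumed here for illustration only.
    static final String TWO = "{0} and {1}";
    static final String START = "{0}, {1}";
    static final String MIDDLE = "{0}, {1}";
    static final String END = "{0}, and {1}";

    // Substitutes the two placeholders; assumes {0} appears before {1}.
    static String apply(String pattern, String first, String second) {
        int i0 = pattern.indexOf("{0}");
        int i1 = pattern.indexOf("{1}");
        return pattern.substring(0, i0) + first
                + pattern.substring(i0 + 3, i1) + second
                + pattern.substring(i1 + 3);
    }

    static String format(List<String> items) {
        int n = items.size();
        if (n == 0) return "";
        if (n == 1) return items.get(0);
        if (n == 2) return apply(TWO, items.get(0), items.get(1));
        // The last two elements are joined with the "end" pattern.
        String result = apply(END, items.get(n - 2), items.get(n - 1));
        // Remaining inner elements are prepended with the "middle" pattern.
        for (int i = n - 3; i >= 1; i--) {
            result = apply(MIDDLE, items.get(i), result);
        }
        // The first element is prepended with the "start" pattern.
        return apply(START, items.get(0), result);
    }

    public static void main(String[] args) {
        // prints "Monday, Wednesday, and Friday"
        System.out.println(format(List.of("Monday", "Wednesday", "Friday")));
    }
}
```

Working from the end of the list keeps each step a two-argument substitution, which is how the four pattern kinds are meant to compose.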

> src/java.base/share/classes/java/text/ListFormat.java line 48:
> 
>> 46:  * List Patterns</a>.
>> 47:  * <p>
>> 48:  * Three types of concatenation are provided: {@link Type#STANDARD STANDARD},
> 
> A "Type" and "Style" together make up a specific pattern. It might be good to introduce the term "List Patterns" here first, that is, by moving the introduction of patterns from the 1-arg getInstance method to the class description. Once we have the terms established, we can then delve into the specific cases that "types" and "styles" represent. Something like:
> 
> <h2>List Patterns</h2>
> List Patterns are rules that define how a series or list is formed ... (include the description for the getInstance(String[] patterns) here)
> 
> <h2>Standard Patterns</h2>
> {@code ListFormat} supports a few pre-defined patterns with a combination of Type (link) and Style (link). Types and Styles are defined as follows.
> 
> <h3>Type</h3>
> {@link Type#STANDARD STANDARD}: a simple list with conjunction "and";
> 
> ...
> 
> <h3>Style</h3>
> {@link Style#FULL FULL}: uses the conjunction word such as "and";
> 
> {@link Style#SHORT SHORT}: uses the shorthand of the conjunction word, "&" (ampersand) for "and" for example;
> 
> {@link Style#NARROW NARROW}: uses no conjunction word.
> 
> For example, a combination of {@link Type#STANDARD STANDARD} and {@link Style#FULL FULL} forms an inclusive list pattern.

I think that Type/Style/Locale forming a specific pattern is an implementation detail, so I would not describe it in the spec (although, as you say, they do form a specific pattern in the impl). An impl of the 3-arg getInstance() could well be independent of the patterns described for the 1-arg getInstance().

> src/java.base/share/classes/java/text/ListFormat.java line 521:
> 
>> 519:         var sb = new StringBuilder(256).append(patterns[START]);
>> 520:         IntStream.range(2, count - 1).forEach(i -> sb.append(middleBetween).append("{").append(i).append("}"));
>> 521:         sb.append(patterns[END].replaceFirst("\\{0}", "").replaceFirst("\\{1}", "\\{" + (count - 1) + "\\}"));
> 
> From the looks of it, there is a potential concern about appending a large number of long strings for a list of small items. I don't see where the input is limited.

Good point. Will add some kind of limitation.
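
As a rough sketch of what such a limit could look like, the snippet below rebuilds the quoted pattern-assembly lines as a standalone method and rejects oversized inputs up front. MAX_ELEMENTS, the class name and the English patterns are hypothetical (no actual limit has been decided in this thread), and MIDDLE_BETWEEN is assumed to be the text between {0} and {1} in the middle pattern.

```java
import java.util.stream.IntStream;

class PatternBuilderSketch {
    // Hypothetical cap; the value to be used in ListFormat is not decided here.
    static final int MAX_ELEMENTS = 1_000;

    // English-like patterns, assumed for illustration only.
    static final String START = "{0}, {1}";
    static final String MIDDLE_BETWEEN = ", ";
    static final String END = "{0}, and {1}";

    // Builds an indexed-placeholder pattern for 'count' elements (count >= 3).
    static String buildPattern(int count) {
        if (count < 3 || count > MAX_ELEMENTS) {
            throw new IllegalArgumentException("unsupported element count: " + count);
        }
        var sb = new StringBuilder(256).append(START);
        // Placeholders {2} .. {count-2} are joined with the middle separator.
        IntStream.range(2, count - 1)
                .forEach(i -> sb.append(MIDDLE_BETWEEN).append("{").append(i).append("}"));
        // Drop {0} from the end pattern and renumber {1} as the last index.
        sb.append(END.replaceFirst("\\{0}", "")
                .replaceFirst("\\{1}", "\\{" + (count - 1) + "\\}"));
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "{0}, {1}, {2}, and {3}"
        System.out.println(buildPattern(4));
    }
}
```

Checking the count before any appending keeps the StringBuilder from growing without bound; a limit on the total length of the elements would be another option.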

> src/java.base/share/classes/java/text/ListFormat.java line 560:
> 
>> 558:          * The {@code UNIT} ListFormat style. This style concatenates
>> 559:          * elements, useful for enumerating units.
>> 560:          */
> 
> The word "style" is used in Type; I assume you meant "type"? It might be confused with Style below.
> 
> As in my previous comments, a combination of Type and Style, if I understand correctly, forms a specific pattern. I might say something about it in the enum class description.
> 
> A STANDARD type then is a simple list with conjunction "and", or an inclusive list, and so on.

Ah, good catch. Will correct those style/type typos.

> A STANDARD type then is a simple list with conjunction "and", or an inclusive list, and so on.

I did not quite catch what you meant by this. Could you please elaborate?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1316330290
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1316330343
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1316330443
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1316330402

