<i18n dev> RFR: 8041488: Locale-Dependent List Patterns [v12]

Joe Wang joehw at openjdk.org
Sun Sep 3 05:56:57 UTC 2023


On Tue, 29 Aug 2023 16:51:49 GMT, Naoto Sato <naoto at openjdk.org> wrote:

>> Introducing a new formatting class for locale-dependent list patterns. The class is to provide the functionality from the Unicode Consortium's LDML specification for [list patterns](https://www.unicode.org/reports/tr35/tr35-general.html#ListPatterns). For example, given a list of String as "Monday", "Wednesday", "Friday", its `format` method would produce "Monday, Wednesday, and Friday" in US English. A CSR has also been drafted, and its draft javadoc can be viewed here: https://cr.openjdk.org/~naoto/JDK-8041488-ListPatterns-PR/api.00/java.base/java/text/ListFormat.html
>
> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Removing unnecessary commas

src/java.base/share/classes/java/text/ListFormat.java line 46:

> 44:  * defined in Unicode Consortium's LDML specification for
> 45:  * <a href="https://www.unicode.org/reports/tr35/tr35-general.html#ListPatterns">
> 46:  * List Patterns</a>.

The main function, it seems to me, is to change the representation from one form to another. What would you think about the following:

The {@code ListFormat} class is a tool for converting a list of strings to a text representation and vice versa in a locale-sensitive way. It transforms strings to text in accordance with the List Patterns (link) as defined in Unicode Consortium's LDML specification. For example, it can be used to format a list of 3 weekdays, i.e. "Monday", "Wednesday", "Friday", as "Monday, Wednesday, and Friday" in an inclusive list pattern.

src/java.base/share/classes/java/text/ListFormat.java line 48:

> 46:  * List Patterns</a>.
> 47:  * <p>
> 48:  * Three types of concatenation are provided: {@link Type#STANDARD STANDARD},

A "Type" and "Style" together make up a specific pattern. It might be good to introduce the term "List Patterns" here first, that is, to move the introduction of patterns from the 1-arg getInstance method to the class description. Once we have the terms established, we can then delve into the specific cases that "types" and "styles" represent. Something like:

<h2>List Patterns</h2>
List Patterns are rules that define how a series or list is formed ... (include the description for the getInstance(String[] patterns) here)

<h2>Standard Patterns</h2>
{@code ListFormat} supports a few pre-defined patterns, each a combination of a Type (link) and a Style (link). Types and Styles are defined as follows.

<h3>Type</h3>
{@link Type#STANDARD STANDARD}: a simple list with conjunction "and";

...

<h3>Style</h3>
{@link Style#FULL FULL}: uses the conjunction word such as "and";

{@link Style#SHORT SHORT}: uses a shorthand for the conjunction word, for example "&" (ampersand) for "and";

{@link Style#NARROW NARROW}: uses no conjunction word.

For example, a combination of {@link Type#STANDARD STANDARD} and {@link Style#FULL FULL} forms an inclusive list pattern.
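
To illustrate how a Type/Style combination could map to a set of list patterns, and how those patterns are then applied, here is a minimal sketch. It does not use `ListFormat` itself; the class name, the method names, and the English pattern strings are all assumptions made for illustration (the strings are modeled on the start/middle/end/two patterns described by LDML, not taken from this PR).

```java
import java.util.List;

// Hypothetical sketch of LDML-style list pattern application; not the JDK implementation.
final class ListPatternSketch {
    // Assumed English patterns, in the order: start, middle, end, two.
    static final String[] STANDARD_FULL   = {"{0}, {1}", "{0}, {1}", "{0}, and {1}", "{0} and {1}"};
    static final String[] STANDARD_SHORT  = {"{0}, {1}", "{0}, {1}", "{0}, & {1}", "{0} & {1}"};
    static final String[] STANDARD_NARROW = {"{0}, {1}", "{0}, {1}", "{0}, {1}", "{0}, {1}"};

    // Substitutes the two placeholders; assumes "{0}" appears before "{1}" in the pattern.
    static String apply(String pattern, String first, String second) {
        int i0 = pattern.indexOf("{0}");
        int i1 = pattern.indexOf("{1}");
        return pattern.substring(0, i0) + first
                + pattern.substring(i0 + 3, i1) + second
                + pattern.substring(i1 + 3);
    }

    static String format(String[] patterns, List<String> items) {
        int n = items.size();
        if (n == 0) return "";            // sketch-only choice for the empty list
        if (n == 1) return items.get(0);
        if (n == 2) return apply(patterns[3], items.get(0), items.get(1));

        // The end pattern joins the last two items.
        String result = apply(patterns[2], items.get(n - 2), items.get(n - 1));
        // The middle pattern prepends the interior items, working backwards.
        for (int i = n - 3; i >= 1; i--) {
            result = apply(patterns[1], items.get(i), result);
        }
        // The start pattern prepends the first item.
        return apply(patterns[0], items.get(0), result);
    }

    public static void main(String[] args) {
        List<String> days = List.of("Monday", "Wednesday", "Friday");
        System.out.println(format(STANDARD_FULL, days));
        System.out.println(format(STANDARD_SHORT, days));
        System.out.println(format(STANDARD_NARROW, days));
    }
}
```

In this sketch, only the end and two patterns differ between the three styles, which is why the Style descriptions above can be phrased purely in terms of the conjunction word.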

src/java.base/share/classes/java/text/ListFormat.java line 521:

> 519:         var sb = new StringBuilder(256).append(patterns[START]);
> 520:         IntStream.range(2, count - 1).forEach(i -> sb.append(middleBetween).append("{").append(i).append("}"));
> 521:         sb.append(patterns[END].replaceFirst("\\{0}", "").replaceFirst("\\{1}", "\\{" + (count - 1) + "\\}"));

From what I can see, this could be a concern: a list with many small items could potentially cause a large number of long strings to be built. I don't see where the input is limited.
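
To make the concern concrete, here is a minimal standalone sketch that wraps the three quoted statements in a method and reports how long the generated pattern becomes as the item count grows. The class name, the `START`/`END` index values, the English pattern strings, and the `MIDDLE_BETWEEN` value are assumptions for illustration; only the body of `expand` mirrors the lines under review.

```java
import java.util.stream.IntStream;

// Hypothetical harness around the quoted pattern-expansion code; not the JDK class.
final class PatternGrowthSketch {
    static final int START = 0;   // assumed index of the start pattern
    static final int END = 2;     // assumed index of the end pattern
    // Assumed English patterns: start, middle, end.
    static final String[] PATTERNS = {"{0}, {1}", "{0}, {1}", "{0}, and {1}"};
    // Assumed text between "{0}" and "{1}" in the middle pattern.
    static final String MIDDLE_BETWEEN = ", ";

    // Builds one pattern with a placeholder per item; intended for count >= 3.
    static String expand(int count) {
        var sb = new StringBuilder(256).append(PATTERNS[START]);
        IntStream.range(2, count - 1)
                .forEach(i -> sb.append(MIDDLE_BETWEEN).append("{").append(i).append("}"));
        sb.append(PATTERNS[END].replaceFirst("\\{0}", "")
                .replaceFirst("\\{1}", "\\{" + (count - 1) + "\\}"));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The pattern gains one placeholder per item, so its length grows with the list size.
        for (int count : new int[] {3, 10, 1_000, 100_000}) {
            System.out.println(count + " items -> pattern of "
                    + expand(count).length() + " chars");
        }
    }
}
```

Under these assumptions the expanded pattern grows linearly with the number of items, independent of how short each item is.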

src/java.base/share/classes/java/text/ListFormat.java line 560:

> 558:          * The {@code UNIT} ListFormat style. This style concatenates
> 559:          * elements, useful for enumerating units.
> 560:          */

The word "style" is used in Type; I assume you meant "type"? It might be confused with Style below.

As in my previous comments, a combination of Type and Style, if I understand correctly, forms a specific pattern. I might say something about that in the enum class description.

A STANDARD type then is a simple list with conjunction "and", or an inclusive list, etc.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1314106218
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1314109373
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1314116397
PR Review Comment: https://git.openjdk.org/jdk/pull/15130#discussion_r1314111702

