From y.umaoka at gmail.com Tue Sep 1 09:11:26 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 1 Sep 2009 12:11:26 -0400
Subject: [loc-en-dev] variant field casing
In-Reply-To: <4A816D52.2020705@sun.com>
References: <4A56364C.3000507@gmail.com> <4A5AF14B.8030700@sun.com>
 <4A64B02B.2050405@gmail.com> <4A6EBCD6.3050007@sun.com>
 <4A6EDFE2.3040202@gmail.com> <4A6FBA18.4030208@sun.com>
 <4A7F89FD.7020307@gmail.com> <4A816D52.2020705@sun.com>
Message-ID: <1c8828620909010911t42a7344bpd7abfb15c55d9b0d@mail.gmail.com>

>>> Let me ask another question. If you do:
>>>
>>> Locale locale = new
>>> Locale.Builder().setLanguage("es").setRegion("ES").setVariant("Traditional_WIN").create();
>>> System.out.println(locale.getVariant());
>>>
>>> what will that produce? ("Traditional_WIN" is an example from the Locale
>>> API doc.)
>>>
>> The idea is to throw an exception for variant "Traditional_WIN" and
>> clearly document that the builder only accepts variant value(s) which satisfy the
>> BCP47 variant syntax.
>
> What if someone wants to use ISO Language Code "he" without making changes
> to variant names? The current code might be:
>
>  Locale locale = new Locale("iw", "IL", "MyVariant");
>
> Then, the user might want to change this one to:
>
>  Locale locale = new
> Locale.Builder().setLanguage("he")....setVariant("MyVariant").create();
>
> But if the new one throws an exception, the user needs to stay with "iw".
>
> I think we need to have a separate variant to allow users to migrate from
> the constructor to the Locale.Builder without changing other things.
>
> Thanks,
> Masayoshi
>

This may introduce a hole in the new Builder design. We agreed not to
support new fields through constructors. We wanted to make sure each
individual field specified in a Builder setter method satisfies its
syntax requirement. The current design contract is to support BCP47
syntax, so a Locale created by the Builder is always well-formed in
terms of BCP47.
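For reference, the BCP47 variant syntax that the Builder is meant to enforce
can be sketched as a small standalone check. This is only an illustration,
not the proposed implementation: the class and method names (VariantSyntax,
isWellFormedVariant) are hypothetical, and it assumes that a Java variant
may hold several subtags separated by "_" or "-", each of which must match
5*8alphanum / (DIGIT 3alphanum).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the proposed API.
final class VariantSyntax {
    // One BCP47 variant subtag: 5*8alphanum / (DIGIT 3alphanum)
    private static final Pattern SUBTAG =
        Pattern.compile("[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3}");

    private VariantSyntax() {}

    // Returns true only if every subtag matches the BCP47 variant syntax.
    // Assumption: subtags are separated by '_' or '-'.
    static boolean isWellFormedVariant(String variant) {
        // A limit of -1 keeps trailing empty segments, so "POSIX_" is rejected.
        for (String subtag : variant.split("[_-]", -1)) {
            if (!SUBTAG.matcher(subtag).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"Traditional_WIN", "MyVariant", "monoton", "POSIX"};
        for (String v : samples) {
            System.out.println(v + " -> " + isWellFormedVariant(v));
        }
    }
}
```

Under these assumptions "Traditional_WIN" and "MyVariant" are rejected
(their subtags are 11, 3 and 9 characters long), while values such as
"monoton" and "POSIX" pass.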
Even if we separate the Java variant from the BCP47 variant, the Java
variant cannot be mapped to any field in BCP47.

setLanguage("he") is a different story. Both "he" and "iw" satisfy the
syntax requirement, so setLanguage won't throw an exception for
setLanguage("he") / setLanguage("iw"). When Builder#create() is invoked,
"he" is mapped to "iw" internally and Locale#getLanguage() always
returns "iw".

Strictly speaking, the variant field in BCP47 is used for additional
semantics of a language. A Java variant such as Traditional_WIN maps
semantically to a BCP47 extension or private use, because it extends the
language tag for use in applications.

If we really want to support free-form Java variants, we could remove
the syntax check in Builder#setVariant. We can still create a Locale; if
a user wants to convert the Locale to a BCP47 language tag, such a
variant field value will be dropped.

For example -

Locale locale = new Locale.Builder().setLanguage("es")
    .setRegion("ES").setVariant("Traditional_WIN").create();

will create a Locale - es_ES_Traditional_WIN

However

String tag = locale.toLanguageTag();

returns "es-ES".

Also,

Locale locale1 = new Locale.Builder().setLanguageTag(
    "es-ES-Traditonal-WIN").create();

will throw IllformedLocaleException.

When an existing Java variant satisfies the BCP47 variant syntax, I
think toLanguageTag should still map the variant value to a BCP47
variant. For example,

Locale locale2 = new Locale("el", "GR", "monoton");
String tag2 = locale2.toLanguageTag();

returns "el-GR-monoton", which is a valid BCP47 language tag ("monoton"
is registered in the IANA registry).

For the next example,

Locale locale3 = new Locale("en", "US", "POSIX");
String tag3 = locale3.toLanguageTag();

"POSIX" satisfies the BCP47 variant syntax. The current proposal is to
lowercase the variant in a language tag. But we could keep the casing -
"en-US-POSIX", although BCP47 recommends lowercase letters for variants.
If the language tag is consumed by other apps, "POSIX" might be
normalized to "posix", but if it is consumed within Java, it can round
trip and you get the exact same locale.

In summary -

1. Builder#setVariant(String) does not check the input - it will never
throw an exception.
2. Locale#toLanguageTag handles a variant field which does not satisfy
the BCP47 variant syntax (5*8alphanum / (DIGIT 3alphanum)) as an error
(as a result, the variant and following fields are stripped off).
3. Locale#toLanguageTag maps the Java variant to a BCP47 variant if it
satisfies the syntax requirement. But it won't change the casing.

How about this approach?

-Yoshito

From y.umaoka at gmail.com Tue Sep 8 10:07:33 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 13:07:33 -0400
Subject: [loc-en-dev] Revised Locale API proposal
Message-ID: <4AA68F55.4070801@gmail.com>

Hi all,

I reviewed the proposed API docs and updated the following -

1. Locale class description: Reorganized and provided a more
comprehensive explanation of the logical locale fields.
2. Variant: Removed syntactical restrictions on variant, provided some
notes about BCP47 compatibility.
3. Misc changes: Consistent terminology, removed "New API", etc...

I'm attaching a zip file to this message. Please take a look and see if
this version is good enough for submitting a proposal to OpenJDK.

I'll add some explanation about the updated behavior for the candidate
list in ResourceBundle.Control and LocaleServiceProvider, but these
are behavior changes, not adding/changing APIs.

-Yoshito
-------------- next part --------------
A non-text attachment was scrubbed...
Name: LocaleAPI0908.zip
Type: application/x-zip-compressed
Size: 23745 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/da175b2b/attachment-0001.bin

From y.umaoka at gmail.com Tue Sep 8 10:09:15 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 17:09:15 +0000
Subject: [loc-en-dev] Invitation: OpenJDK Locale Enhancement Project Bi-Weekly Meeting @ Tue Sep 8 7:30pm - 8:15pm (locale-enhancement-dev at openjdk.java.net)
Message-ID: <00163630ffd9a86811047314069c@google.com>

locale-enhancement-dev at openjdk.java.net, you are invited to

Title: OpenJDK Locale Enhancement Project Bi-Weekly Meeting
Time: Tue Sep 8 7:30pm - 8:15pm (Timezone: Eastern Time)
Where: Teleconference
Calendar: locale-enhancement-dev at openjdk.java.net
Owner/Creator: y.umaoka at gmail.com
Description:
Toll free from the US and Canada: (877)421-0033
International dialing: +1(770)615-1250
Toll free from Japan (KDD): 00531-11-3180
Toll free from Japan (Cable&Wireless): 0066-33-801263
Toll free from Japan (Softbank): 0044-22-112668
Toll free from Japan (NTT): 0034-800-900155
Other country dial-ins available on request
Passcode: 662122#

Meeting agenda page:
http://sites.google.com/site/openjdklocale/meeting-agenda

You can view this event at http://www.google.com/calendar/event?action=VIEW&eid=OWdpNjZ2YmFhcmp2NGJlOTZvNDQzYXFia2cgbG9jYWxlLWVuaGFuY2VtZW50LWRldkBvcGVuamRrLmphdmEubmV0&tok=MTgjeS51bWFva2FAZ21haWwuY29tNjQ4ODg4MDcwYjdiOGQ1OWVjMzZjYzk3ZDc3YjZlNDY1ZjAzMmNlYw&ctz=America%2FNew_York&hl=en

You are receiving this courtesy email at the account locale-enhancement-dev at openjdk.java.net because you are an attendee of this event.

To stop receiving future notifications for this event, decline this event.
Alternatively you can sign up for a Google account at http://www.google.com/calendar/ and control your notification settings for your entire calendar.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/7a8bff43/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/calendar
Size: 1572 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/7a8bff43/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: invite.ics
Type: application/ics
Size: 1606 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/7a8bff43/attachment-0001.bin

From y.umaoka at gmail.com Tue Sep 8 10:09:47 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 17:09:47 +0000
Subject: [loc-en-dev] Invitation: OpenJDK Locale Enhancement Project Bi-Weekly Meeting @ Tue Sep 22 7:30pm - 8:15pm (locale-enhancement-dev at openjdk.java.net)
Message-ID: <001485f7c3a498a9c0047314086a@google.com>

locale-enhancement-dev at openjdk.java.net, you are invited to

Title: OpenJDK Locale Enhancement Project Bi-Weekly Meeting
Time: Tue Sep 22 7:30pm - 8:15pm (Timezone: Eastern Time)
Where: Teleconference
Calendar: locale-enhancement-dev at openjdk.java.net
Owner/Creator: y.umaoka at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/9d37e6c2/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/calendar
Size: 1572 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/9d37e6c2/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: invite.ics
Type: application/ics
Size: 1606 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/9d37e6c2/attachment-0001.bin

From y.umaoka at gmail.com Tue Sep 8 10:10:10 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 17:10:10 +0000
Subject: [loc-en-dev] Invitation: OpenJDK Locale Enhancement Project Bi-Weekly Meeting @ Tue Oct 6 7:30pm - 8:15pm (locale-enhancement-dev at openjdk.java.net)
Message-ID: <0016e646049afd0abd047314096a@google.com>

locale-enhancement-dev at openjdk.java.net, you are invited to

Title: OpenJDK Locale Enhancement Project Bi-Weekly Meeting
Time: Tue Oct 6 7:30pm - 8:15pm (Timezone: Eastern Time)
Where: Teleconference
Calendar: locale-enhancement-dev at openjdk.java.net
Owner/Creator: y.umaoka at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/b03846fd/attachment-0001.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/calendar
Size: 1572 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/b03846fd/attachment-0002.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: invite.ics
Type: application/ics
Size: 1606 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/b03846fd/attachment-0003.bin

From y.umaoka at gmail.com Tue Sep 8 10:10:36 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 17:10:36 +0000
Subject: [loc-en-dev] Invitation: OpenJDK Locale Enhancement Project Bi-Weekly Meeting @ Tue Oct 20 7:30pm - 8:15pm (locale-enhancement-dev at openjdk.java.net)
Message-ID: <00163623a7a587651a0473140b42@google.com>

locale-enhancement-dev at openjdk.java.net, you are invited to

Title: OpenJDK Locale Enhancement Project Bi-Weekly Meeting
Time: Tue Oct 20 7:30pm - 8:15pm (Timezone: Eastern Time)
Where: Teleconference
Calendar: locale-enhancement-dev at openjdk.java.net
Owner/Creator: y.umaoka at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/933ebae6/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/calendar
Size: 1572 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/933ebae6/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: invite.ics
Type: application/ics
Size: 1606 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/933ebae6/attachment-0001.bin

From y.umaoka at gmail.com Tue Sep 8 10:25:53 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 13:25:53 -0400
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <4AA68F55.4070801@gmail.com>
References: <4AA68F55.4070801@gmail.com>
Message-ID: <4AA693A1.5040905@gmail.com>

Yoshito Umaoka wrote:
> Hi all,
>
> I reviewed the proposed API docs and updated the following -
>
> 1. Locale class description: Reorganized and provided a more
> comprehensive explanation of the logical locale fields.
> 2. Variant: Removed syntactical restrictions on variant, provided some
> notes about BCP47 compatibility.
> 3. Misc changes: Consistent terminology, removed "New API", etc...
>
> I'm attaching a zip file to this message. Please take a look and see if
> this version is good enough for submitting a proposal to OpenJDK.
>
> I'll add some explanation about the updated behavior for the candidate
> list in ResourceBundle.Control and LocaleServiceProvider, but these
> are behavior changes, not adding/changing APIs.
>
> -Yoshito

In the conf call today, we'll go through the updated API proposal
attached to the previous message. In particular, I would like to discuss
the variant field design change (which also affects the API contract in
Builder).

Also, with this change (no syntactical restrictions on variant), we
probably need to revisit the "toString()" topic. We thought
toLanguageTag() could be used as a Locale ID for valid locales, but with
no restrictions on variant, this is no longer true. I'm wondering if we
need to revisit "toString()" - more specifically, write out all new
Locale internal field values after the variant field. For example,
en_US_POSIX_script=Latn_ext=a-xxx-x-java-1-7.
Thanks,
Yoshito

From mark at macchiato.com Tue Sep 8 10:51:29 2009
From: mark at macchiato.com (Mark Davis ⌛)
Date: Tue, 8 Sep 2009 10:51:29 -0700
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <4AA68F55.4070801@gmail.com>
References: <4AA68F55.4070801@gmail.com>
Message-ID: <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>

General: We should not be directly referencing ISO standards, because
they are unstable. We can often reference BCP 47 and the IANA registry,
but it is more accurate to reference Unicode Locale Identifiers (LDML).

Examples:

===

getISOCountries

public static java.lang.String[] getISOCountries()

Returns a list of all 2-letter country codes defined in ISO 3166. Can be
used to create Locales.

Should be: returns all region codes in the IANA registry, including both
2 letter ISO codes (current and deprecated) and 3 digit UN M.49 codes.
(Either that or we need to add a "real" method to get all the valid
ones.)

getISOLanguages

public static java.lang.String[] getISOLanguages()

Returns a list of all 2-letter language codes defined in ISO 639. Can be
used to create Locales. [NOTE: ISO 639 is not a stable standard-- some
languages' codes have changed. The list this function returns includes
both the new and the old codes for the languages whose codes have
changed.]

Should be: returns a list of all primary language codes defined in the
IANA registry (includes 2 and 3 letter ISO 639 codes). (Either that or
we need to add a "real" method to get all the valid ones.)

getVariant

public java.lang.String getVariant()

Returns the variant code for this locale, which should either be the
empty string or a conforming BCP47 variant string.

What about variants like "rozaj_biske"? [see
http://www.iana.org/assignments/language-subtag-registry] Does this
method return only the first subtag, or all? [My preference would be
all, in alphabetical order.]
Similar issue for LocaleBuilder.setVariant

Mark

On Tue, Sep 8, 2009 at 10:07, Yoshito Umaoka wrote:

> Hi all,
>
> I reviewed the proposed API docs and updated the following -
>
> 1. Locale class description: Reorganized and provided a more comprehensive
> explanation of the logical locale fields.
> 2. Variant: Removed syntactical restrictions on variant, provided some notes
> about BCP47 compatibility.
> 3. Misc changes: Consistent terminology, removed "New API", etc...
>
> I'm attaching a zip file to this message. Please take a look and see if this
> version is good enough for submitting a proposal to OpenJDK.
>
> I'll add some explanation about the updated behavior for the candidate list in
> ResourceBundle.Control and LocaleServiceProvider, but these are behavior
> changes, not adding/changing APIs.
>
> -Yoshito
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/88d05e68/attachment.html

From dougfelt at google.com Tue Sep 8 15:13:40 2009
From: dougfelt at google.com (Doug Felt)
Date: Tue, 8 Sep 2009 15:13:40 -0700
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>
References: <4AA68F55.4070801@gmail.com>
 <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>
Message-ID: <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>

Looks good to me. I can edit the text for readability in some places,
but there's nothing significant. I suggest not documenting
NullPointerExceptions, but instead only documenting places where null
is _accepted_ as a value, since you have to document what it means
anyway, and this is in general rarer.

Other comments:
- Do we want LocaleNameProvider to be able to return a display name
for a full locale? It seems eventually data could provide more
natural display names.
- The Builder idiom (from Josh Bloch) uses 'build' instead of 'create'
as the name for the method that returns the product of the builder.
We should probably follow this.

Doug

On Tue, Sep 8, 2009 at 10:51 AM, Mark Davis ⌛ wrote:
> General: We should not be directly referencing ISO standards, because they
> are unstable. We can often reference BCP 47 and the IANA registry, but it is
> more accurate to reference Unicode Locale Identifiers (LDML).
>
> Examples:
>
> ===
>
> getISOCountries
>
> public static java.lang.String[] getISOCountries()
>
> Returns a list of all 2-letter country codes defined in ISO 3166. Can be
> used to create Locales.
> Should be: returns all region codes in the IANA registry, including both 2
> letter ISO codes (current and deprecated) and 3 digit UN M.49 codes. (Either
> that or we need to add a "real" method to get all the valid ones.)
>
> getISOLanguages
>
> public static java.lang.String[] getISOLanguages()
>
> Returns a list of all 2-letter language codes defined in ISO 639. Can be
> used to create Locales. [NOTE: ISO 639 is not a stable standard-- some
> languages' codes have changed. The list this function returns includes both
> the new and the old codes for the languages whose codes have changed.]
>
> Should be: returns a list of all primary language codes defined in the IANA
> registry (includes 2 and 3 letter ISO 639 codes). (Either that or we need to
> add a "real" method to get all the valid ones.)
>
> getVariant
>
> public java.lang.String getVariant()
>
> Returns the variant code for this locale, which should either be the empty
> string or a conforming BCP47 variant string.
> What about variants like "rozaj_biske"? [see
> http://www.iana.org/assignments/language-subtag-registry] Does this method
> return only the first subtag, or all? [My preference would be all, in
> alphabetical order.]
> Similar issue for LocaleBuilder.setVariant
>
>
> Mark
>
>
> On Tue, Sep 8, 2009 at 10:07, Yoshito Umaoka wrote:
>>
>> Hi all,
>>
>> I reviewed the proposed API docs and updated the following -
>>
>> 1. Locale class description: Reorganized and provided a more comprehensive
>> explanation of the logical locale fields.
>> 2. Variant: Removed syntactical restrictions on variant, provided some
>> notes about BCP47 compatibility.
>> 3. Misc changes: Consistent terminology, removed "New API", etc...
>>
>> I'm attaching a zip file to this message. Please take a look and see if this
>> version is good enough for submitting a proposal to OpenJDK.
>>
>> I'll add some explanation about the updated behavior for the candidate list in
>> ResourceBundle.Control and LocaleServiceProvider, but these are behavior
>> changes, not adding/changing APIs.
>>
>> -Yoshito
>
>

From srl at icu-project.org Tue Sep 8 15:38:05 2009
From: srl at icu-project.org (Steven R. Loomis)
Date: Tue, 08 Sep 2009 15:38:05 -0700
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>
References: <4AA68F55.4070801@gmail.com>
 <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>
 <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>
Message-ID: <4AA6DCCD.5090809@icu-project.org>

Hello,
 I also mostly have readability comments, but they are minor. Here
are a few items possibly worth discussing:

 - Locale.Builder: should 'initial, default state' say 'empty
state'? We don't want to confuse it with the default locale.

 - same comment with Locale.Builder.setScript - if the script is the
empty string, the documentation should say what happens (Does it
really get the default script for the locale? Or does it remain
empty, with the /effect/ of being default?)

 - Locale.Builder.setExtension: I think 'x' and 'u' are swapped
here, should be LDML_EXTENSION('u') and PRIVATE_USE_EXTENSION('x').

Steven

Doug Felt wrote:
> Looks good to me.
> I can edit the text for readability in some places,
> but there's nothing significant. I suggest not documenting
> NullPointerExceptions, but instead only documenting places where null
> is _accepted_ as a value, since you have to document what it means
> anyway, and this is in general rarer.
>
> Other comments:
> - Do we want LocaleNameProvider to be able to return a display name
> for a full locale? It seems eventually data could provide more
> natural display names.
>
> - The Builder idiom (from Josh Bloch) uses 'build' instead of 'create'
> as the name for the method that returns the product of the builder.
> We should probably follow this.
>
> Doug

--
Steven R. Loomis
srl at icu-project.org
Technical Lead, ICU for C/C++
IBM San José Globalization Center of Competency
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090908/7ed0b0f0/attachment.html

From y.umaoka at gmail.com Tue Sep 8 16:10:43 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 19:10:43 -0400
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <4AA6DCCD.5090809@icu-project.org>
References: <4AA68F55.4070801@gmail.com>
 <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>
 <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>
 <4AA6DCCD.5090809@icu-project.org>
Message-ID: <4AA6E473.8060304@gmail.com>

Steven R. Loomis wrote:
> Hello,
> I also mostly have readability comments, but they are minor. Here
> are a few items possibly worth discussing:
>
> - Locale.Builder: should 'initial, default state' say 'empty
> state'? We don't want to confuse it with the default locale.
>
> - same comment with Locale.Builder.setScript - if the script is the
> empty string, the documentation should say what happens (Does it
> really get the default script for the locale? Or does it remain
> empty, with the /effect/ of being default?)
>
> - Locale.Builder.setExtension: I think 'x' and 'u' are swapped
> here, should be LDML_EXTENSION('u') and PRIVATE_USE_EXTENSION('x').
>

I was also wondering if we should clearly state "empty" instead of
"default state". Doug actually provided the Javadoc comments for Builder
and I made only some necessary changes / minor modifications. I also
considered whether we would really want a non-empty default state in
the future, but I do not think we need one. So I agree to use the term
"empty" instead of "default state".

The character key in setExtension is simply my mistake. I'll fix the
problem.
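As a rough sketch of the corrected key mapping: the snippet below uses the
constant and method names from java.util.Locale as it later shipped in
Java 7 (UNICODE_LOCALE_EXTENSION for 'u', PRIVATE_USE_EXTENSION for 'x',
and build() rather than create()). That is an assumption relative to this
draft, where the 'u' constant is still called LDML_EXTENSION; the class name
ExtensionKeys is made up for the example.

```java
import java.util.Locale;

// Illustration only; ExtensionKeys is a hypothetical class name.
final class ExtensionKeys {
    private ExtensionKeys() {}

    // 'u' carries Unicode locale (LDML) keywords, here the Thai numbering system.
    static Locale thaiDigits() {
        return new Locale.Builder()
            .setLanguage("th")
            .setRegion("TH")
            .setExtension(Locale.UNICODE_LOCALE_EXTENSION, "nu-thai")
            .build();
    }

    // 'x' carries private-use subtags.
    static Locale privateUse() {
        return new Locale.Builder()
            .setLanguage("en")
            .setRegion("US")
            .setExtension(Locale.PRIVATE_USE_EXTENSION, "java-1-7")
            .build();
    }

    public static void main(String[] args) {
        System.out.println(thaiDigits().toLanguageTag());  // th-TH-u-nu-thai
        System.out.println(privateUse().toLanguageTag());  // en-US-x-java-1-7
    }
}
```

Each key goes to its own extension slot, so getExtension('u') and
getExtension('x') return the values that were set, independently of each
other.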
-Yoshito

From y.umaoka at gmail.com Tue Sep 8 16:16:10 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 08 Sep 2009 19:16:10 -0400
Subject: [loc-en-dev] Revised Locale API proposal
In-Reply-To: <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>
References: <4AA68F55.4070801@gmail.com>
 <30b660a20909081051o2358f98ag8b59e5ff4f6ea5f@mail.gmail.com>
 <146f39a80909081513w13770216j4af4282735ada88e@mail.gmail.com>
Message-ID: <4AA6E5BA.4090607@gmail.com>

Doug Felt wrote:
> Looks good to me. I can edit the text for readability in some places,
> but there's nothing significant. I suggest not documenting
> NullPointerExceptions, but instead only documenting places where null
> is _accepted_ as a value, since you have to document what it means
> anyway, and this is in general rarer.
>
> Other comments:
> - Do we want LocaleNameProvider to be able to return a display name
> for a full locale? It seems eventually data could provide more
> natural display names.
>
> - The Builder idiom (from Josh Bloch) uses 'build' instead of 'create'
> as the name for the method that returns the product of the builder.
> We should probably follow this.
>

We once discussed having LocaleNameProvider support the combined rule.
The idea itself is valid, but I'm not sure it actually fits the current
SPI framework, which is based on a list of supported locales and does
not allow a 3rd party's impl to override Java's standard implementation
for the most common locales.

For the builder idiom, I do not have any preference. I'm fine with
"build".

-Yoshito

From y.umaoka at gmail.com Mon Sep 14 00:49:21 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Mon, 14 Sep 2009 03:49:21 -0400
Subject: [loc-en-dev] Locale RFE
Message-ID: <4AADF581.2070502@gmail.com>

Hi folks,

After the conference call last week and some additional sub-discussions
with Mark, Doug and Steven, I updated the API specification. I'm
attaching the latest proposal (API doc) to this message.

Since the last revision, the following were updated.

- toString() now includes script/extensions after the variant, prefixed by "#". For example, "en_US_#Latn", "th_TH_TH_#nu-thai", etc.
- toLanguageTag() now preserves a variant that does not conform to the BCP 47 syntax by using private use with the special subtag "jvariant". For example, Locale en_US_WIN will be transformed to the language tag "en-US-x-jvariant-WIN".
- Builder now checks by default that the variant conforms to the BCP 47 variant syntax. Another Builder constructor - Builder(boolean isLenientVariant) - serves people who want to use arbitrary variants. With isLenientVariant true, Builder#setVariant will skip all syntax checks and never normalize the input value to lowercase letters.
- The class description now refers to the Unicode language and locale identifiers as the reference design of java.util.Locale.
- Clarified that Locale and its methods won't check the validity of each subtag value - only the Builder checks syntactical restrictions, to support "well-formedness".

Last week, we revisited the toString() issues. Mark, Doug and I really want to see all internal fields written by toString(). The proposal above should not break any practical code parsing toString() values (the added part is recognized as a segment of the variant and is usually discarded while processing). But if this is not really acceptable, we could give up this change and bring back the explanation of this from the previous revision.

Please take your time to review this revision and provide your comments by the end of Tuesday, September 15.

I could not finish some necessary changes in the ResourceBundle / LocaleServiceProvider classes (no API changes, but the descriptions should be updated. I'm still struggling to select proper wording...) I'll provide this part by Tuesday morning.

BTW, the new language tag RFC got its number assigned - http://www.rfc-editor.org/rfc/rfc5646.txt

-Yoshito
-------------- next part --------------
A non-text attachment was scrubbed...
Name: LocaleAPI0914.zip
Type: application/x-zip-compressed
Size: 25331 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090914/023b3ffe/attachment-0001.bin

From y.umaoka at gmail.com Wed Sep 16 05:11:44 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Wed, 16 Sep 2009 08:11:44 -0400
Subject: [loc-en-dev] [Fwd: [Fwd: Locale RFE]]
Message-ID: <4AB0D600.3010306@gmail.com>

I got a warning message from the ML when I sent out the message below (originally with some attachments). I'm not sure whether the message actually got through, so I'm resending it without the attachment files.

I posted all of the updated Javadocs to the OpenJDK LE site - http://sites.google.com/site/openjdklocale/apis

Thanks,
Yoshito

--------------------------------------------
Your mail to 'locale-enhancement-dev' with the subject

[Fwd: Locale RFE]

Is being held until the list moderator can review it for approval.

The reason it is being held:

Message body is too big: 183752 bytes with a limit of 40 KB

Either the message will get posted to the list, or you will receive notification of the moderator's decision. If you would like to cancel this posting, please visit the following URL:

http://mail.openjdk.java.net/mailman/confirm/locale-enhancement-dev/640d8d78f0b53681c9980c6523acc458d91d870e
--------------------------------------------

-------- Original Message --------
Subject: [Fwd: Locale RFE]
Date: Wed, 16 Sep 2009 07:57:39 -0400
From: Yoshito Umaoka
To: locale-enhancement-dev at openjdk.java.net

Hi folks,

I updated the API specification of resource bundle/locale service lookup - ResourceBundle, ResourceBundle.Control and LocaleServiceProvider. There are no API signature changes proposed, but I updated the following API descriptions to match the changes in Locale.

ResourceBundle.html#getBundle(java.lang.String, java.util.Locale, java.lang.ClassLoader)
ResourceBundle.Control.html#getCandidateLocales(java.lang.String, java.util.Locale)
LocaleServiceProvider.html (class description)

Note:

- LocaleServiceProvider takes into account fallbacks within the variant when the variant has multiple subtags separated by underscores. I updated ResourceBundle to match that design. I think this is the right thing to do, but if you have any concerns, please provide your comments.
- I did not include special handling of Locales with a deprecated language code. We discussed whether MyResource_he.class should be included in the lookup path when Locale "iw" is requested. This behavior does not quite fit the design of ResourceBundle.Control. We could state this as special behavior of the JRE implementation only, or extend ResourceBundle.Control with a new API which returns an "alternative" bundle name. But I am starting to feel that the feature would not make people happy - at least, the outcome would not justify introducing such a hacky spec. If someone really needs to use "he" for bundle naming for some reason, he/she can still override Control#toBundleName to support bundle names with "he". It cannot support both, but I do not think there is a strong requirement for that.

Please take a look at these updates and provide your feedback.

Thanks,
Yoshito

From y.umaoka at gmail.com Wed Sep 16 04:57:39 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Wed, 16 Sep 2009 07:57:39 -0400
Subject: [loc-en-dev] [Fwd: Locale RFE]
Message-ID: <4AB0D2B3.3020209@gmail.com>

Hi folks,

I updated the API specification of resource bundle/locale service lookup - ResourceBundle, ResourceBundle.Control and LocaleServiceProvider. There are no API signature changes proposed, but I updated the following API descriptions to match the changes in Locale.

ResourceBundle.html#getBundle(java.lang.String, java.util.Locale, java.lang.ClassLoader)
ResourceBundle.Control.html#getCandidateLocales(java.lang.String, java.util.Locale)
LocaleServiceProvider.html (class description)

Note:

- LocaleServiceProvider takes into account fallbacks within the variant when the variant has multiple subtags separated by underscores. I updated ResourceBundle to match that design. I think this is the right thing to do, but if you have any concerns, please provide your comments.
- I did not include special handling of Locales with a deprecated language code. We discussed whether MyResource_he.class should be included in the lookup path when Locale "iw" is requested. This behavior does not quite fit the design of ResourceBundle.Control. We could state this as special behavior of the JRE implementation only, or extend ResourceBundle.Control with a new API which returns an "alternative" bundle name. But I am starting to feel that the feature would not make people happy - at least, the outcome would not justify introducing such a hacky spec. If someone really needs to use "he" for bundle naming for some reason, he/she can still override Control#toBundleName to support bundle names with "he". It cannot support both, but I do not think there is a strong requirement for that.

Please take a look at these updates and provide your feedback.
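
The Control#toBundleName override mentioned in the note could look roughly like the sketch below. It assumes only the existing java.util.ResourceBundle.Control API; the class name HebrewBundleControl and the base name MyResource are hypothetical, and the "_iw" to "_he" rewrite is just one way to illustrate the idea, not part of the proposal.

```java
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical Control that names Hebrew bundles with "he" instead of "iw".
public class HebrewBundleControl extends ResourceBundle.Control {
    @Override
    public String toBundleName(String baseName, Locale locale) {
        String name = super.toBundleName(baseName, locale);
        // Locale may report the deprecated language code "iw" for Hebrew.
        String oldPrefix = baseName + "_iw";
        if (name.equals(oldPrefix) || name.startsWith(oldPrefix + "_")) {
            // Swap only the language segment; keep region/variant as is.
            return baseName + "_he" + name.substring(oldPrefix.length());
        }
        return name;
    }

    public static void main(String[] args) {
        ResourceBundle.Control control = new HebrewBundleControl();
        System.out.println(
                control.toBundleName("MyResource", Locale.forLanguageTag("he-IL")));
    }
}
```

An application would pass such a Control to ResourceBundle.getBundle(baseName, locale, control). As the note says, this supports "he" names only, not both spellings at once.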

Thanks,
Yoshito

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090916/1ec7c49a/attachment-0003.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090916/1ec7c49a/attachment-0004.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090916/1ec7c49a/attachment-0005.html

From y.umaoka at gmail.com Fri Sep 18 10:17:04 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Fri, 18 Sep 2009 13:17:04 -0400
Subject: [loc-en-dev] getISO3Language and getISO3Country
Message-ID: <4AB3C090.8060506@gmail.com>

I have several questions about getISO3Language and getISO3Country.

With the support for some ISO 639-2/639-3/639-5 three-letter codes in Locale, getISO3Language should be updated. For example,

new Locale("kok").getISO3Language();

should return "kok", because it is a valid three-letter ISO 639-2 code, conforming to the BCP 47 language tag syntax.

Q1: What if an unassigned three-letter code is used? For example -

new Locale("zzz").getISO3Language();

Q2: What if an ill-formed three-character code is used? For example -

new Locale("123").getISO3Language();

Also, a question for getISO3Country. We will support UN M.49 area codes.

Q3: What if an M.49 code is used? For example -

new Locale("en", "029").getISO3Country()

In addition to this, the current getISO3Country Javadoc mentions ISO 3166-2, which looks inappropriate to me.
What this method is handling is ISO 3166-1 alpha-2 codes and ISO 3166-1 alpha-3 codes. The API doc references http://www.davros.org/misc/iso3166.txt . It looks like this is a personal web site, and the data is already out of date. Unfortunately, the ISO 3166 maintenance agency does not disclose the ISO 3166-1 alpha-3 code data. But we obviously want a reference which is public and stable. I think Wikipedia has a topic for ISO 3166-1 that contains the code list -> http://en.wikipedia.org/wiki/ISO_3166-1 I think this reference would work better than the current one. What do you think?

-Yoshito

From mark at macchiato.com Fri Sep 18 14:38:00 2009
From: mark at macchiato.com (Mark Davis ☕)
Date: Fri, 18 Sep 2009 14:38:00 -0700
Subject: [loc-en-dev] getISO3Language and getISO3Country
In-Reply-To: <4AB3C090.8060506@gmail.com>
References: <4AB3C090.8060506@gmail.com>
Message-ID: <30b660a20909181438x6e79632of09aed56826cd4d6@mail.gmail.com>

If the get methods are to return just valid codes, then I think the right reference is Unicode CLDR. That includes by reference all of BCP 47, which has stabilized codes for all of the ISO language (2/3 letter), script (4-letter), and country (2-letter & 3 digit) codes that should be considered valid.

Mark

On Fri, Sep 18, 2009 at 10:17, Yoshito Umaoka wrote:

> I have several questions about getISO3Language and getISO3Country.
>
> With the support for some ISO 639-2/639-3/639-5 three letter codes in
> Locale, getISO3Language should be updated. For example,
>
> new Locale("kok").getISO3Language();
>
> should return "kok", because it's a three letter valid ISO 639-2 code,
> conforming to the BCP 47 language tag.
>
> Q1: What if an unassigned 3 letter code is used? For example -
>
> new Locale("zzz").getISO3Language();
>
> Q2: What if 3 an ill-formed 3 letter code is used? For example -
>
> new Locale("123").getISO3Language();
>
> Also, a question for getISO3Country. We will support UN M.49 area code.
>
> Q3. What if M.49 code is used? For example -
>
> new Locale("en", "029").getISO3Country()
>
> In addition to this, the current getISO3Country JavaDoc mentions ISO
> 3166-2, which looks inappropriate to me. What this method is handling is
> ISO 3166-1 alpha-2 codes and ISO 3166-1 alpha-3 codes. The API doc is
> referencing to http://www.davros.org/misc/iso3166.txt . It looks this is
> a personal web site and the data is already out of date. Unfortunately, ISO
> 3166 maintenance agency does not disclose ISO 3166-1 alpha-3 code data. But
> we obviously want a reference which is public and stable. I think wikipedia
> has a topic for ISO 3166-1 and contains the code list ->
> http://en.wikipedia.org/wiki/ISO_3166-1 I think this reference would work
> better than the current. What do you think?
>
> -Yoshito
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/locale-enhancement-dev/attachments/20090918/a2f65e2a/attachment.html

From y.umaoka at gmail.com Tue Sep 29 07:07:21 2009
From: y.umaoka at gmail.com (Yoshito Umaoka)
Date: Tue, 29 Sep 2009 10:07:21 -0400
Subject: [loc-en-dev] Updated JavaDoc to include protected APIs
Message-ID: <4AC21499.2050903@gmail.com>

When I generated the Javadoc last time, I only included public APIs. I regenerated the doc including protected APIs and posted it here - http://sites.google.com/site/openjdklocale/apis

The previous revision is saved on this page - http://sites.google.com/site/openjdklocale/apis/proposed-apis---rev1

-Yoshito
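
Coming back to the getISO3Language / getISO3Country questions earlier in this thread, here is a minimal sketch that simply probes what a given runtime does. It assumes a Java 7 or later runtime, where Locale.forLanguageTag is available; the class name Iso3Probe and the marker strings are made up for illustration. No particular result is claimed for the unassigned-code and M.49 cases, which is why both calls are wrapped in a try/catch for MissingResourceException.

```java
import java.util.Locale;
import java.util.MissingResourceException;

public class Iso3Probe {
    // Returns the ISO3 language, or a marker if the runtime has no mapping.
    static String iso3Language(Locale locale) {
        try {
            return locale.getISO3Language();
        } catch (MissingResourceException e) {
            return "<no ISO3 language>"; // hypothetical marker string
        }
    }

    // Returns the ISO3 country, or a marker if the runtime has no mapping.
    static String iso3Country(Locale locale) {
        try {
            return locale.getISO3Country();
        } catch (MissingResourceException e) {
            return "<no ISO3 country>"; // hypothetical marker string
        }
    }

    public static void main(String[] args) {
        // A registered three-letter language code.
        System.out.println(iso3Language(Locale.forLanguageTag("kok")));
        // Q1: well-formed but unassigned three-letter code.
        System.out.println(iso3Language(Locale.forLanguageTag("zzz")));
        // Q3: UN M.49 area code used as the region.
        System.out.println(iso3Country(Locale.forLanguageTag("en-029")));
        // An ordinary two-letter region, for comparison.
        System.out.println(iso3Country(Locale.forLanguageTag("en-US")));
    }
}
```

Q2 is left out on purpose: forLanguageTag does not accept "123" as a language subtag, so that case can only be reached through the Locale constructor.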