From mark at macchiato.com Thu Oct 7 08:07:51 2010
From: mark at macchiato.com (Mark Davis ☕)
Date: Thu, 7 Oct 2010 08:07:51 -0700
Subject: [loc-en-dev] Java Identifiers unstable?
Message-ID:

I know this isn't the right forum, but I'm not sure how to report it.

Unicode has mechanisms to guarantee that program identifiers are stable
over versions of Unicode, and defines properties that have that
guarantee: XID_Start and XID_Continue (see
http://unicode.org/reports/tr31/).

Sun was actually the one that brought up this issue, some 6 or 7 years
ago, prompting the Consortium to develop a definition that guaranteed
stability.

However, when I look at the documentation for isJavaIdentifierPart,
isUnicodeIdentifierPart, etc., it appears that these are defined not in
terms of those properties, but in terms of properties that are *not*
stable over releases.

http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#isJavaIdentifierPart(int)
etc.

That means that a program that was compiled under one release of Java
could fail under a future one, simply because its identifiers are no
longer valid under the new release.

It may be a matter of just the documentation being wrong, or it could be
the underlying implementation. Anyway, how can I surface this?

Mark

From naoto.sato at oracle.com Thu Oct 7 11:54:16 2010
From: naoto.sato at oracle.com (Naoto Sato)
Date: Thu, 07 Oct 2010 11:54:16 -0700
Subject: [loc-en-dev] Java Identifiers unstable?
In-Reply-To:
References:
Message-ID: <4CAE1758.3030308@oracle.com>

Hi Mark,

It looks like the javadoc has been the same since the JDK 1.1 days (the
"char" versions of the APIs), and AFAIK the implementation does not
follow TR31.

Do you remember who at Sun brought this issue to the consortium?
I would like to know how far he or she took this, but we will probably
end up filing a bug to correct the APIs and their implementations.

Naoto

From mark at macchiato.com Thu Oct 7 18:15:40 2010
From: mark at macchiato.com (Mark Davis ☕)
Date: Thu, 7 Oct 2010 18:15:40 -0700
Subject: [loc-en-dev] Java Identifiers unstable?
In-Reply-To: <4CAE1758.3030308@oracle.com>
References: <4CAE1758.3030308@oracle.com>
Message-ID:

I don't remember who it was at Sun.

For the unicodeStart and unicodePart methods (isUnicodeIdentifierStart
and isUnicodeIdentifierPart), you should either shift to the Unicode
definition or, at a minimum, document that it is not the same as the
Unicode definition. And then document whether or not you keep it stable.

For the javaStart and javaPart methods (isJavaIdentifierStart and
isJavaIdentifierPart), it really needs to be stable.
- You could keep basically the same definition, but note that
  characters may be added for backwards compatibility. Internally,
  whenever you update to a new version of Unicode, you would need to
  check that all previously accepted characters are retained; if any
  aren't, grandfather them in with a hard-coded list.
- Alternatively, you could base it on the Unicode definition, with a set
  of grandfathered characters for backwards compatibility.

Mark

*Il meglio è l'inimico del bene*
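The grandfathering approach suggested in Mark's reply (on each Unicode
update, check that every previously accepted character is still
accepted, and hard-code any that are not) could be sketched roughly as
follows. This is an illustration only, under stated assumptions: the
class GrandfatheredIdentifiers and its methods findDropped and
withGrandfathered are hypothetical names, not JDK APIs, and the "new
rule" in main is simulated by artificially rejecting 'b' rather than
taken from any real Unicode change.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.IntPredicate;

// Sketch only: all names here are hypothetical, not part of the JDK.
final class GrandfatheredIdentifiers {

    // Returns the code points that an earlier release accepted but the
    // current rule rejects, sorted so they can go into a hard-coded list.
    static SortedSet<Integer> findDropped(Set<Integer> previouslyAccepted,
                                          IntPredicate currentRule) {
        SortedSet<Integer> dropped = new TreeSet<>();
        for (int cp : previouslyAccepted) {
            if (!currentRule.test(cp)) {
                dropped.add(cp);
            }
        }
        return dropped;
    }

    // Builds a predicate that accepts whatever the current rule accepts,
    // plus every grandfathered code point.
    static IntPredicate withGrandfathered(IntPredicate currentRule,
                                          Set<Integer> grandfathered) {
        return cp -> currentRule.test(cp) || grandfathered.contains(cp);
    }

    public static void main(String[] args) {
        // Toy snapshot of code points accepted by an "old" release.
        Set<Integer> previous = new TreeSet<>();
        previous.add((int) 'a');
        previous.add((int) 'b');
        previous.add((int) '_');

        // Simulated "new" rule: the real JDK check, minus 'b', to mimic
        // a character that a Unicode update would otherwise drop.
        IntPredicate simulatedNewRule =
                cp -> cp != 'b' && Character.isJavaIdentifierPart(cp);

        SortedSet<Integer> dropped = findDropped(previous, simulatedNewRule);
        IntPredicate stableRule = withGrandfathered(simulatedNewRule, dropped);

        System.out.println("dropped: " + dropped);
        System.out.println("'b' still accepted: " + stableRule.test('b'));
    }
}
```

The same two helpers would serve the second option as well: pass a
predicate based on the Unicode definition as currentRule and keep the
grandfathered set as the compatibility list.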
From naoto.sato at oracle.com Fri Oct 8 10:51:21 2010
From: naoto.sato at oracle.com (Naoto Sato)
Date: Fri, 08 Oct 2010 10:51:21 -0700
Subject: [loc-en-dev] Java Identifiers unstable?
In-Reply-To:
References: <4CAE1758.3030308@oracle.com>
Message-ID: <4CAF5A19.2070009@oracle.com>

Filed bug 6990687 for this.

Naoto
From mark at macchiato.com Fri Oct 8 11:31:48 2010
From: mark at macchiato.com (Mark Davis ☕)
Date: Fri, 8 Oct 2010 11:31:48 -0700
Subject: [loc-en-dev] Java Identifiers unstable?
In-Reply-To: <4CAF5A19.2070009@oracle.com>
References: <4CAE1758.3030308@oracle.com> <4CAF5A19.2070009@oracle.com>
Message-ID:

Thanks,

Mark

*Il meglio è l'inimico del bene*
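For readers who want to try the methods discussed in this thread, here
is a minimal sketch of checking a string with
Character.isJavaIdentifierStart(int) and
Character.isJavaIdentifierPart(int). The class IdentifierCheck and the
method isJavaIdentifier are hypothetical names used for illustration.
The sketch does not exclude reserved words, and because the two
Character methods are the ones whose stability is questioned above, the
answer for a given string may depend on the Java release it runs on.

```java
// Sketch only: IdentifierCheck is a hypothetical helper, not a JDK class.
final class IdentifierCheck {

    // Returns true if the first code point passes isJavaIdentifierStart
    // and every following code point passes isJavaIdentifierPart.
    // Reserved words such as "class" are NOT rejected by this check.
    static boolean isJavaIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Step by code point so supplementary characters are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"foo1", "1foo", "_x", "a-b"}) {
            System.out.println(candidate + " -> " + isJavaIdentifier(candidate));
        }
    }
}
```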