From naoto at openjdk.org Wed Jul 2 16:25:44 2025 From: naoto at openjdk.org (Naoto Sato) Date: Wed, 2 Jul 2025 16:25:44 GMT Subject: Withdrawn: 8360774: Use text representation of normalization data files In-Reply-To: References: Message-ID: On Fri, 27 Jun 2025 20:45:14 GMT, Naoto Sato wrote: > The ICU4J component currently stores binary data files directly in the repository. This change replaces them with base64-encoded text files and converts them to binary during the build process This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/26027 From naoto at openjdk.org Mon Jul 7 19:13:15 2025 From: naoto at openjdk.org (Naoto Sato) Date: Mon, 7 Jul 2025 19:13:15 GMT Subject: RFR: 8361519: Obsolete Unicode Scalar Value link in Character class Message-ID: Refining the description of "Unicode Scalar Value" in the `Character` class. The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. 
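For readers unfamiliar with the term: the Unicode glossary defines a scalar value as any code point except the high- and low-surrogate code points. A minimal sketch of that definition and of the U+xxxx notation follows. The class and method names are hypothetical and are not part of `Character`.

```java
public class ScalarValues {
    /** True if cp is a valid code point outside the surrogate range (hypothetical helper). */
    static boolean isScalarValue(int cp) {
        return Character.isValidCodePoint(cp)
                && !(cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE);
    }

    /** Formats a code point in U+xxxx notation, using at least four hex digits. */
    static String uPlus(int cp) {
        return String.format("U+%04X", cp);
    }

    public static void main(String[] args) {
        System.out.println(uPlus(0x41) + " scalar? " + isScalarValue(0x41));
        System.out.println(uPlus(0xD800) + " scalar? " + isScalarValue(0xD800));
        System.out.println(uPlus(0x1F600) + " scalar? " + isScalarValue(0x1F600));
    }
}
```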
------------- Commit messages: - initial commit Changes: https://git.openjdk.org/jdk/pull/26169/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26169&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8361519 Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26169.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26169/head:pull/26169 PR: https://git.openjdk.org/jdk/pull/26169 From iris at openjdk.org Mon Jul 7 19:30:40 2025 From: iris at openjdk.org (Iris Clark) Date: Mon, 7 Jul 2025 19:30:40 GMT Subject: RFR: 8361519: Obsolete Unicode Scalar Value link in Character class In-Reply-To: References: Message-ID: On Mon, 7 Jul 2025 19:08:22 GMT, Naoto Sato wrote: > Refining the description of "Unicode Scalar Value" in the `Character` class. > The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. Looks good. Since this appears to be the only reference to Unicode 3.1.0 in the repo, this change should drop the line referencing "Unicode 3.1.0" in the list of external specifications, e.g.: https://download.java.net/java/early_access/jdk26/docs/api/external-specs.html ------------- Marked as reviewed by iris (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26169#pullrequestreview-2995055834 From naoto at openjdk.org Mon Jul 7 20:24:21 2025 From: naoto at openjdk.org (Naoto Sato) Date: Mon, 7 Jul 2025 20:24:21 GMT Subject: RFR: 8361519: Obsolete Unicode Scalar Value link in Character class [v2] In-Reply-To: References: Message-ID: > Refining the description of "Unicode Scalar Value" in the `Character` class. 
> The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: Some more Unicode related spec clean up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26169/files - new: https://git.openjdk.org/jdk/pull/26169/files/44cf51fa..69089cbc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26169&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26169&range=00-01 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/26169.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26169/head:pull/26169 PR: https://git.openjdk.org/jdk/pull/26169 From iris at openjdk.org Mon Jul 7 20:37:38 2025 From: iris at openjdk.org (Iris Clark) Date: Mon, 7 Jul 2025 20:37:38 GMT Subject: RFR: 8361519: Obsolete Unicode Scalar Value link in Character class [v2] In-Reply-To: References: Message-ID: <5C6YrFFpsog2GZHoVNYRi_6J9cHeKSxYaaXKUdX7Pnw=.891f3c24-5ae3-409a-ab06-098500034f1d@github.com> On Mon, 7 Jul 2025 20:24:21 GMT, Naoto Sato wrote: >> Refining the description of "Unicode Scalar Value" in the `Character` class. >> The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. 
> > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Some more Unicode related spec clean up Marked as reviewed by iris (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26169#pullrequestreview-2995201943 From jwaters at openjdk.org Tue Jul 8 01:32:52 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 8 Jul 2025 01:32:52 GMT Subject: RFR: 8342868: Errors related to unused code on Windows after 8339120 in core libs [v2] In-Reply-To: References: Message-ID: On Thu, 31 Oct 2024 05:43:11 GMT, Julian Waters wrote: >> After 8339120, gcc began catching many different instances of unused code in the Windows specific codebase. Some of these seem to be bugs. I've taken the effort to mark out all the relevant globals and locals that trigger the unused warnings and addressed all of them by commenting out the code as appropriate. I am confident that in many cases this simplistic approach of commenting out code does not fix the underlying issue, and the warning actually found a bug that should be fixed. In these instances, I will be aiming to fix these bugs with help from reviewers, so I recommend anyone reviewing who knows more about the code than I do to see whether there is indeed a bug that needs fixing in a different way than what I did > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Remove the got local Sorry for waiting so long. 
It's become clear that I won't be able to get awt and accessibility up to speed for a long time, so I will go ahead with this one first ------------- PR Comment: https://git.openjdk.org/jdk/pull/21654#issuecomment-3047051283 From jwaters at openjdk.org Tue Jul 8 01:32:52 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 8 Jul 2025 01:32:52 GMT Subject: Integrated: 8342868: Errors related to unused code on Windows after 8339120 in core libs In-Reply-To: References: Message-ID: On Wed, 23 Oct 2024 04:40:50 GMT, Julian Waters wrote: > After 8339120, gcc began catching many different instances of unused code in the Windows specific codebase. Some of these seem to be bugs. I've taken the effort to mark out all the relevant globals and locals that trigger the unused warnings and addressed all of them by commenting out the code as appropriate. I am confident that in many cases this simplistic approach of commenting out code does not fix the underlying issue, and the warning actually found a bug that should be fixed. In these instances, I will be aiming to fix these bugs with help from reviewers, so I recommend anyone reviewing who knows more about the code than I do to see whether there is indeed a bug that needs fixing in a different way than what I did This pull request has now been integrated. 
Changeset: bbc5c98b Author: Julian Waters URL: https://git.openjdk.org/jdk/commit/bbc5c98b144014a0423d666f74c4a5a15b08a7c2 Stats: 19 lines in 4 files changed: 9 ins; 0 del; 10 mod 8342868: Errors related to unused code on Windows after 8339120 in core libs Reviewed-by: naoto, jlu ------------- PR: https://git.openjdk.org/jdk/pull/21654 From dholmes at openjdk.org Tue Jul 8 01:51:47 2025 From: dholmes at openjdk.org (David Holmes) Date: Tue, 8 Jul 2025 01:51:47 GMT Subject: RFR: 8342868: Errors related to unused code on Windows after 8339120 in core libs [v2] In-Reply-To: References: Message-ID: On Tue, 8 Jul 2025 01:28:04 GMT, Julian Waters wrote: >> Julian Waters has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove the got local > > Sorry for waiting so long. It's become clear that I won't be able to get awt and accessibility up to speed for a long time, so I will go ahead with this one first @TheShermanTanker the commented out code really should have been deleted, not just left commented out. Please file another JBS issue to have this cleaned up so it is not forgotten. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/21654#issuecomment-3047084740 From jpai at openjdk.org Tue Jul 8 04:22:48 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Tue, 8 Jul 2025 04:22:48 GMT Subject: RFR: 8342868: Errors related to unused code on Windows after 8339120 in core libs [v2] In-Reply-To: References: Message-ID: On Thu, 31 Oct 2024 05:43:11 GMT, Julian Waters wrote: >> After 8339120, gcc began catching many different instances of unused code in the Windows specific codebase. Some of these seem to be bugs. I've taken the effort to mark out all the relevant globals and locals that trigger the unused warnings and addressed all of them by commenting out the code as appropriate. 
I am confident that in many cases this simplistic approach of commenting out code does not fix the underlying issue, and the warning actually found a bug that should be fixed. In these instances, I will be aiming to fix these bugs with help from reviewers, so I recommend anyone reviewing who knows more about the code than I do to see whether there is indeed a bug that needs fixing in a different way than what I did > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Remove the got local > Since your change was applied there have been 3762 commits pushed to the master branch It's usually risky to be integrating a PR which is so far behind the master branch, without first merging the latest changes and running the tier tests. In this case it hasn't caused any failures in the tier testing so far. ------------- PR Comment: https://git.openjdk.org/jdk/pull/21654#issuecomment-3047304336 From jwaters at openjdk.org Tue Jul 8 11:30:53 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 8 Jul 2025 11:30:53 GMT Subject: RFR: 8342868: Errors related to unused code on Windows after 8339120 in core libs [v2] In-Reply-To: References: Message-ID: On Tue, 8 Jul 2025 04:20:32 GMT, Jaikiran Pai wrote: > > Since your change was applied there have been 3762 commits pushed to the master branch > > It's usually risky to be integrating a PR which is so far behind the master branch, without first merging the latest changes and running the tier tests. In this case it hasn't caused any failures in the tier testing so far. I'll keep that in mind next time I'm submitting another Pull Request. 
Fortunately in this case the code touched was dead code and in an area that isn't touched often, so shouldn't cause major issues ------------- PR Comment: https://git.openjdk.org/jdk/pull/21654#issuecomment-3048519346 From jwaters at openjdk.org Tue Jul 8 11:36:01 2025 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 8 Jul 2025 11:36:01 GMT Subject: RFR: 8342868: Errors related to unused code on Windows after 8339120 in core libs [v2] In-Reply-To: References: Message-ID: On Tue, 8 Jul 2025 01:49:14 GMT, David Holmes wrote: >> Sorry for waiting so long. It's become clear that I won't be able to get awt and accessibility up to speed for a long time, so I will go ahead with this one first > > @TheShermanTanker the commented out code really should have been deleted, not just left commented out. Please file another JBS issue to have this cleaned up so it is not forgotten. Thanks. @dholmes-ora Sorry about that. Here's the issue as was requested: https://bugs.openjdk.org/browse/JDK-8361593 ------------- PR Comment: https://git.openjdk.org/jdk/pull/21654#issuecomment-3048532200 From naoto at openjdk.org Tue Jul 8 17:16:43 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 8 Jul 2025 17:16:43 GMT Subject: RFR: 8361519: Obsolete Unicode Scalar Value link in Character class [v2] In-Reply-To: References: Message-ID: On Mon, 7 Jul 2025 20:24:21 GMT, Naoto Sato wrote: >> Refining the description of "Unicode Scalar Value" in the `Character` class. >> The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. 
> > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Some more Unicode related spec clean up Thanks for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26169#issuecomment-3049711897 From naoto at openjdk.org Tue Jul 8 17:16:44 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 8 Jul 2025 17:16:44 GMT Subject: Integrated: 8361519: Obsolete Unicode Scalar Value link in Character class In-Reply-To: References: Message-ID: On Mon, 7 Jul 2025 19:08:22 GMT, Naoto Sato wrote: > Refining the description of "Unicode Scalar Value" in the `Character` class. > The original description referenced the outdated Unicode 3.1 specification, which previously included the U+xxxx notation but no longer does. Updated the reference to point to the Unicode glossary, which defines the term more accurately. Additionally, replaced the obsolete `@spec` link to Unicode 3.1.0 with a reference to the current Unicode Character Database. This pull request has now been integrated. Changeset: 5850bf44 Author: Naoto Sato URL: https://git.openjdk.org/jdk/commit/5850bf4488ea336c3dd4eafbefb8ade330e2f76a Stats: 13 lines in 2 files changed: 0 ins; 2 del; 11 mod 8361519: Obsolete Unicode Scalar Value link in Character class Reviewed-by: iris ------------- PR: https://git.openjdk.org/jdk/pull/26169 From naoto at openjdk.org Wed Jul 9 18:43:48 2025 From: naoto at openjdk.org (Naoto Sato) Date: Wed, 9 Jul 2025 18:43:48 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes Message-ID: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). 
Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement ------------- Commit messages: - initial commit Changes: https://git.openjdk.org/jdk/pull/26225/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26225&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8361717 Stats: 5 lines in 2 files changed: 0 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/26225.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26225/head:pull/26225 PR: https://git.openjdk.org/jdk/pull/26225 From bpb at openjdk.org Wed Jul 9 18:54:38 2025 From: bpb at openjdk.org (Brian Burkhalter) Date: Wed, 9 Jul 2025 18:54:38 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement Looks fine. ------------- Marked as reviewed by bpb (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/26225#pullrequestreview-3002724026 From jlu at openjdk.org Wed Jul 9 18:57:38 2025 From: jlu at openjdk.org (Justin Lu) Date: Wed, 9 Jul 2025 18:57:38 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement Looks good. Thanks for fixing this. ------------- Marked as reviewed by jlu (Committer). PR Review: https://git.openjdk.org/jdk/pull/26225#pullrequestreview-3002729901 From duke at openjdk.org Wed Jul 9 19:11:39 2025 From: duke at openjdk.org (Johannes =?UTF-8?B?RMO2Ymxlcg==?=) Date: Wed, 9 Jul 2025 19:11:39 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > modernizing the code by using List.of() is still a desirable improvement except that `Collections.emptyList()` and `List.of()` unfortunately have different tolerance to calls `List.indexOf(null)` and `List.contains(null)`. 
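The difference in null tolerance can be observed with a small sketch like the one below. The class and helper names are hypothetical. Only `contains(null)` is probed here, on the assumption that it is the simplest way to show the behavior on JDK 9 and later.

```java
import java.util.Collections;
import java.util.List;

public class EmptyListNullTolerance {
    /** Runs the action and reports whether it threw NullPointerException (hypothetical helper). */
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> legacy = Collections.emptyList();
        List<String> modern = List.of();

        // Collections.emptyList() tolerates a null argument and simply answers false.
        System.out.println("emptyList().contains(null) throws NPE: "
                + throwsNpe(() -> legacy.contains(null)));

        // List.of() is null-hostile: a null query is rejected with NullPointerException.
        System.out.println("List.of().contains(null) throws NPE: "
                + throwsNpe(() -> modern.contains(null)));
    }
}
```

Code that may be handed either list, and that may be queried with `null`, is the case where swapping one for the other changes behavior.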
------------- PR Comment: https://git.openjdk.org/jdk/pull/26225#issuecomment-3053718956 From liach at openjdk.org Wed Jul 9 19:30:40 2025 From: liach at openjdk.org (Chen Liang) Date: Wed, 9 Jul 2025 19:30:40 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement Both lists aren't returned to users, so their null-hostile contains behavior has no impact. ------------- Marked as reviewed by liach (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26225#pullrequestreview-3002824059 From liach at openjdk.org Wed Jul 9 19:38:41 2025 From: liach at openjdk.org (Chen Liang) Date: Wed, 9 Jul 2025 19:38:41 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. 
This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement An effort #25922 exists to make the AOT/CDS requirements more obvious to core libraries. Hope we can have this easier down the road. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26225#issuecomment-3053781110 From cstein at openjdk.org Thu Jul 10 05:39:38 2025 From: cstein at openjdk.org (Christian Stein) Date: Thu, 10 Jul 2025 05:39:38 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement Marked as reviewed by cstein (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/26225#pullrequestreview-3004002450 From naoto at openjdk.org Thu Jul 10 16:10:54 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 10 Jul 2025 16:10:54 GMT Subject: RFR: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement Thanks for the reviews! > except that `Collections.emptyList()` and `List.of()` unfortunately have different tolerance to calls `List.indexOf(null)` and `List.contains(null)`. True in general, but as @liach mentioned, those singletons are internal use only so the difference is not relevant here. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26225#issuecomment-3058072009 From naoto at openjdk.org Thu Jul 10 16:10:54 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 10 Jul 2025 16:10:54 GMT Subject: Integrated: 8361717: Refactor Collections.emptyList() in Locale related classes In-Reply-To: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> References: <5o7ommmTBTXH9udy76qajqgUNSF89aiQ4ptsebC3v3o=.01b18594-1798-41a7-972f-15572d6b4ace@github.com> Message-ID: <5JciFlhCUpoKPw9lljQnP-04WwWDCQu4q5x04ln9YLI=.4766c2d7-bd7d-4e9e-8a57-4e751887d23c@github.com> On Wed, 9 Jul 2025 18:39:40 GMT, Naoto Sato wrote: > Replaced Collections.emptyList() with List.of() as part of refactoring. This was discussed in the context of investigating a CDS-related issue (https://bugs.openjdk.org/browse/JDK-8357281?focusedId=14796714&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14796714). Although the root cause was ultimately determined to be user error, modernizing the code by using List.of() is still a desirable improvement This pull request has now been integrated. Changeset: f5afbbd3 Author: Naoto Sato URL: https://git.openjdk.org/jdk/commit/f5afbbd32a0f46973664a228e6799fb1a958cd51 Stats: 5 lines in 2 files changed: 0 ins; 2 del; 3 mod 8361717: Refactor Collections.emptyList() in Locale related classes Reviewed-by: bpb, jlu, liach, cstein ------------- PR: https://git.openjdk.org/jdk/pull/26225 From sherman at openjdk.org Mon Jul 14 04:58:45 2025 From: sherman at openjdk.org (Xueming Shen) Date: Mon, 14 Jul 2025 04:58:45 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char Message-ID: Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. 
This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be "closed." To conform with Level 1 of UTS #18, specifically RL1.5: Simple Loose Matches, simple case folding must be applied to literals and (optionally) to character classes. When applied to character classes, each character class is expected to **be closed under simple case folding**. See the standard for the detailed explanation and example of "closed". **RL1.5 states**: To meet this requirement, an implementation that supports case-insensitive matching should 1. Provide at least the simple, default Unicode case-insensitive matching, and 2. Specify which character properties or constructs are closed under the matching. **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: 1. back-refs 2. string slices (sequences) 3. single character, 4. character families (Unicode Properties ...), and 5. character class ranges **Note**: Single characters and families may appear independently or within a character class. For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. For example: Pattern.compile("(?ui)\u017f").matcher("S").matches() => true Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true The character properties (families) are not "closed" and should remain unchanged. 
This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false vs Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches() => true vs Perl (Perl also claims to support Unicode's loose matching with its "i" modifier) perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"' => false perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"' => **_true_** The root issue is that the range construct is not implemented to be closed under simple case folding. Applying toUpperCase() and toLowerCase() to a range like [\u0170-\u0180] does not produce a meaningful or valid range for case-folding comparisons. For example [\u0170-\u0180] => [\u0053-\u243] with uppercase conversion. **What This PR Does** This PR adds support for ensuring that character class ranges are closed under simple case folding when the (?ui) (Unicode case-insensitive) flag is used, bringing Pattern into better conformance with UTS #18 Level 1 (RL1.5). **Notes** **(1) The PR also tries to fix a special corner case for U+00DF** see: https://codepoints.net/U+00DF vs https://codepoints.net/U+1E9E?lang=en for more context. Pattern.compile("(?ui)\u00df").matcher("\u1e9e").matches() => false Pattern.compile("(?ui)\u1e9e").matcher("\u00df").matches() => false vs perl -C -e 'print "\x{1e9e}" =~ /\x{df}/ ? "true\n" : "false\n"' => false perl -C -e 'print "\x{df}" =~ /\x{1e9e}/ ? "true\n" : "false\n"' => false perl -C -e 'print "\x{1e9e}" =~ /\x{df}/i ? "true\n" : "false\n"' => true perl -C -e 'print "\x{df}" =~ /\x{1e9e}/i ? "true\n" : "false\n"' => true The Java Character class still CORRECTLY returns U+00DF as the upper case of U+00DF, as specified by Unicode. 
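The simple case mappings involved here can be checked directly with the `Character` API. A minimal sketch follows. The class and helper names are hypothetical, and the helper only mirrors the kind of upper/lower comparison described in this message, not the actual `Pattern` internals.

```java
import java.util.Locale;

public class SharpSCaseMapping {
    /** True when the simple (1:1) upper- and lowercase mappings of cp differ (hypothetical helper). */
    static boolean hasDistinctSimpleCases(int cp) {
        return Character.toUpperCase(cp) != Character.toLowerCase(cp);
    }

    public static void main(String[] args) {
        int sharpS = 0x00DF;        // LATIN SMALL LETTER SHARP S
        int capitalSharpS = 0x1E9E; // LATIN CAPITAL LETTER SHARP S

        // The simple uppercase mapping of U+00DF is U+00DF itself.
        System.out.println(Integer.toHexString(Character.toUpperCase(sharpS)));        // df
        // The simple lowercase mapping of U+1E9E is U+00DF.
        System.out.println(Integer.toHexString(Character.toLowerCase(capitalSharpS))); // df
        // A check based only on upper != lower sees no case distinction for U+00DF.
        System.out.println(hasDistinctSimpleCases(sharpS));                            // false
        // The full (1:M) mapping is only available through String.
        System.out.println("\u00df".toUpperCase(Locale.ROOT));                         // SS
    }
}
```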
So our toUpperCase() != toLowerCase() in single() implementation fails to pick SingleU for case-insensitive matching as expected. Integer.toHexString(Character.toUpperCase('\u00df')) => 0xdf **(2) Known limitations: 3 'S'-like characters still fail** There are 3 characters whose case folding mappings (per CaseFolding.txt) are not captured by our current logic, which relies only on Java's toUpperCase()/toLowerCase() conversions. These characters cannot be matched across constructs like back-ref, slice, single, or range using the current API. We will leave them unchanged for now, pending a possible migration to a pure case folding based matching implementation. 1FD3; S; 0390; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA 1FE3; S; 03B0; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA FB05; S; FB06; # LATIN SMALL LIGATURE LONG S T **Refs**: https://bugs.openjdk.org/browse/JDK-6486934 https://bugs.openjdk.org/browse/CCC-6486934 https://cr.openjdk.org/~sherman/6486934_6233084_6504326_6436458/ We are fixing an almost 20-year old bug :-) ------------- Commit messages: - 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char Changes: https://git.openjdk.org/jdk/pull/26285/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8360459 Stats: 2044 lines in 8 files changed: 2040 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285 PR: https://git.openjdk.org/jdk/pull/26285 From liach at openjdk.org Mon Jul 14 05:12:40 2025 From: liach at openjdk.org (Chen Liang) Date: Mon, 14 Jul 2025 05:12:40 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char In-Reply-To: References: Message-ID: On Mon, 14 Jul 2025 04:53:13 GMT, Xueming Shen wrote: > Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular 
Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. > > This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be "closed." > > To conform with Level 1 of UTS #18, specifically RL1.5: Simple Loose Matches, simple case folding must be applied to literals and (optionally) to character classes. When applied to character classes, each character class is expected to **be closed under simple case folding**. See the standard for the detailed explanation and example of "closed". > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-insensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches() 
=> true
> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true
>
> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this).
>
> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report.
>
> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false
> vs
> Pattern.compile("(?ui)[S-S]")....

make/jdk/src/classes/build/tools/generatecharacter/CaseFolding.java line 45:

> 43: var caseFoldingTxt = Paths.get(args[1]);
> 44: var genSrcFile = Paths.get(args[2]);
> 45: var supportedTypes = "^.*; [CTS]; .*$";

Do we still need T here given you already have a hardcoded special case?

make/jdk/src/classes/build/tools/generatecharacter/CaseFolding.java line 60:

> 58: .map(cols -> String.format(" entry(0x%s, 0x%s),", cols[0], cols[2]))
> 59: .collect(Collectors.joining("\n"))
> 60: .replaceFirst(",$", ""); // remove the last ','

Suggestion:

    .map(cols -> String.format(" entry(0x%s, 0x%s)", cols[0], cols[2]))
    .collect(Collectors.joining(",\n", "", "\n")); // remove the last ','

make/jdk/src/classes/build/tools/generatecharacter/CaseFolding.java line 74:

> 72: StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
> 73: } catch (IOException e) {
> 74: e.printStackTrace();

I recommend removing this catch and adding `throws Throwable` to the signature of `main`.

src/java.base/share/classes/jdk/internal/util/regex/CaseFolding.java.template line 36:

> 34: public final class CaseFolding {
> 35:
> 36: private static Map expanded_casefolding = Map.ofEntries(

Suggestion:

    private static final Map expanded_casefolding = Map.ofEntries(

src/java.base/share/classes/jdk/internal/util/regex/CaseFolding.java.template line 99:

> 97: */
> 98: public static int[] getClassRangeClosingCharacters(int start, int end) {
> 99: int[] expanded = new int[expanded_casefolding.size()];

Can be
`Math.min(expanded_casefolding.size(), end - start)` in case the table grows large, and update the `off < expanded.length` check below too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2203858280 PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2203854636 PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2203852720 PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2203850027 PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2203851719 From sherman at openjdk.org Mon Jul 14 07:30:45 2025 From: sherman at openjdk.org (Xueming Shen) Date: Mon, 14 Jul 2025 07:30:45 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char In-Reply-To: References: Message-ID: <6AFn-UzXqb_oY_XGpadyiepteLDrlHJsSqZfXrybPug=.c8454bbe-15df-463a-8e6c-53022ece337b@github.com> On Mon, 14 Jul 2025 05:01:17 GMT, Chen Liang wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?closed.? >> >> To conform with Level 1 of UTS #18, specifically RL1.5: Simple Loose Matches, simple case folding must be applied to literals and (optionally) to character classes. When applied to character classes, each character class is expected to **be closed under simple case folding**. See the standard for the detailed explanation and example of "closed". 
>> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u... > > src/java.base/share/classes/jdk/internal/util/regex/CaseFolding.java.template line 99: > >> 97: */ >> 98: public static int[] getClassRangeClosingCharacters(int start, int end) { >> 99: int[] expanded = new int[expanded_casefolding.size()]; > > Can be `Math.min(expanded_casefolding.size(), end - start)` in case the table grows large, and update the `off < expanded.length` check below too. 
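Editorial sketch of the sizing suggestion quoted above, not the code from the PR. The number of closing characters can never exceed either the table size or the number of code points in the range, so the smaller of the two is a safe array length. The class name, the method name `closingCharacters`, and the three-entry `SAMPLE_FOLDING` table are hypothetical stand-ins for the generated table, and the range is assumed to be inclusive (hence `end - start + 1`). The sketch also switches to scanning the table when the range is at least as large as the table.

```java
import java.util.Arrays;
import java.util.Map;

public class RangeClosingSketch {
    // Hypothetical three-entry sample; the real table is generated from CaseFolding.txt.
    static final Map<Integer, Integer> SAMPLE_FOLDING = Map.ofEntries(
            Map.entry(0x017F, 0x0073),   // LATIN SMALL LETTER LONG S -> s
            Map.entry(0x212A, 0x006B),   // KELVIN SIGN -> k
            Map.entry(0x1E9E, 0x00DF));  // LATIN CAPITAL LETTER SHARP S -> sharp s

    // Returns the folded forms of every table entry whose key lies in [start, end].
    public static int[] closingCharacters(int start, int end) {
        if (end < start) {
            return new int[0];
        }
        int rangeSize = end - start + 1;                    // inclusive range (assumption)
        int[] out = new int[Math.min(SAMPLE_FOLDING.size(), rangeSize)];
        int off = 0;
        if (rangeSize < SAMPLE_FOLDING.size()) {
            // Small range: probe the table once per code point in the range.
            for (int cp = start; cp <= end; cp++) {
                Integer folded = SAMPLE_FOLDING.get(cp);
                if (folded != null) {
                    out[off++] = folded;
                }
            }
        } else {
            // Large range: walk the table and keep the keys that fall inside the range.
            for (Map.Entry<Integer, Integer> e : SAMPLE_FOLDING.entrySet()) {
                if (e.getKey() >= start && e.getKey() <= end) {
                    out[off++] = e.getValue();
                }
            }
        }
        int[] result = Arrays.copyOf(out, off);
        Arrays.sort(result);                                // deterministic order for callers
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(closingCharacters(0x017F, 0x017F)));
        System.out.println(Arrays.toString(closingCharacters(0, 0x10FFFF)));
    }
}
```

Either branch writes at most `out.length` elements because the table keys are distinct, so no bounds check on `off` is needed in this version.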
The table itself probably isn't going to grow significantly anytime soon, and we'll likely have enough time to adjust if CaseFolding.txt does get substantially bigger.

That said, I probably should consider reversing the lookup logic: instead of iterating through [start, end], we could iterate over the expansion table and check whether any of its code points fall within the input range, at least when the range size is larger than the size of the table. That is roughly O(range size) versus a cost bounded by the small, fixed table size.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2204044141

From sherman at openjdk.org Mon Jul 14 07:54:31 2025
From: sherman at openjdk.org (Xueming Shen)
Date: Mon, 14 Jul 2025 07:54:31 GMT
Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v2]
In-Reply-To:
References:
Message-ID: <9h1T1edYRoTT3v5CPkY9DN9Lq0bSnDoU8VtK2xn4sIA=.9b57eb79-b59e-4b59-a2ed-94b68735c04f@github.com>

> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters.
>
> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be "closed."
>
> To conform with Level 1 of UTS #18, specifically RL1.5: Simple Loose Matches, simple case folding must be applied to literals and (optionally) to character classes. When applied to character classes, each character class is expected to **be closed under simple case folding**. See the standard for the detailed explanation and example of "closed".
> > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").... 
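Editorial sketch: the checks quoted in the description above can be reproduced with a small program. The class and method names are made up for illustration. The first three results are the ones the description reports as `true`; the last one is the range case this PR is about, so its result depends on whether the JDK running it includes the fix, and no expected value is given for it.

```java
import java.util.regex.Pattern;

public class LooseMatchDemo {
    // Compiles the body with the embedded flags (?ui), i.e. UNICODE_CASE plus
    // CASE_INSENSITIVE, and tests the whole input against it.
    public static boolean looseMatches(String body, String input) {
        return Pattern.compile("(?ui)" + body).matcher(input).matches();
    }

    public static void main(String[] args) {
        String longS = "\u017f"; // LATIN SMALL LETTER LONG S

        // Single character, alone and inside a character class.
        System.out.println("single:           " + looseMatches(longS, "S"));
        System.out.println("class, single:    " + looseMatches("[" + longS + "]", "S"));

        // Range written with the ASCII letter, input is the long s.
        System.out.println("range [S-S]:      " + looseMatches("[S-S]", longS));

        // Range written with the long s, input is the ASCII letter.
        // This is the case from the bug report; the result is JDK-version dependent.
        System.out.println("range [U+017F-U+017F]: "
                + looseMatches("[" + longS + "-" + longS + "]", "S"));
    }
}
```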
Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: update to address the review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26285/files - new: https://git.openjdk.org/jdk/pull/26285/files/640d7a61..735bd722 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=00-01 Stats: 40 lines in 2 files changed: 7 ins; 12 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/26285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285 PR: https://git.openjdk.org/jdk/pull/26285 From sherman at openjdk.org Mon Jul 14 07:58:39 2025 From: sherman at openjdk.org (Xueming Shen) Date: Mon, 14 Jul 2025 07:58:39 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v2] In-Reply-To: References: Message-ID: On Mon, 14 Jul 2025 05:08:58 GMT, Chen Liang wrote: >> Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: >> >> update to address the review comments > > make/jdk/src/classes/build/tools/generatecharacter/CaseFolding.java line 45: > >> 43: var caseFoldingTxt = Paths.get(args[1]); >> 44: var genSrcFile = Paths.get(args[2]); >> 45: var supportedTypes = "^.*; [CTS]; .*$"; > > Do we still need T here given you already have a hardcoded special case? 
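Editorial sketch of what the quoted `supportedTypes` filter selects. Only the regular expression and the use of `cols[0]` and `cols[2]` come from the quoted hunks; the class name, the `toEntries` helper, the use of `String.matches`, and the `split("; ")` step are assumptions made so the example is self-contained. The input is a handful of lines in the CaseFolding.txt format.

```java
import java.util.List;
import java.util.stream.Collectors;

public class StatusFilterSketch {
    // Same expression as in the quoted hunk: keep lines whose status is C, T or S.
    static final String SUPPORTED_TYPES = "^.*; [CTS]; .*$";

    // Turns the selected lines into "entry(0x<code>, 0x<mapping>)" strings.
    public static List<String> toEntries(List<String> lines) {
        return lines.stream()
                .filter(line -> line.matches(SUPPORTED_TYPES))
                .map(line -> line.split("; "))   // code; status; mapping; # name
                .map(cols -> String.format("entry(0x%s, 0x%s)", cols[0], cols[2]))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "0049; C; 0069; # LATIN CAPITAL LETTER I",
                "0049; T; 0131; # LATIN CAPITAL LETTER I",
                "00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S",
                "0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE",
                "1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S");
        // The F (full folding) line is dropped; the C, T and S lines are kept.
        toEntries(sample).forEach(System.out::println);
    }
}
```

With `T` in the character set, both Turkic entries for the `I` characters pass the filter, which is the situation the question above is asking about.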
Yes, there is another T entry for the 'I's that is picked up by the logic.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2204103748

From naoto at openjdk.org Mon Jul 14 18:14:48 2025
From: naoto at openjdk.org (Naoto Sato)
Date: Mon, 14 Jul 2025 18:14:48 GMT
Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v2]
In-Reply-To: <9h1T1edYRoTT3v5CPkY9DN9Lq0bSnDoU8VtK2xn4sIA=.9b57eb79-b59e-4b59-a2ed-94b68735c04f@github.com>
References: <9h1T1edYRoTT3v5CPkY9DN9Lq0bSnDoU8VtK2xn4sIA=.9b57eb79-b59e-4b59-a2ed-94b68735c04f@github.com>
Message-ID:

On Mon, 14 Jul 2025 07:54:31 GMT, Xueming Shen wrote:

>> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters.
>>
>> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be "closed."
>>
>> To conform with Level 1 of UTS #18, specifically RL1.5: Simple Loose Matches, simple case folding must be applied to literals and (optionally) to character classes. When applied to character classes, each character class is expected to **be closed under simple case folding**. See the standard for the detailed explanation and example of "closed".
>>
>> **RL1.5 states**:
>>
>> To meet this requirement, an implementation that supports case-sensitive matching should
>>
>> 1. Provide at least the simple, default Unicode case-insensitive matching, and
>> 2. Specify which character properties or constructs are closed under the matching.
>>
>> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity:
>>
>> 1. back-refs
>> 2. string slices (sequences)
>> 3. single character,
>> 4. character families (Unicode Properties ...), and
>> 5. character class ranges
>>
>> **Note**: Single characters and families may appear independently or within a character class.
>>
>> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding.
>>
>> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**.
>>
>> For example:
>>
>> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true
>> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true
>>
>> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this).
>>
>> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report.
>>
>> Pattern.compile("(?ui)[\u017f-\u...
>
> Xueming Shen has updated the pull request incrementally with one additional commit since the last revision:
>
> update to address the review comments

Looks good. Thanks for adding case folding support, which is long overdue. Since this adds new case folding support for character class ranges, I think a CSR and a release note should be considered.
make/jdk/src/classes/build/tools/generatecharacter/CaseFolding.java line 73:

> 71: StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
> 72: }
> 73: }

Needs a newline at the end of the file.

test/jdk/java/util/regex/CaseFoldingTest.java line 30:

> 28: * @library /lib/testlibrary/java/lang
> 29: * @author Xueming Shen
> 30: * @run testng CaseFoldingTest

Since this is a new test, I think we prefer JUnit over TestNG.

test/jdk/java/util/regex/CaseFoldingTest.java line 61:

> 59:
> 60: var results = Files.readAllLines(UCDFiles.CASEFOLDING)
> 61: .stream()

Files.lines() may be more concise.

test/jdk/lib/testlibrary/java/lang/UCDFiles.java line 59:

> 57: UCD_DIR.resolve("emoji").resolve("emoji-data.txt");
> 58: public static Path CASEFOLDING =
> 59: UCD_DIR.resolve("CaseFolding.txt");

Copyright year -> 2025

-------------

PR Review: https://git.openjdk.org/jdk/pull/26285#pullrequestreview-3017279774
PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2205510750
PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2205508784
PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2205517080
PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2205521609

From sherman at openjdk.org Mon Jul 14 20:13:06 2025
From: sherman at openjdk.org (Xueming Shen)
Date: Mon, 14 Jul 2025 20:13:06 GMT
Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v3]
In-Reply-To:
References:
Message-ID:

> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters.
>
> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding.
See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true > > vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) > > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"'. 
=> **_true_**
>
> The root issue is that the ran...

Xueming Shen has updated the pull request incrementally with one additional commit since the last revision:

  update to address the review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26285/files
  - new: https://git.openjdk.org/jdk/pull/26285/files/735bd722..e18d2668

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=01-02

  Stats: 11 lines in 3 files changed: 0 ins; 4 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/26285.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285

PR: https://git.openjdk.org/jdk/pull/26285

From duke at openjdk.org Mon Jul 14 20:22:50 2025
From: duke at openjdk.org (mezz)
Date: Mon, 14 Jul 2025 20:22:50 GMT
Subject: RFR: 8359053: Implement JEP 504 - Remove the Applet API [v10]
In-Reply-To: <5hv53zmZA9kKblG14CHqjUjg1tXK_y6RePXkaRKbXAI=.0dc6287e-1af8-402f-862f-cc2da1075909@github.com>
References: <5hv53zmZA9kKblG14CHqjUjg1tXK_y6RePXkaRKbXAI=.0dc6287e-1af8-402f-862f-cc2da1075909@github.com>
Message-ID:

On Mon, 23 Jun 2025 19:42:10 GMT, Alexey Ivanov wrote:

>> Phil Race has updated the pull request incrementally with one additional commit since the last revision:
>>
>> 8359053
>
> src/java.desktop/share/classes/java/awt/doc-files/Modality.html line 352:
>
>> 350: Dialog(owner, true), etc. Prior to JDK 6
>> 351: the default type was toolkit-modal,
>> 352: and now with single application per-VM there is no
>
> Why is it "per-VM" instead of "per VM"?
>
> "Single application per VM", in this sentence "per" is a preposition and "VM" is a noun, don't you agree? There should be a space, not a hyphen.
>
> If it were "per-VM application", then it would be spelt with a hyphen.

Right, "per VM" means "for each VM".
This is a minor mistake I've made many times.

Suggestion:

    and now with single application per VM, there is no

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25698#discussion_r2197900689

From prr at openjdk.org Mon Jul 14 20:26:54 2025
From: prr at openjdk.org (Phil Race)
Date: Mon, 14 Jul 2025 20:26:54 GMT
Subject: Integrated: 8359053: Implement JEP 504 - Remove the Applet API
In-Reply-To:
References:
Message-ID:

On Mon, 9 Jun 2025 18:11:13 GMT, Phil Race wrote:

> This is the implementation of JEP 504 - Remove the Applet API.
> API changes are
> - Remove the entire java.applet package
> - Remove the javax/swing/JApplet class
> - Remove applet related APIs in java.beans
> - Update javadoc referring to applets, including one gif image - now changed to an svg image
>
> Other changes are
> - Remove references to the removed classes
> - Remove obsolete tests
> - Update obsolete code comments
>
> sun.awt.AppContext is even more obsolete now than it was before, but eliminating uses of it is not required,
> and will be follow-on internal clean up, at a later date, under unrelated bug ids, and likely not completed in the same
> release as this JEP is integrated.
>
> I have extensively tested this - running all the automated tests used by CI tiers 1 to 8.

This pull request has now been integrated.
Changeset: 5cf672e7
Author: Phil Race
URL: https://git.openjdk.org/jdk/commit/5cf672e7784b9a9a82f29977a072b162cc240fd1
Stats: 3261 lines in 86 files changed: 124 ins; 2966 del; 171 mod

8359053: Implement JEP 504 - Remove the Applet API

Reviewed-by: aivanov, kizune, kcr, achung, serb

-------------

PR: https://git.openjdk.org/jdk/pull/25698

From sherman at openjdk.org Mon Jul 14 20:40:39 2025
From: sherman at openjdk.org (Xueming Shen)
Date: Mon, 14 Jul 2025 20:40:39 GMT
Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v2]
In-Reply-To:
References: <9h1T1edYRoTT3v5CPkY9DN9Lq0bSnDoU8VtK2xn4sIA=.9b57eb79-b59e-4b59-a2ed-94b68735c04f@github.com>
Message-ID:

On Mon, 14 Jul 2025 18:10:53 GMT, Naoto Sato wrote:

> Looks good. Thanks for adding case folding support, which is long overdue. Since this adds new case folding support for character class ranges, I think a CSR and a release note should be considered.

Thanks for the review.

Arguably, the change I made years ago to support Level 1 + RL2.1/2 already implies that character class ranges should conform to RL1.5, just like the other constructs (back-ref, slice, single and property). So it might be reasonable to categorize this as "just" a pure bug fix. That said, it is a behavioral change, and I'm happy to go through the CSR and release note process if strongly preferred.

My initial thought was to defer the CSR until we fully switch to a case-folding-mapping-based implementation (replacing the current toUpperCase/toLowerCase logic), at which point we could also update the javadoc to explicitly document the behavior of each construct, as RL1.5 recommends. But if we prefer to align all of that now with this fix, I'm fine doing it together.
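Editorial sketch of why a toUpperCase/toLowerCase round trip only approximates simple case folding. The helper below composes the two `Character` methods in one particular order; that composition is an assumption chosen for illustration, not a quote of the `Pattern` internals. The class name and sample code points are likewise illustrative.

```java
public class CaseRoundTrip {
    // Maps a code point through toUpperCase and then toLowerCase.
    public static int viaCaseConversion(int cp) {
        return Character.toLowerCase(Character.toUpperCase(cp));
    }

    public static void main(String[] args) {
        int[] samples = {
            'S',     // LATIN CAPITAL LETTER S
            0x017F,  // LATIN SMALL LETTER LONG S
            0x212A,  // KELVIN SIGN
            0x00DF,  // LATIN SMALL LETTER SHARP S
            0x1E9E   // LATIN CAPITAL LETTER SHARP S
        };
        for (int cp : samples) {
            // Characters that land on the same value are treated as equal
            // by this approximation.
            System.out.printf("U+%04X -> U+%04X%n", cp, viaCaseConversion(cp));
        }
    }
}
```

The long s and the Kelvin sign collapse onto `s` and `k` because their uppercase or lowercase mappings lead there. U+00DF has no single-character uppercase mapping, so `Character.toUpperCase` returns it unchanged, which is the situation noted earlier in the thread.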
------------- PR Comment: https://git.openjdk.org/jdk/pull/26285#issuecomment-3070905666 From naoto at openjdk.org Mon Jul 14 23:07:38 2025 From: naoto at openjdk.org (Naoto Sato) Date: Mon, 14 Jul 2025 23:07:38 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v3] In-Reply-To: References: Message-ID: <8xXhxlxCAZZxhZ4fzXjOY797duMUpixmRB6mtS_pPUg=.c848f4b9-59e6-4f5e-a6c7-4254b2ee253c@github.com> On Mon, 14 Jul 2025 20:13:06 GMT, Xueming Shen wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? >> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. 
This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false >> vs >> Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true >> >> vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) >> >> perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false >> perl -C -e 'print "S" =~ /[\x{017f}-\x{0... > > Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: > > update to address the review comments Changes look good to me. As to the CSR, it seems ok without it if this is a pure bug fix. ------------- Marked as reviewed by naoto (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/26285#pullrequestreview-3018037051 From sherman at openjdk.org Tue Jul 15 00:32:22 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 00:32:22 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v4] In-Reply-To: References: Message-ID: > Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. 
> > This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). 
=> true > > vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) > > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"'. => **_true_** > > The root issue is that the ran... Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: update and add more test cases, and fix a test failure ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26285/files - new: https://git.openjdk.org/jdk/pull/26285/files/e18d2668..c2afc42c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=02-03 Stats: 26 lines in 2 files changed: 20 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/26285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285 PR: https://git.openjdk.org/jdk/pull/26285 From sherman at openjdk.org Tue Jul 15 15:11:07 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 15:11:07 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v5] In-Reply-To: References: Message-ID: > Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. > > This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. 
Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true > > vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) > > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"'. => **_true_** > > The root issue is that the ran... 
Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: improve the lookup logic and test case for +00df ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26285/files - new: https://git.openjdk.org/jdk/pull/26285/files/c2afc42c..b85f581f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=03-04 Stats: 44 lines in 3 files changed: 31 ins; 3 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/26285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285 PR: https://git.openjdk.org/jdk/pull/26285 From sherman at openjdk.org Tue Jul 15 15:18:54 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 15:18:54 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v5] In-Reply-To: <6AFn-UzXqb_oY_XGpadyiepteLDrlHJsSqZfXrybPug=.c8454bbe-15df-463a-8e6c-53022ece337b@github.com> References: <6AFn-UzXqb_oY_XGpadyiepteLDrlHJsSqZfXrybPug=.c8454bbe-15df-463a-8e6c-53022ece337b@github.com> Message-ID: On Mon, 14 Jul 2025 07:28:09 GMT, Xueming Shen wrote: >> src/java.base/share/classes/jdk/internal/util/regex/CaseFolding.java.template line 99: >> >>> 97: */ >>> 98: public static int[] getClassRangeClosingCharacters(int start, int end) { >>> 99: int[] expanded = new int[expanded_casefolding.size()]; >> >> Can be `Math.min(expanded_casefolding.size(), end - start)` in case the table grows large, and update the `off < expanded.length` check below too. > > The table itself probably isn't going to grow significantly anytime soon, and we'll likely have enough time to adjust if CaseFolding.txt does get substantially bigger.
> > That said, I probably should consider reversing the lookup logic: instead of iterating through [start, end], we could iterate over the expansion table and check whether any of its code points fall within the input range, at least when the range size is larger than the size of the table, kinda O(n) vs O(1)-ish. updated the lookup logic as discussed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2207809731 From naoto at openjdk.org Tue Jul 15 16:37:43 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 15 Jul 2025 16:37:43 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v5] In-Reply-To: References: Message-ID: <25jigXcio1o-HB3vf2ZMDBysoPU9whhSOgKZWnMrmd0=.575c1708-5960-441c-a109-1bceb88568b1@github.com> On Tue, 15 Jul 2025 15:11:07 GMT, Xueming Shen wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? >> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. 
character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false >> vs >> Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true >> >> vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) >> >> perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false >> perl -C -e 'print "S" =~ /[\x{017f}-\x{0... > > Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: > > improve the lookup logic and test case for +00df Updates look good to me. test/jdk/java/util/regex/CaseFoldingTest.java line 51: > 49: var excluded = Set.of( > 50: // these 'S' characters failed for known reason. they don't map to their > 51: // fording form with toUpperCase or toLowerCase, only map with case-folding. 
nit: fording -> folding ------------- PR Review: https://git.openjdk.org/jdk/pull/26285#pullrequestreview-3021193767 PR Review Comment: https://git.openjdk.org/jdk/pull/26285#discussion_r2207985998 From sherman at openjdk.org Tue Jul 15 16:56:44 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 16:56:44 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v5] In-Reply-To: References: Message-ID: On Tue, 15 Jul 2025 15:11:07 GMT, Xueming Shen wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? >> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. 
This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false >> vs >> Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true >> >> vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) >> >> perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false >> perl -C -e 'print "S" =~ /[\x{017f}-\x{0... > > Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: > > improve the lookup logic and test case for +00df Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26285#issuecomment-3074413884 From sherman at openjdk.org Tue Jul 15 17:47:29 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 17:47:29 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v6] In-Reply-To: References: Message-ID: <5E6oBo3DXqhtFDuwQJTinQxlb0J14QjaXxbKvj5JK0Q=.64180f86-6b61-4383-8f5a-dfb71d1cbd8d@github.com> > Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. 
> > This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be "**_closed_**." > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches().
=> true > > vs Perl. (Perl also claims to support Unicode's loose matching with its "i" modifier) > > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"'. => **_true_** > > The root issue is that the ran... Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: update to fix the typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26285/files - new: https://git.openjdk.org/jdk/pull/26285/files/b85f581f..a090888f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26285&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26285/head:pull/26285 PR: https://git.openjdk.org/jdk/pull/26285 From naoto at openjdk.org Tue Jul 15 17:47:30 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 15 Jul 2025 17:47:30 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v6] In-Reply-To: <5E6oBo3DXqhtFDuwQJTinQxlb0J14QjaXxbKvj5JK0Q=.64180f86-6b61-4383-8f5a-dfb71d1cbd8d@github.com> References: <5E6oBo3DXqhtFDuwQJTinQxlb0J14QjaXxbKvj5JK0Q=.64180f86-6b61-4383-8f5a-dfb71d1cbd8d@github.com> Message-ID: On Tue, 15 Jul 2025 17:44:00 GMT, Xueming Shen wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding.
See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? >> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false >> vs >> Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true >> >> vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) >> >> perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false >> perl -C -e 'print "S" =~ /[\x{017f}-\x{0... 
> > Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: > > update to fix the typo Marked as reviewed by naoto (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26285#pullrequestreview-3021537588 From sherman at openjdk.org Tue Jul 15 17:59:47 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 17:59:47 GMT Subject: RFR: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char [v6] In-Reply-To: <5E6oBo3DXqhtFDuwQJTinQxlb0J14QjaXxbKvj5JK0Q=.64180f86-6b61-4383-8f5a-dfb71d1cbd8d@github.com> References: <5E6oBo3DXqhtFDuwQJTinQxlb0J14QjaXxbKvj5JK0Q=.64180f86-6b61-4383-8f5a-dfb71d1cbd8d@github.com> Message-ID: On Tue, 15 Jul 2025 17:47:29 GMT, Xueming Shen wrote: >> Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. >> >> This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? >> >> **RL1.5 states**: >> >> To meet this requirement, an implementation that supports case-sensitive matching should >> >> 1. Provide at least the simple, default Unicode case-insensitive matching, and >> 2. Specify which character properties or constructs are closed under the matching. >> >> **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: >> >> 1. back-refs >> 2. string slices (sequences) >> 3. single character, >> 4. character families (Unicode Properties ...), and >> 5. 
character class ranges >> >> **Note**: Single characters and families may appear independently or within a character class. >> >> For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. This effectively makes these constructs closed under case folding. >> >> This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. >> >> For example: >> >> Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true >> Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true >> >> The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). >> >> **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. >> >> Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false >> vs >> Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true >> >> vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) >> >> perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false >> perl -C -e 'print "S" =~ /[\x{017f}-\x{0... > > Xueming Shen has updated the pull request incrementally with one additional commit since the last revision: > > update to fix the typo Thanks for the reviews! 
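For anyone who wants to try the cases quoted above, they can be wrapped in a small standalone program. This is only a sketch: the class name `LongSDemo` and the helper `matches` are made up for illustration and are not part of the PR. The first two checks mirror the single-character and `[S-S]` cases that the description reports as already returning true. The third is the range case this PR addresses, so its result presumably depends on whether the JDK in use contains the fix; it is printed here without an expected value.

```java
import java.util.regex.Pattern;

public class LongSDemo {
    // U+017F LATIN SMALL LETTER LONG S; its simple uppercase mapping is 'S'.
    static final String LONG_S = "\u017f";

    // Hypothetical convenience wrapper around Pattern/Matcher.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Single character under (?ui): reported as matching in the PR text.
        System.out.println(matches("(?ui)" + LONG_S, "S"));
        // ASCII range against the long s: reported as matching in the PR text.
        System.out.println(matches("(?ui)[S-S]", LONG_S));
        // Non-ASCII range against 'S': the case under discussion.
        // Result depends on the JDK build, so no expected value is given.
        System.out.println(matches("(?ui)[" + LONG_S + "-" + LONG_S + "]", "S"));
    }
}
```

Comparing the third line of output on a build before and after changeset 401af27b is one way to see the effect of the change.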
------------- PR Comment: https://git.openjdk.org/jdk/pull/26285#issuecomment-3074710596 From sherman at openjdk.org Tue Jul 15 17:59:50 2025 From: sherman at openjdk.org (Xueming Shen) Date: Tue, 15 Jul 2025 17:59:50 GMT Subject: Integrated: 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char In-Reply-To: References: Message-ID: <14NDWi0xfcZsPdfzJDeupxOHKnKJVUJ6tl4jphrJT_s=.29adb663-fc33-4b19-91c1-a264ad2288c5@github.com> On Mon, 14 Jul 2025 04:53:13 GMT, Xueming Shen wrote: > Regex class should conform to **_Level 1_** of [Unicode Technical Standard #18: Unicode Regular Expressions](http://www.unicode.org/reports/tr18/), plus RL2.1 Canonical Equivalents and RL2.2 Extended Grapheme Clusters. > > This PR primarily addresses conformance with RL1.5: Simple Loose Matches, which requires that simple case folding be applied to literals and (optionally) to character classes. When applied to character classes, each class is expected to be closed under simple case folding. See the standard for a detailed explanation of what it means for a class to be ?**_closed_**.? > > **RL1.5 states**: > > To meet this requirement, an implementation that supports case-sensitive matching should > > 1. Provide at least the simple, default Unicode case-insensitive matching, and > 2. Specify which character properties or constructs are closed under the matching. > > **In the Pattern implementation**, 5 types of constructs may be affected by case sensitivity: > > 1. back-refs > 2. string slices (sequences) > 3. single character, > 4. character families (Unicode Properties ...), and > 5. character class ranges > > **Note**: Single characters and families may appear independently or within a character class. > > For case-insensitive (loose) matching, the implementation already applies Character.toUpperCase() and Character.toLowerCase() to **both the pattern and the input string** for back-refs, slices, and single characters. 
This effectively makes these constructs closed under case folding. > > This has been verified in the newly added test case **_test/jdk/java/util/regex/CaseFoldingTest.java_**. > > For example: > > Pattern.compile("(?ui)\u017f").matcher("S").matches(). => true > Pattern.compile("(?ui)[\u017f]").matcher("S").matches() => true > > The character properties (families) are not "closed" and should remain unchanged. This is acceptable per RL1.5, if the behavior is clearly specified (TBD: update javadoc to reflect this). > > **Current Non-Conformance: Character Class Ranges**, as reported in the original bug report. > > Pattern.compile("(?ui)[\u017f-\u017f]").matcher("S").matches() => false > vs > Pattern.compile("(?ui)[S-S]").matcher("\u017f").matches(). => true > > vs Perl. (Perl also claims to support the Unicode's loose match with it it's "i" modifier) > > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/ ? "true\n" : "false\n"'. => false > perl -C -e 'print "S" =~ /[\x{017f}-\x{017f}]/**_i_** ? "true\n" : "false\n"'. => **_true_** > > The root issue is that the ran... This pull request has now been integrated. Changeset: 401af27b Author: Xueming Shen URL: https://git.openjdk.org/jdk/commit/401af27b9dbc701eb48e5bc685d3ad058e0de3bc Stats: 2084 lines in 9 files changed: 2079 ins; 0 del; 5 mod 8360459: UNICODE_CASE and character class with non-ASCII range does not match ASCII char Reviewed-by: naoto ------------- PR: https://git.openjdk.org/jdk/pull/26285 From liach at openjdk.org Tue Jul 15 23:55:41 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 15 Jul 2025 23:55:41 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. 
It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments Looks reasonable from a code cleanup point of view. ------------- Marked as reviewed by liach (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/25644#pullrequestreview-3022646136 From rgiulietti at openjdk.org Thu Jul 17 10:10:05 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Thu, 17 Jul 2025 10:10:05 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat Message-ID: Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. ------------- Commit messages: - Merge branch 'master' into 8362448 - 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat - Remove some unused methods from FloatingDecimal. - Renamed compatibility config option. - Merge branch 'master' into dtoa - Remove unused methods. - Merge branch 'master' into dtoa - Added logic for exactness and direction of rounding. 
Changes: https://git.openjdk.org/jdk/pull/26364/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8362448 Stats: 98 lines in 6 files changed: 47 ins; 7 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/26364.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26364/head:pull/26364 PR: https://git.openjdk.org/jdk/pull/26364 From rgiulietti at openjdk.org Thu Jul 17 12:28:06 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Thu, 17 Jul 2025 12:28:06 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: > Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: Added comment to COMPAT static field. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26364/files - new: https://git.openjdk.org/jdk/pull/26364/files/21704193..0e16f050 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=00-01 Stats: 14 lines in 1 file changed: 14 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26364.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26364/head:pull/26364 PR: https://git.openjdk.org/jdk/pull/26364 From rriggs at openjdk.org Thu Jul 17 13:43:49 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Thu, 17 Jul 2025 13:43:49 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jul 2025 12:28:06 GMT, Raffaello Giulietti wrote: >> Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. 
> > Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to COMPAT static field. src/java.base/share/classes/jdk/internal/math/FormattedFPDecimal.java line 51: > 49: > 50: private boolean exact; // this decimal is an exact fp > 51: private boolean away; // this decimal has a larger magnitude than fp Drive by comment. The name "away" doesn't convey enough information nor the implications. A longer comment somewhere might be in order. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2213385118 From rgiulietti at openjdk.org Thu Jul 17 13:50:50 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Thu, 17 Jul 2025 13:50:50 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jul 2025 13:41:06 GMT, Roger Riggs wrote: >> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment to COMPAT static field. > > src/java.base/share/classes/jdk/internal/math/FormattedFPDecimal.java line 51: > >> 49: >> 50: private boolean exact; // this decimal is an exact fp >> 51: private boolean away; // this decimal has a larger magnitude than fp > > Drive by comment. > The name "away" doesn't convey enough information nor the implications. A longer comment somewhere might be in order. This is IEEE 754 jargon to say "away from zero", so towards the same signed infinity. I'll make the comment clearer in the next commit. 
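To make the two flags concrete, here is a small illustration of what "exact" and "away" mean for a decimal rendering of a `double`. It is a sketch only and does not reproduce `FormattedFPDecimal`: the class `RoundingDirectionDemo` and the method `direction` are hypothetical names, and the comparison uses `Double.toString` plus `BigDecimal` purely to show the idea for finite values.

```java
import java.math.BigDecimal;

public class RoundingDirectionDemo {
    // Hypothetical helper, not JDK code. Compares the magnitude of the
    // decimal produced by Double.toString(d) with the exact binary value of d.
    // Returns 0 if the decimal is exact, a positive value if the decimal has
    // a larger magnitude than d ("away" from zero), and a negative value if
    // it has a smaller magnitude. Only meaningful for finite d.
    static int direction(double d) {
        BigDecimal rendered = new BigDecimal(Double.toString(d)).abs();
        BigDecimal exactValue = new BigDecimal(d).abs();
        return rendered.compareTo(exactValue);
    }

    public static void main(String[] args) {
        for (double d : new double[] {0.5, 0.1, 0.3, -0.3}) {
            int c = direction(d);
            String kind = c == 0 ? "exact" : c > 0 ? "away from zero" : "toward zero";
            System.out.println(d + " -> " + kind);
        }
    }
}
```

Here 0.5 is exactly representable, the double nearest 0.1 is slightly above 0.1 (so the rendering "0.1" is smaller in magnitude), and the double nearest 0.3 is slightly below 0.3 (so "0.3" lies away from zero, for either sign).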
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2213403030 From jlu at openjdk.org Thu Jul 17 18:09:48 2025 From: jlu at openjdk.org (Justin Lu) Date: Thu, 17 Jul 2025 18:09:48 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: <5h2vyeFRO_t6-cJOEC7LH3hy_VMz39QxXgj1Afaog7E=.34bfea42-3dbe-4881-9e0c-8bda32bbd15e@github.com> On Thu, 17 Jul 2025 12:28:06 GMT, Raffaello Giulietti wrote: >> Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. > > Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to COMPAT static field. Thanks for working on this Raffaello, the `DecimalFormat` and `DigitList` changes look good to me. Also, the JBS issue needs a `noreg-.*` tag. src/java.base/share/classes/java/text/DigitList.java line 292: > 290: > 291: /* > 292: * This compatibility option will only be available for a *very* limited I suppose the number of releases is dependent on if we run into any issues with this change. I'm wondering when is a good time to revisit this for removal. (I guess a few releases means maybe before the next LTS.) src/java.base/share/classes/jdk/internal/math/FloatingDecimal.java line 1769: > 1767: * fields (> 550 lines). > 1768: */ > 1769: private static BinaryToASCIIConverter getCompatBinaryToASCIIConverter(double d, boolean isCompatibleFormat) { Can we just remove the `isCompatibleFormat` param and pass true to `buf.dtoa` which should always be true as indicated by the method name and its usage? ------------- Marked as reviewed by jlu (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/26364#pullrequestreview-3030364252 PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2213876548 PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2213961792 From naoto at openjdk.org Thu Jul 17 18:28:51 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 17 Jul 2025 18:28:51 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jul 2025 12:28:06 GMT, Raffaello Giulietti wrote: >> Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. > > Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to COMPAT static field. Good to see this enhancement, Raffaello. Are you planning to provide some test cases for the change, confirming the implementation switches between old/new depending on the system property? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26364#issuecomment-3085028742 From rgiulietti at openjdk.org Thu Jul 17 19:08:48 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Thu, 17 Jul 2025 19:08:48 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: <5h2vyeFRO_t6-cJOEC7LH3hy_VMz39QxXgj1Afaog7E=.34bfea42-3dbe-4881-9e0c-8bda32bbd15e@github.com> References: <5h2vyeFRO_t6-cJOEC7LH3hy_VMz39QxXgj1Afaog7E=.34bfea42-3dbe-4881-9e0c-8bda32bbd15e@github.com> Message-ID: On Thu, 17 Jul 2025 17:58:51 GMT, Justin Lu wrote: >> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment to COMPAT static field. > > src/java.base/share/classes/jdk/internal/math/FloatingDecimal.java line 1769: > >> 1767: * fields (> 550 lines). 
>> 1768: */ >> 1769: private static BinaryToASCIIConverter getCompatBinaryToASCIIConverter(double d, boolean isCompatibleFormat) { > > Can we just remove the `isCompatibleFormat` param and pass true to `buf.dtoa` which should always be true as indicated by the method name and its usage? There are many more changes that I've planned in `FloatingDecimal`, including this one and the removal of unused methods. For this PR, however, I just wanted to keep the changes in `FloatingDecimal` minimal, focused on the replacement with the new algorithm. But I have no problems with your proposal, if you insist. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2214084343 From rgiulietti at openjdk.org Thu Jul 17 19:23:52 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Thu, 17 Jul 2025 19:23:52 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jul 2025 18:26:14 GMT, Naoto Sato wrote: >> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment to COMPAT static field. > > Good to see this enhancement, Raffaello. Are you planning to provide some test cases for the change, confirming the implementation switches between old/new depending on the system property? @naotoj Do you mean adding a test with values known to have slightly different outcomes? Sure, will do. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/26364#issuecomment-3085207988 From naoto at openjdk.org Thu Jul 17 19:39:48 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 17 Jul 2025 19:39:48 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: Message-ID: On Thu, 17 Jul 2025 18:26:14 GMT, Naoto Sato wrote: >> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: >> >> Added comment to COMPAT static field. > > Good to see this enhancement, Raffaello. Are you planning to provide some test cases for the change, confirming the implementation switches between old/new depending on the system property? > @naotoj Do you mean adding a test with values known to have slightly different outcomes? Sure, will do. Yes, as in the JBS issue ------------- PR Comment: https://git.openjdk.org/jdk/pull/26364#issuecomment-3085253076 From jlu at openjdk.org Thu Jul 17 20:14:47 2025 From: jlu at openjdk.org (Justin Lu) Date: Thu, 17 Jul 2025 20:14:47 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v2] In-Reply-To: References: <5h2vyeFRO_t6-cJOEC7LH3hy_VMz39QxXgj1Afaog7E=.34bfea42-3dbe-4881-9e0c-8bda32bbd15e@github.com> Message-ID: On Thu, 17 Jul 2025 19:06:25 GMT, Raffaello Giulietti wrote: >> src/java.base/share/classes/jdk/internal/math/FloatingDecimal.java line 1769: >> >>> 1767: * fields (> 550 lines). >>> 1768: */ >>> 1769: private static BinaryToASCIIConverter getCompatBinaryToASCIIConverter(double d, boolean isCompatibleFormat) { >> >> Can we just remove the `isCompatibleFormat` param and pass true to `buf.dtoa` which should always be true as indicated by the method name and its usage? > > There are many more changes that I've planned in `FloatingDecimal`, including this one and the removal of unused methods. 
> For this PR, however, I just wanted to keep the changes in `FloatingDecimal` minimal, focused on the replacement with the new algorithm. > > But I have no problems with your proposal, if you insist. Understood, in that case the current form is fine with me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2214203114 From rgiulietti at openjdk.org Fri Jul 18 19:35:58 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Fri, 18 Jul 2025 19:35:58 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v3] In-Reply-To: References: Message-ID: > Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: Added tests. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26364/files - new: https://git.openjdk.org/jdk/pull/26364/files/0e16f050..d4f49c12 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=01-02 Stats: 119 lines in 2 files changed: 117 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/26364.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26364/head:pull/26364 PR: https://git.openjdk.org/jdk/pull/26364 From rgiulietti at openjdk.org Fri Jul 18 19:58:30 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Fri, 18 Jul 2025 19:58:30 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v4] In-Reply-To: References: Message-ID: <9xBPVwNt2eeaWZI4Ad4y81yIXdYcy4JUhWLYMyAWzec=.acef0109-8fa8-4d2b-863d-fb37702bcbf3@github.com> > Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: Removed temporary comment from tests. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/26364/files - new: https://git.openjdk.org/jdk/pull/26364/files/d4f49c12..8a14ef2e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/26364.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26364/head:pull/26364 PR: https://git.openjdk.org/jdk/pull/26364 From naoto at openjdk.org Fri Jul 18 20:42:49 2025 From: naoto at openjdk.org (Naoto Sato) Date: Fri, 18 Jul 2025 20:42:49 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v4] In-Reply-To: <9xBPVwNt2eeaWZI4Ad4y81yIXdYcy4JUhWLYMyAWzec=.acef0109-8fa8-4d2b-863d-fb37702bcbf3@github.com> References: <9xBPVwNt2eeaWZI4Ad4y81yIXdYcy4JUhWLYMyAWzec=.acef0109-8fa8-4d2b-863d-fb37702bcbf3@github.com> Message-ID: <2lFCS0yJlMuFkLvbJtgTsQyMs_1fybkK4ZFPFtpTTB8=.d2cdf3d6-6d82-4618-854d-39ec925d9001@github.com> On Fri, 18 Jul 2025 19:58:30 GMT, Raffaello Giulietti wrote: >> Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. > > Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: > > Removed temporary comment from tests. Looks good to me. Some minor comments for the test follow. test/jdk/java/text/Format/DecimalFormat/DoubleFormattingTest.java line 28: > 26: * @bug 8362448 > 27: * @summary Verify DecimalFormat::format on doubles. 
> 28: * @run junit/othervm DoubleFormattingTest `othervm` can be removed test/jdk/java/text/Format/DecimalFormat/DoubleFormattingTest.java line 45: > 43: private static final boolean COMPAT = Boolean.getBoolean("jdk.compat.DecimalFormat"); > 44: > 45: @Test Using `@ParameterizedTest` would eliminate the duplication of code ------------- PR Review: https://git.openjdk.org/jdk/pull/26364#pullrequestreview-3034782721 PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2216886702 PR Review Comment: https://git.openjdk.org/jdk/pull/26364#discussion_r2216886783 From rgiulietti at openjdk.org Fri Jul 18 21:12:41 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Fri, 18 Jul 2025 21:12:41 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v5] In-Reply-To: References: Message-ID: > Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: Refactoring to paramaterized tests. 
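
Editorial sketch, not the test from the PR: the snippet below shows one way to look at `DecimalFormat` output next to `Double.toString` while reading the `jdk.compat.DecimalFormat` property quoted in the review above. The class name `DoubleFormattingSketch`, the pattern, and the sample values are assumptions chosen for illustration. The values known to differ between the old and new algorithms are listed in the JBS issue and are not reproduced here, so no expected outputs are claimed for them.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class DoubleFormattingSketch {
    // Same property lookup as the line quoted from the test in this thread.
    static final boolean COMPAT = Boolean.getBoolean("jdk.compat.DecimalFormat");

    // Formats d in scientific notation with one integer digit and
    // up to 16 fraction digits, using locale-neutral symbols.
    static String format(double d) {
        DecimalFormat df = new DecimalFormat("0.################E0",
                DecimalFormatSymbols.getInstance(Locale.ROOT));
        return df.format(d);
    }

    public static void main(String[] args) {
        // Hypothetical sample values; replace with the ones from the JBS issue.
        double[] samples = {1.5, 0.1, 2.0E23, 1.0E-5};
        for (double d : samples) {
            System.out.println("compat=" + COMPAT
                    + "  Double.toString=" + Double.toString(d)
                    + "  DecimalFormat=" + format(d));
        }
    }
}
```

Running it once with and once without `-Djdk.compat.DecimalFormat=true` should make any difference between the two code paths visible for the chosen inputs.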
------------- Changes: - all: https://git.openjdk.org/jdk/pull/26364/files - new: https://git.openjdk.org/jdk/pull/26364/files/8a14ef2e..9cafedd1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26364&range=03-04 Stats: 56 lines in 1 file changed: 3 ins; 39 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/26364.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26364/head:pull/26364 PR: https://git.openjdk.org/jdk/pull/26364 From naoto at openjdk.org Fri Jul 18 21:31:44 2025 From: naoto at openjdk.org (Naoto Sato) Date: Fri, 18 Jul 2025 21:31:44 GMT Subject: RFR: 8362448: Make use of the Double.toString(double) algorithm in java.text.DecimalFormat [v5] In-Reply-To: References: Message-ID: <8G6nlcFszmDeiGfZ3wyLr4uN1l6bcpKbdKBPawQG8kM=.79383c87-a262-4ef5-8e01-6f65b92bd77c@github.com> On Fri, 18 Jul 2025 21:12:41 GMT, Raffaello Giulietti wrote: >> Align the behavior of `DecimalFormat` on `double`s with that of `Formatter`. > > Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring to paramaterized tests. Marked as reviewed by naoto (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26364#pullrequestreview-3034862374 From jlu at openjdk.org Mon Jul 21 21:16:39 2025 From: jlu at openjdk.org (Justin Lu) Date: Mon, 21 Jul 2025 21:16:39 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: <7vuRjTHcLazcb58kpd2QUzqm5VZltpYSLpg2sLz_yL4=.3ac60ea1-3531-4ac1-827b-768291b4c904@github.com> On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. 
It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments The current form looks good to me. The long parsing did not change much on my machine performance wise, but I think it is a good simplification to include. ------------- Marked as reviewed by jlu (Committer). PR Review: https://git.openjdk.org/jdk/pull/25644#pullrequestreview-3039876202 From naoto at openjdk.org Mon Jul 21 23:02:19 2025 From: naoto at openjdk.org (Naoto Sato) Date: Mon, 21 Jul 2025 23:02:19 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property Message-ID: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. 
A corresponding CSR has been drafted as well ------------- Commit messages: - initial commit Changes: https://git.openjdk.org/jdk/pull/26419/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8355522 Stats: 93 lines in 6 files changed: 1 ins; 61 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/26419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26419/head:pull/26419 PR: https://git.openjdk.org/jdk/pull/26419 From jlu at openjdk.org Tue Jul 22 21:31:54 2025 From: jlu at openjdk.org (Justin Lu) Date: Tue, 22 Jul 2025 21:31:54 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property In-Reply-To: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Mon, 21 Jul 2025 22:56:50 GMT, Naoto Sato wrote: > This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. A corresponding CSR has been drafted as well Should we also remove the test method, `ModuleTestUtil.runModuleWithLegacyCode` which passes the now defunct property to the process. test/jdk/java/util/Locale/UseOldISOCodesTest.java line 56: > 54: public static void main(String[] args) { > 55: // Ensure java.locale.useOldISOCodes should have no effect > 56: System.setProperty("java.locale.useOldISOCodes", "false"); IMO, it seems weird to keep this line in the test, even if it has no effect. The original goal was to ensure the property only had impact when set during startup. The current test is no longer concerned with that (since the property no longer performs any mapping). 
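
Editorial sketch of the mapping that remains once the property is gone, assuming JDK 17 or later: the legacy codes `iw`, `ji`, and `in` and their modern forms `he`, `yi`, and `id` are expected to resolve to the same `Locale`. The class name `LegacyLanguageCodes` and the helper `normalized` are hypothetical, and `Locale.forLanguageTag` is used only because it is available on every recent release.

```java
import java.util.Locale;

final class LegacyLanguageCodes {
    // Returns the language code the JDK stores for the given tag.
    static String normalized(String tag) {
        return Locale.forLanguageTag(tag).getLanguage();
    }

    public static void main(String[] args) {
        String[][] pairs = {{"iw", "he"}, {"ji", "yi"}, {"in", "id"}};
        for (String[] pair : pairs) {
            // Both spellings are printed so the stored form can be compared.
            System.out.println(pair[0] + " -> " + normalized(pair[0])
                    + ", " + pair[1] + " -> " + normalized(pair[1]));
        }
    }
}
```

With the property removed, there is no longer a switch that changes which of the two forms is stored.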
------------- PR Review: https://git.openjdk.org/jdk/pull/26419#pullrequestreview-3044862073 PR Review Comment: https://git.openjdk.org/jdk/pull/26419#discussion_r2223817468 From naoto at openjdk.org Tue Jul 22 22:01:10 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 22 Jul 2025 22:01:10 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v2] In-Reply-To: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: > This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. A corresponding CSR has been drafted as well Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: Reflects review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26419/files - new: https://git.openjdk.org/jdk/pull/26419/files/85d8adfb..bcfd8a7e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26419/head:pull/26419 PR: https://git.openjdk.org/jdk/pull/26419 From naoto at openjdk.org Tue Jul 22 22:06:55 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 22 Jul 2025 22:06:55 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v2] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Tue, 22 Jul 2025 21:29:06 GMT, Justin Lu wrote: > Should we also remove the test method, `ModuleTestUtil.runModuleWithLegacyCode` which passes the now defunct property to 
the process. I thought about that, but decided to leave it as it is, just to make sure everything works as before. For the same reason, I did not remove the `-Djava.locale.useOldISOCodes=true` run in LocaleTest.java > test/jdk/java/util/Locale/UseOldISOCodesTest.java line 56: > >> 54: public static void main(String[] args) { >> 55: // Ensure java.locale.useOldISOCodes should have no effect >> 56: System.setProperty("java.locale.useOldISOCodes", "false"); > > IMO, it seems weird to keep this line in the test, even if it has no effect. The original goal was to ensure the property only had impact when set during startup. The current test is no longer concerned with that (since the property no longer performs any mapping). Right. I re-purposed the test but as you mentioned, the line is confusing. Removed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26419#issuecomment-3104958671 PR Review Comment: https://git.openjdk.org/jdk/pull/26419#discussion_r2223912809 From jlu at openjdk.org Tue Jul 22 22:11:55 2025 From: jlu at openjdk.org (Justin Lu) Date: Tue, 22 Jul 2025 22:11:55 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v2] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Tue, 22 Jul 2025 22:01:10 GMT, Naoto Sato wrote: >> This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. 
A corresponding CSR has been drafted as well > > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Reflects review comments src/java.base/share/classes/sun/util/locale/BaseLocale.java line 181: > 179: } > 180: > 181: public static String convertOldISOCodes(String language) { It was there before this change, but above on line 166 I think we should update the outdated comment, > // JDK uses deprecated ISO639.1 language codes for he, yi and id ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26419#discussion_r2223920817 From naoto at openjdk.org Tue Jul 22 22:25:33 2025 From: naoto at openjdk.org (Naoto Sato) Date: Tue, 22 Jul 2025 22:25:33 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v3] In-Reply-To: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: > This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. 
A corresponding CSR has been drafted as well Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: Obsolete comment correction ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26419/files - new: https://git.openjdk.org/jdk/pull/26419/files/bcfd8a7e..93f3a8b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/26419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26419/head:pull/26419 PR: https://git.openjdk.org/jdk/pull/26419 From jlu at openjdk.org Tue Jul 22 22:25:33 2025 From: jlu at openjdk.org (Justin Lu) Date: Tue, 22 Jul 2025 22:25:33 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v3] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Tue, 22 Jul 2025 22:22:40 GMT, Naoto Sato wrote: >> This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. A corresponding CSR has been drafted as well > > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Obsolete comment correction LGTM ------------- Marked as reviewed by jlu (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/26419#pullrequestreview-3045036200 From joehw at openjdk.org Wed Jul 23 16:23:00 2025 From: joehw at openjdk.org (Joe Wang) Date: Wed, 23 Jul 2025 16:23:00 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v3] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Tue, 22 Jul 2025 22:25:33 GMT, Naoto Sato wrote: >> This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. A corresponding CSR has been drafted as well > > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Obsolete comment correction Marked as reviewed by joehw (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/26419#pullrequestreview-3048180520 From jlu at openjdk.org Wed Jul 23 16:50:56 2025 From: jlu at openjdk.org (Justin Lu) Date: Wed, 23 Jul 2025 16:50:56 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v3] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Tue, 22 Jul 2025 22:25:33 GMT, Naoto Sato wrote: >> This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored. A corresponding CSR has been drafted as well > > Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: > > Obsolete comment correction src/java.base/share/classes/java/util/Locale.java line 534: > 532: * their earlier, obsoleted forms: {@code he} maps to {@code iw}, > 533: * {@code yi} maps to {@code ji}, and {@code id} maps to > 534: * {@code in}. 
Since Java SE 17, this is no longer the case. Each * <p>Locale's constructors have always converted three language codes to * their earlier, obsoleted forms: {@code he} maps to {@code iw}, * {@code yi} maps to {@code ji}, and {@code id} maps to * {@code in}. Since Java SE 17, this is no longer the case. This history was relevant when the property existed. Since this is no longer the case, and we're quite a few releases away from 17, can we also remove this wording as well. Users on 26 should only be concerned with the "old" to "modern" mapping. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26419#discussion_r2226150632 From naoto at openjdk.org Wed Jul 23 17:39:13 2025 From: naoto at openjdk.org (Naoto Sato) Date: Wed, 23 Jul 2025 17:39:13 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v3] In-Reply-To: References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: On Wed, 23 Jul 2025 16:48:13 GMT, Justin Lu wrote: >> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: >> >> Obsolete comment correction > > src/java.base/share/classes/java/util/Locale.java line 534: > >> 532: * their earlier, obsoleted forms: {@code he} maps to {@code iw}, >> 533: * {@code yi} maps to {@code ji}, and {@code id} maps to >> 534: * {@code in}. Since Java SE 17, this is no longer the case. Each > > * <p>Locale's constructors have always converted three language codes to > * their earlier, obsoleted forms: {@code he} maps to {@code iw}, > * {@code yi} maps to {@code ji}, and {@code id} maps to > * {@code in}. Since Java SE 17, this is no longer the case. > > > This history was relevant when the property existed. Since this is no longer the case, and we're quite a few releases away from 17, can we also remove this wording as well. Users on 26 should only be concerned with the "old" to "modern" mapping. Good point. Modified the unnecessary history. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26419#discussion_r2226240004 From naoto at openjdk.org Wed Jul 23 17:39:12 2025 From: naoto at openjdk.org (Naoto Sato) Date: Wed, 23 Jul 2025 17:39:12 GMT Subject: RFR: 8355522: Remove the `java.locale.useOldISOCodes` system property [v4] In-Reply-To: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> References: <6Ca8zNdgZWlcivQkpZjjp3rBFWIdyYzQEyKLeXDloVc=.4a5f20b8-49cf-4cea-962f-5b8e99f7b0af@github.com> Message-ID: > This PR removes the system property deprecated in JDK 25. If the property is specified at runtime, a warning will be emitted at startup to inform the user that the value is ignored.
A corresponding CSR has been drafted as well Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: Further wording refinement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26419/files - new: https://git.openjdk.org/jdk/pull/26419/files/93f3a8b3..c68a63be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26419&range=02-03 Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/26419.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26419/head:pull/26419 PR: https://git.openjdk.org/jdk/pull/26419 From duke at openjdk.org Thu Jul 24 14:59:31 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Thu, 24 Jul 2025 14:59:31 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. Message-ID: Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { throw new Exception("exceeding length"); } Is a CSR required to this change? ------------- Commit messages: - Remove trailing spaces - 8364007: Add overload without arguments to codePointCount in String etc. 
Changes: https://git.openjdk.org/jdk/pull/26461/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26461&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8364007 Stats: 112 lines in 13 files changed: 95 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/26461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26461/head:pull/26461 PR: https://git.openjdk.org/jdk/pull/26461 From rriggs at openjdk.org Thu Jul 24 15:15:01 2025 From: rriggs at openjdk.org (Roger Riggs) Date: Thu, 24 Jul 2025 15:15:01 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. In-Reply-To: References: Message-ID: <_ZiQNuzt_mXMGd0GGi1XZhAjYFu0lc0ULkdlRdWQ-qE=.19c4501b-aa62-4b2f-b996-aed285ecc781@github.com> On Thu, 24 Jul 2025 14:50:07 GMT, Tatsunori Uchino wrote: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change? The recommended process for proposing new APIs is to put the proposal to the OpenJDK core-libs-dev mail alias. Putting the effort into a PR before there is some agreement on the value is premature. And yes, every change to the spec needs a CSR. To keep the proposal focused on the APIs, please drop the changes to modules other than java.base. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3113849427 PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3113856629 From liach at openjdk.org Thu Jul 24 16:24:55 2025 From: liach at openjdk.org (Chen Liang) Date: Thu, 24 Jul 2025 16:24:55 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. 
In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 14:50:07 GMT, Tatsunori Uchino wrote: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change? Also, do we need `codePointCount` on `CharSequence`? src/java.base/share/classes/java/lang/AbstractStringBuilder.java line 545: > 543: byte[] value = this.value; > 544: if (isLatin1(coder)) { > 545: return value.length; Suggestion: return count; ------------- PR Review: https://git.openjdk.org/jdk/pull/26461#pullrequestreview-3052346769 PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2228974436 From myankelevich at openjdk.org Thu Jul 24 21:53:54 2025 From: myankelevich at openjdk.org (Mikhail Yankelevich) Date: Thu, 24 Jul 2025 21:53:54 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 14:50:07 GMT, Tatsunori Uchino wrote: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change? Could you please add a bug number under `@bug`? test/jdk/java/lang/StringBuilder/Supplementary.java line 216: > 214: testAppendCodePoint(Character.MAX_CODE_POINT+1, IllegalArgumentException.class); > 215: } > 216: nit, as the other copyrights are updated: * Copyright (c) 2003, 2025, Oracle and/or its affiliates. All rights reserved. 
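
Editorial sketch for context: with the existing API, the code point length of a whole string is obtained by passing explicit bounds to the two-argument overload, which is the boilerplate the proposed zero-argument `codePointCount()` would remove. The class name `CodePointLength` and the helper `codePoints` are hypothetical.

```java
final class CodePointLength {
    // What callers write today: the two-argument overload over the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as a surrogate pair), 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());             // UTF-16 code units
        System.out.println(codePoints(s));          // code points
        System.out.println(s.codePoints().count()); // same count via the stream API
    }
}
```

The inconvenience the PR description points at shows up when the receiver is a long expression: it has to be stored in a local variable first, because `length()` must be called on the same value.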
------------- PR Review: https://git.openjdk.org/jdk/pull/26461#pullrequestreview-3053452383 PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2229652045 From myankelevich at openjdk.org Thu Jul 24 22:09:56 2025 From: myankelevich at openjdk.org (Mikhail Yankelevich) Date: Thu, 24 Jul 2025 22:09:56 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. In-Reply-To: References: Message-ID: <196EiVbN3eRqsKa6dSY6qlQuGcncN9gSihR5hTNuMVw=.ca7a4101-28b2-418f-870a-1cb25d00c0a0@github.com> On Thu, 24 Jul 2025 14:50:07 GMT, Tatsunori Uchino wrote: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change? src/java.base/share/classes/java/lang/Character.java line 9969: > 9967: int n = length; > 9968: for (int i = 0; i < length; ) { > 9969: if (isHighSurrogate(seq.charAt(i++)) && i < length && Imo this is quite hard to read, especially with `i++` inside of the if statement. What do you think about changing it to this? ```java for (int i = 0; i < length; i++) { if (isHighSurrogate(seq.charAt(i)) && i+1 < length && isLowSurrogate(seq.charAt(i+1))) { n--; i++; } } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2229676944 From duke at openjdk.org Sat Jul 26 07:30:54 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 07:30:54 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. 
In-Reply-To: <196EiVbN3eRqsKa6dSY6qlQuGcncN9gSihR5hTNuMVw=.ca7a4101-28b2-418f-870a-1cb25d00c0a0@github.com> References: <196EiVbN3eRqsKa6dSY6qlQuGcncN9gSihR5hTNuMVw=.ca7a4101-28b2-418f-870a-1cb25d00c0a0@github.com> Message-ID: On Thu, 24 Jul 2025 22:07:38 GMT, Mikhail Yankelevich wrote: >> Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. >> >> >> if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { >> throw new Exception("exceeding length"); >> } >> >> >> Is a CSR required to this change? > > src/java.base/share/classes/java/lang/Character.java line 9969: > >> 9967: int n = length; >> 9968: for (int i = 0; i < length; ) { >> 9969: if (isHighSurrogate(seq.charAt(i++)) && i < length && > > Imo this is quite hard to read, especially with `i++` inside of the if statement. What do you think about changing it to this? > ```java > for (int i = 1; i < length-1; i++) { > if (isHighSurrogate(seq.charAt(i)) && > isLowSurrogate(seq.charAt(i + 1))) { > n--; > i++; > } > } > ``` > > edit: fixed a typo in my example In the first place it yields an _incorrect_ result for sequences whose first character is a supplementary character. jshell> int len(CharSequence seq) { ...> final int length = seq.length(); ...> int n = length; ...> for (int i = 1; i < length-1; i++) { ...> if (isHighSurrogate(seq.charAt(i)) && ...> isLowSurrogate(seq.charAt(i + 1))) { ...> n--; ...> i++; ...> } ...> } ...> return n; ...> } | created method len(CharSequence), however, it cannot be invoked until method isHighSurrogate(char), and method isLowSurrogate(char) are declared jshell> boolean isHighSurrogate(char ch) { ...> return 0xd800 <= ch && ch <= 0xdbff; ...> } | created method isHighSurrogate(char) jshell> boolean isLowSurrogate(char ch) { ...> return 0xdc00 <= ch && ch <= 0xdfff; ...> } | created method isLowSurrogate(char) jshell> len("?"); $5 ==> 2 jshell> len("OK?"); $6 ==> 3 jshell> len("??"); $7 ==> 3
I will not change it alone unless the existing overload `int codePointCount(CharSequence seq, int beginIndex, int endIndex)` is also planned to be changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2232751973 From duke at openjdk.org Sat Jul 26 08:36:41 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 08:36:41 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v2] In-Reply-To: References: Message-ID: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change?
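
Editorial sketch: the loop variants discussed in the review can be compared side by side. `compact` follows the style quoted from `Character.java` (with `i++` inside the condition), `fromOne` is the edited suggestion that starts at index 1, and `fromZero` is a more readable variant that starts at index 0 with an explicit bounds check. The class and method names are hypothetical, and U+1F600 is an arbitrary supplementary character picked for the demonstration.

```java
final class SurrogateCounting {
    // Style quoted from Character.java: i++ inside the condition.
    static int compact(CharSequence seq) {
        final int length = seq.length();
        int n = length;
        for (int i = 0; i < length; ) {
            if (Character.isHighSurrogate(seq.charAt(i++)) && i < length
                    && Character.isLowSurrogate(seq.charAt(i))) {
                n--;
                i++;
            }
        }
        return n;
    }

    // Edited suggestion from the review: starts at 1, so it never
    // inspects a surrogate pair at the very start of the sequence.
    static int fromOne(CharSequence seq) {
        final int length = seq.length();
        int n = length;
        for (int i = 1; i < length - 1; i++) {
            if (Character.isHighSurrogate(seq.charAt(i))
                    && Character.isLowSurrogate(seq.charAt(i + 1))) {
                n--;
                i++;
            }
        }
        return n;
    }

    // Readable variant: starts at 0 and checks the bound before looking ahead.
    static int fromZero(CharSequence seq) {
        final int length = seq.length();
        int n = length;
        for (int i = 0; i < length; i++) {
            if (Character.isHighSurrogate(seq.charAt(i)) && i + 1 < length
                    && Character.isLowSurrogate(seq.charAt(i + 1))) {
                n--;
                i++; // skip the low surrogate that was just paired
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String[] samples = {"\uD83D\uDE00", "OK\uD83D\uDE00", "\uD83D\uDE00\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println(s.codePointCount(0, s.length()) + " " + compact(s)
                    + " " + fromZero(s) + " " + fromOne(s));
        }
    }
}
```

Only `fromOne` disagrees with `String.codePointCount` on inputs that begin with a supplementary character, which matches the objection raised in the reply.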
Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: - Discard changes out of than java.base - Fix copyright year Co-authored-by: Mikhail Yankelevich - Fix how to get code point count in StringBuilder Co-authored-By: Chen Liang - Fix test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26461/files - new: https://git.openjdk.org/jdk/pull/26461/files/6f2e1d2b..63eb4a7d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26461&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26461&range=00-01 Stats: 10 lines in 7 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/26461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26461/head:pull/26461 PR: https://git.openjdk.org/jdk/pull/26461 From duke at openjdk.org Sat Jul 26 08:36:41 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 08:36:41 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v2] In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 16:18:41 GMT, Chen Liang wrote: >> Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: >> >> - Discard changes out of than java.base >> - Fix copyright year >> >> Co-authored-by: Mikhail Yankelevich >> - Fix how to get code point count in StringBuilder >> >> Co-authored-By: Chen Liang >> - Fix test > > src/java.base/share/classes/java/lang/AbstractStringBuilder.java line 545: > >> 543: byte[] value = this.value; >> 544: if (isLatin1(coder)) { >> 545: return value.length; > > Suggestion: > > return count; I see, I fixed the argument passed to `StringUTF16.codePointCount` too. 
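The `return count;` suggestion above rests on the fact that a builder's backing array is normally larger than its content. The internal `value` and `count` fields are not accessible from user code, so this sketch uses the public `capacity()` and `length()` methods as rough stand-ins; that mapping is an assumption for illustration, and `BuilderLengthSketch` is a made-up name.

```java
// Sketch using public API only; capacity() and length() stand in for the
// internal value.length and count fields (an approximation, not the JDK code).
final class BuilderLengthSketch {

    // The existing way to count code points in a builder: bounded by length(), not capacity().
    static int codePoints(StringBuilder sb) {
        return sb.codePointCount(0, sb.length());
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println("capacity    = " + sb.capacity()); // size of the backing storage
        System.out.println("length      = " + sb.length());   // number of chars actually in use
        System.out.println("code points = " + codePoints(sb));
    }
}
```

Returning the storage size instead of the used length would overcount whenever the builder has spare room.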
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2232795947 From duke at openjdk.org Sat Jul 26 08:36:41 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 08:36:41 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v2] In-Reply-To: References: Message-ID: On Thu, 24 Jul 2025 21:49:16 GMT, Mikhail Yankelevich wrote: >> Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: >> >> - Discard changes out of than java.base >> - Fix copyright year >> >> Co-authored-by: Mikhail Yankelevich >> - Fix how to get code point count in StringBuilder >> >> Co-authored-By: Chen Liang >> - Fix test > > test/jdk/java/lang/StringBuilder/Supplementary.java line 216: > >> 214: testAppendCodePoint(Character.MAX_CODE_POINT+1, IllegalArgumentException.class); >> 215: } >> 216: > > nit, as the other copyrights are updated: > > * Copyright (c) 2003, 2025, Oracle and/or its affiliates. All rights reserved. My fault, I renewed it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26461#discussion_r2232795594 From duke at openjdk.org Sat Jul 26 08:50:55 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 08:50:55 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. In-Reply-To: <_ZiQNuzt_mXMGd0GGi1XZhAjYFu0lc0ULkdlRdWQ-qE=.19c4501b-aa62-4b2f-b996-aed285ecc781@github.com> References: <_ZiQNuzt_mXMGd0GGi1XZhAjYFu0lc0ULkdlRdWQ-qE=.19c4501b-aa62-4b2f-b996-aed285ecc781@github.com> Message-ID: On Thu, 24 Jul 2025 15:10:37 GMT, Roger Riggs wrote: > The recommended process for proposing new APIs is to put the proposal to the OpenJDK core-libs-dev mail alias. I glanced over https://mail.openjdk.org/pipermail/core-libs-dev/2025-July/thread.html and those for some past months, but I did not get how to send one. 
According to https://mail.openjdk.org/pipermail/core-libs-dev/2025-July/149338.html and sub messages, the content in this PR seems to be transferred to the mailing list. > Also, do we need codePointCount on CharSequence? I did not add it because it does not have an existing overload and has a simple (but not efficient) workaround (`codePoints().count()`), but it would be nice if it exists. > Could you please add a bug number under @bug? Which doc comments shall I add it? > And yes, every change to the spec needs a CSR. I got it, but do you know how non-Authors like me create ones? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3121518177 From duke at openjdk.org Sat Jul 26 10:10:40 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 10:10:40 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v3] In-Reply-To: References: Message-ID: > Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. > > > if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { > throw new Exception("exceeding length"); > } > > > Is a CSR required to this change? 
Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: - Update `@bug` in correct file - Add default implementation on codePointCount in CharSequence - Update `@bug` entries in test class doc comments - Discard changes on code whose form is not `str.codePointCount(0, str.length())` ------------- Changes: - all: https://git.openjdk.org/jdk/pull/26461/files - new: https://git.openjdk.org/jdk/pull/26461/files/63eb4a7d..0e55e35c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=26461&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26461&range=01-02 Stats: 32 lines in 5 files changed: 26 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/26461.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26461/head:pull/26461 PR: https://git.openjdk.org/jdk/pull/26461 From duke at openjdk.org Sat Jul 26 10:22:53 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sat, 26 Jul 2025 10:22:53 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v3] In-Reply-To: References: Message-ID: On Sat, 26 Jul 2025 10:10:40 GMT, Tatsunori Uchino wrote: >> Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. >> >> >> if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { >> throw new Exception("exceeding length"); >> } >> >> >> Is a CSR required to this change? 
> > Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: > > - Update `@bug` in correct file > - Add default implementation on codePointCount in CharSequence > - Update `@bug` entries in test class doc comments > - Discard changes on code whose form is not `str.codePointCount(0, str.length())` How and where can I add tests for default implementing methods in `CharSequence`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3121593297 From jpai at openjdk.org Sat Jul 26 14:24:00 2025 From: jpai at openjdk.org (Jaikiran Pai) Date: Sat, 26 Jul 2025 14:24:00 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v3] In-Reply-To: References: Message-ID: On Sat, 26 Jul 2025 10:20:20 GMT, Tatsunori Uchino wrote: >> Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: >> >> - Update `@bug` in correct file >> - Add default implementation on codePointCount in CharSequence >> - Update `@bug` entries in test class doc comments >> - Discard changes on code whose form is not `str.codePointCount(0, str.length())` > > How and where can I add tests for default implementing methods in `CharSequence`? Hello @tats-u, > > The recommended process for proposing new APIs is to put the proposal to the OpenJDK core-libs-dev mail alias. > > I glanced over https://mail.openjdk.org/pipermail/core-libs-dev/2025-July/thread.html and those for some past months, but I did not get how to send one. > The OpenJDK contribution guide has the necessary details on how to contribute to the project. Specifically this section https://openjdk.org/guide/#socialize-your-change is of relevance. In order to send a mail to the core-libs-dev mailing list, please first subscribe to that mailing list https://mail.openjdk.org/mailman/listinfo/core-libs-dev and initiate a discussion explaining the need and motivation for this new API. 
After there's some agreement about this proposal, the implementation changes in this PR can be pursued further. ------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3121977983 From alanb at openjdk.org Sat Jul 26 14:29:54 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sat, 26 Jul 2025 14:29:54 GMT Subject: RFR: 8364007: Add overload without arguments to codePointCount in String etc. [v3] In-Reply-To: References: Message-ID: On Sat, 26 Jul 2025 10:10:40 GMT, Tatsunori Uchino wrote: >> Adds `codePointCount()` overloads to `String`, `Character`, `(Abstract)StringBuilder`, and `StringBuffer` to make it possible to conveniently retrieve the length of a string as code points without extra boundary checks. >> >> >> if (superTremendouslyLongExpressionYieldingAString().codePointCount() > limit) { >> throw new Exception("exceeding length"); >> } >> >> >> Is a CSR required to this change? > > Tatsunori Uchino has updated the pull request incrementally with four additional commits since the last revision: > > - Update `@bug` in correct file > - Add default implementation on codePointCount in CharSequence > - Update `@bug` entries in test class doc comments > - Discard changes on code whose form is not `str.codePointCount(0, str.length())` The addition to CharSequence will require static analysis to check for conflicts with implementation. It will also likely impact the CharBuffer spec. 
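To make the compatibility concern concrete, here is a hedged sketch of what a default method on a `CharSequence`-like interface could look like. `CodePointCounting` and `WrappedSeq` are hypothetical types invented for this example, and the body is an assumption, not the code proposed in the PR.

```java
// Hypothetical stand-in for CharSequence with the proposed default method added.
interface CodePointCounting extends CharSequence {

    // Assumed default body: delegate to the existing static three-argument overload.
    default int codePointCount() {
        return Character.codePointCount(this, 0, length());
    }
}

// An ordinary implementor that inherits the default without changes.
// If an existing implementor already declared, say, `long codePointCount()`,
// adding the default to the interface would make that class fail to compile,
// which is the kind of conflict a static analysis of existing code looks for.
final class WrappedSeq implements CodePointCounting {
    private final String s;

    WrappedSeq(String s) {
        this.s = s;
    }

    @Override public int length() { return s.length(); }
    @Override public char charAt(int index) { return s.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) { return s.subSequence(start, end); }
    @Override public String toString() { return s; }
}
```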
------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3121986320 From duke at openjdk.org Sun Jul 27 10:25:54 2025 From: duke at openjdk.org (Tatsunori Uchino) Date: Sun, 27 Jul 2025 10:25:54 GMT Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3] In-Reply-To: References: Message-ID: On Sat, 26 Jul 2025 14:21:39 GMT, Jaikiran Pai wrote: > please first subscribe to that mailing list https://mail.openjdk.org/mailman/listinfo/core-libs-dev Does this mailing list system require us to subscribe the list to post a new mail to the list? I would like to leave it at least after this PR is merged because I would not like my mailbox to be messed up by emails not related to this change. > The addition to CharSequence will require static analysis to check for conflicts with implementation. It will also likely impact the CharBuffer spec. The title of the JBS issue seems to be changed by you but it looks like the default method for `CharSequence` should be stripped for this time according to your concerns. No `codePointCount` methods have been added to `CharSequence` so it may be too early for us to add one to `CharSequence`. Do you think that you should replace `CharSequence` in the title with another class name? ------------- PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3124289446 From alanb at openjdk.org Sun Jul 27 15:21:55 2025 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 27 Jul 2025 15:21:55 GMT Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3] In-Reply-To: References: Message-ID: On Sun, 27 Jul 2025 10:23:03 GMT, Tatsunori Uchino wrote: > No `codePointCount` methods have been added to `CharSequence` so it may be too early for us to add one to `CharSequence`. Do you think that you should replace `CharSequence` in the title with another class name? Can you clarify what you mean? 
Right now your PR is proposing to add a default method named codePointCount to CharSequence.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3124488657

From duke at openjdk.org  Sun Jul 27 23:07:57 2025
From: duke at openjdk.org (Tatsunori Uchino)
Date: Sun, 27 Jul 2025 23:07:57 GMT
Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3]
In-Reply-To:
References:
Message-ID:

On Sun, 27 Jul 2025 15:19:13 GMT, Alan Bateman wrote:

> Right now your PR is proposing to add a default method named codePointCount to CharSequence.

If it should be excluded this time, I will push an additional commit to _remove_ it from this PR.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3124780267

From alanb at openjdk.org  Mon Jul 28 09:19:05 2025
From: alanb at openjdk.org (Alan Bateman)
Date: Mon, 28 Jul 2025 09:19:05 GMT
Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3]
In-Reply-To:
References:
Message-ID:

On Sun, 27 Jul 2025 23:05:25 GMT, Tatsunori Uchino wrote:

> > Right now your PR is proposing to add a default method named codePointCount to CharSequence.
>
> If it should be excluded this time, I will push an additional commit to _remove_ it from this PR.

I think we should mull over the addition of CharSequence::codePointCount. On the surface it looks like it fits, but we can't rush it (CharSequence is widely implemented and additions to this interface have a history of disruption in the ecosystem).

What is the reason for proposing Character.codePointCount(CharSequence) as well?
-------------

PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3126308040

From duke at openjdk.org  Mon Jul 28 12:30:56 2025
From: duke at openjdk.org (Tatsunori Uchino)
Date: Mon, 28 Jul 2025 12:30:56 GMT
Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 28 Jul 2025 09:16:15 GMT, Alan Bateman wrote:

> I think we should mull over the addition of CharSequence::codePointCount. On the surface it looks like it fits, but we can't rush it (CharSequence is widely implemented and additions to this interface have a history of disruption in the ecosystem).

We might as well defer it to another JBS issue if it is too difficult to decide whether it should be included in this PR.

> What is the reason for proposing Character.codePointCount(CharSequence) as well?

1. `Character` already has an overload with start and end indices, like `String` and `AbstractStringBuilder` and unlike `CharSequence`.
2. It is less harmful than `CharSequence::codePointCount` because it is just a static method.
3. There are already `(CharSequence, int, int)` and `(char[], int, int)` overloads, and the `(char[], int, int)` overload is used in the test for `String::codePointCount(int, int)`. We should add the `(char[])` overload for the test and also add the `(CharSequence)` overload for consistency.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3126999990

From naoto at openjdk.org  Mon Jul 28 16:11:56 2025
From: naoto at openjdk.org (Naoto Sato)
Date: Mon, 28 Jul 2025 16:11:56 GMT
Subject: RFR: 8364007: Add no-argument codePointCount method to CharSequence and String [v3]
In-Reply-To:
References:
Message-ID:

On Sun, 27 Jul 2025 10:23:03 GMT, Tatsunori Uchino wrote:

> The addition to CharSequence will require static analysis to check for conflicts with implementation. It will also likely impact the CharBuffer spec.
Looking at the original JSR 204 issue, https://bugs.openjdk.org/browse/JDK-4985217, it is interesting that the problem description included `CharSequence` but the proposed API did not. I tried to find the reason behind this, but could not find any relevant information so far.

As to the general comment, I am not so sure about adding the no-arg overloads, as they would simply be convenience methods for `codePointCount(0, length())`, which to me does not add a significant benefit. My $0.02

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26461#issuecomment-3127954858
From liach at openjdk.org Tue Jul 29 14:36:08 2025 From: liach at openjdk.org (Chen Liang) Date: Tue, 29 Jul 2025 14:36:08 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments @rgiulietti Would you mind reviewing this?
------------- PR Comment: https://git.openjdk.org/jdk/pull/25644#issuecomment-3132838883 From rgiulietti at openjdk.org Tue Jul 29 14:42:59 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Tue, 29 Jul 2025 14:42:59 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments Ah yes, I'll take a look at latest tomorrow. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25644#issuecomment-3132865733 From rgiulietti at openjdk.org Wed Jul 30 13:43:00 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Wed, 30 Jul 2025 13:43:00 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v3] In-Reply-To: References: <-6TYWikmLNBtyQt6_xeJ8KoziHxA8Ijr067NIc740X0=.2f9e4cae-7f39-46b8-b3a9-f381fb2e0518@github.com> Message-ID: On Mon, 16 Jun 2025 18:19:29 GMT, Justin Lu wrote: >> I don't have a specific example, so I've reverted to my original check. I'm a bit unsettled by the check for an extreme value later in `doubleValue()` comparing against `MIN_DECIMAL_EXPONENT - 1` > > IMO, the original check you had is easier to understand what is happening without further context, so I prefer your switch back. 
> > I think we are fine from (negative) "extreme values" in `doubleValue()` because of the check you have implemented in the first place. i.e. we avoid any potential underflow from `int exp = decExponent - kDigits;`. I think we do need a comment to accompany the check. (Why do we check? why not check the max exponent value?) > > Also, should the check be against `MIN_DECIMAL_EXPONENT - 1` for consistency with `doubleValue()`? (Functionally, I don't think it matters.) Suggestion: if (decExp <= MIN_DECIMAL_EXPONENT) { This is just a note for future enhancements. No need for a new commit, as the current version is correct and avoids the underflow in `doubleValue()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/25644#discussion_r2242727180 From rgiulietti at openjdk.org Wed Jul 30 13:42:59 2025 From: rgiulietti at openjdk.org (Raffaello Giulietti) Date: Wed, 30 Jul 2025 13:42:59 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments Thanks @j3graham for your contribution. ------------- Marked as reviewed by rgiulietti (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/25644#pullrequestreview-3071606450 From duke at openjdk.org Thu Jul 31 15:02:57 2025 From: duke at openjdk.org (duke) Date: Thu, 31 Jul 2025 15:02:57 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments @j3graham Your change (at version b7faa3b86c320aa979da5003002575011e278081) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25644#issuecomment-3140312350 From duke at openjdk.org Thu Jul 31 17:53:04 2025 From: duke at openjdk.org (Johannes Graham) Date: Thu, 31 Jul 2025 17:53:04 GMT Subject: Integrated: 8358880: Performance of parsing with DecimalFormat can be improved In-Reply-To: References: Message-ID: On Wed, 4 Jun 2025 18:18:39 GMT, Johannes Graham wrote: > This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` > > As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. > > Testing: > - GHA > - Local run of tier 2 and jtreg:jdk/java/text > - New benchmark: DecimalFormatParseBench This pull request has now been integrated. 
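The description quoted above mentions handling `Long.MIN_VALUE` through `Long.parseUnsignedLong`. The following is a rough sketch of that general idea, not the code from the PR: the magnitude of `Long.MIN_VALUE` (2^63) does not fit in a signed `long`, but it does fit when the digits are parsed as unsigned. `MinValueParseSketch` and `parse` are hypothetical names.

```java
// Sketch of the general technique only; this is not the DecimalFormat/DigitList code.
final class MinValueParseSketch {

    // Parses an optionally negative decimal integer by reading the digits as an unsigned magnitude.
    static long parse(String text) {
        boolean negative = text.startsWith("-");
        String digits = negative ? text.substring(1) : text;
        long magnitude = Long.parseUnsignedLong(digits); // accepts values up to 2^64 - 1

        if (negative) {
            // The largest allowed magnitude is 2^63, whose bit pattern equals Long.MIN_VALUE.
            if (Long.compareUnsigned(magnitude, Long.MIN_VALUE) > 0) {
                throw new NumberFormatException("out of range: " + text);
            }
            return -magnitude; // negating the 2^63 bit pattern wraps to Long.MIN_VALUE
        }
        if (magnitude < 0) { // sign bit set means the value exceeds Long.MAX_VALUE
            throw new NumberFormatException("out of range: " + text);
        }
        return magnitude;
    }

    public static void main(String[] args) {
        System.out.println(parse("-9223372036854775808") == Long.MIN_VALUE);
        System.out.println(parse("9223372036854775807") == Long.MAX_VALUE);
    }
}
```

With this approach no special-case string comparison against the text of `Long.MIN_VALUE` is needed; the range check is a single unsigned comparison.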
Changeset: d1944239 Author: Johannes Graham Committer: Raffaello Giulietti URL: https://git.openjdk.org/jdk/commit/d19442399c004c78bff8a5ccf7c6975c7e583a07 Stats: 184 lines in 4 files changed: 126 ins; 48 del; 10 mod 8358880: Performance of parsing with DecimalFormat can be improved Reviewed-by: jlu, liach, rgiulietti ------------- PR: https://git.openjdk.org/jdk/pull/25644 From duke at openjdk.org Thu Jul 31 18:18:04 2025 From: duke at openjdk.org (Johannes Graham) Date: Thu, 31 Jul 2025 18:18:04 GMT Subject: RFR: 8358880: Performance of parsing with DecimalFormat can be improved [v6] In-Reply-To: References: Message-ID: On Mon, 16 Jun 2025 21:19:45 GMT, Johannes Graham wrote: >> This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling `Long.MIN_VALUE` when parsing longs by using `Long.parseUnsignedLong` >> >> As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList. >> >> Testing: >> - GHA >> - Local run of tier 2 and jtreg:jdk/java/text >> - New benchmark: DecimalFormatParseBench > > Johannes Graham has updated the pull request incrementally with one additional commit since the last revision: > > add comments Thank you all for your attention. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25644#issuecomment-3140917372 From naoto at openjdk.org Thu Jul 31 18:48:31 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 31 Jul 2025 18:48:31 GMT Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing Message-ID: Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. Thus the parsing of user input numbers may fail. 
This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user input numbers can parse. As this is a behavioral change, a corresponding CSR has been drafted. ------------- Commit messages: - Merge branch 'master' into JDK-8363972-Loose-matching-dash - tidying up - test location - spec update - readObject - Merge branch 'master' into JDK-8363972-Loose-matching-dash - lenientminus -> NumberElements - lenientMinusSign -> serial, moved to NumberElements - tentative - initial commit Changes: https://git.openjdk.org/jdk/pull/26580/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8363972 Stats: 386 lines in 8 files changed: 351 ins; 20 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/26580.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/26580/head:pull/26580 PR: https://git.openjdk.org/jdk/pull/26580 From duke at openjdk.org Thu Jul 31 21:00:59 2025 From: duke at openjdk.org (Francesco Andreuzzi) Date: Thu, 31 Jul 2025 21:00:59 GMT Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing In-Reply-To: References: Message-ID: On Thu, 31 Jul 2025 18:41:47 GMT, Naoto Sato wrote: > Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. Thus the parsing of user input numbers may fail. This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user input numbers can parse. As this is a behavioral change, a corresponding CSR has been drafted. 
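As a rough illustration of the lenient matching described above, the sketch below treats several dash-like characters as interchangeable when parsing is not strict. The character set is an assumption made for this example (the real list comes from CLDR's `parseLenient` data), and `LenientMinusSketch` and `isMinusAt` are hypothetical names rather than anything in `DecimalFormat`.

```java
// Hypothetical sketch; the actual change reads the lenient set from CLDR data.
final class LenientMinusSketch {

    // Assumed set: HYPHEN-MINUS, MINUS SIGN, SMALL HYPHEN-MINUS, FULLWIDTH HYPHEN-MINUS.
    private static final String LENIENT_MINUS_SIGNS = "-\u2212\uFE63\uFF0D";

    // Returns true if the character at pos can be read as the locale's minus sign.
    static boolean isMinusAt(String text, int pos, char localeMinus, boolean strict) {
        if (pos < 0 || pos >= text.length()) {
            return false;
        }
        char c = text.charAt(pos);
        if (c == localeMinus) {
            return true; // exact match always succeeds
        }
        // Lenient mode: both the locale's sign and the input must belong to the lenient set.
        return !strict
                && LENIENT_MINUS_SIGNS.indexOf(localeMinus) >= 0
                && LENIENT_MINUS_SIGNS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        char finnishMinus = '\u2212'; // U+2212, as described for Finnish in the PR text
        System.out.println(isMinusAt("-1", 0, finnishMinus, false)); // keyboard hyphen, lenient
        System.out.println(isMinusAt("-1", 0, finnishMinus, true));  // keyboard hyphen, strict
    }
}
```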
make/jdk/src/classes/build/tools/cldrconverter/LDMLParseHandler.java line 851: > 849: { > 850: String level = attributes.getValue("level"); > 851: if (level != null && level.equals("lenient")) { This could be slightly simplified: Suggestion: if ("lenient".equals(level)) { src/java.base/share/classes/java/text/DecimalFormat.java line 3526: > 3524: } > 3525: > 3526: if (!parseStrict) { Possible early return here: by inverting the `if` you could `return text.regionMatches(...)` immediately, and remove one level of indentation from the big block in L3527-3543 src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 1002: > 1000: > 1001: if (loadNumberData(locale) instanceof Object[] d && > 1002: d[0] instanceof String[] numberElements) { Should the size be validated here, before accessing `d[0]`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246351119 PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246365822 PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246361036 From naoto at openjdk.org Thu Jul 31 22:04:56 2025 From: naoto at openjdk.org (Naoto Sato) Date: Thu, 31 Jul 2025 22:04:56 GMT Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing [v2] In-Reply-To: References: Message-ID: <-jShvVkEpQ1sPrcEvsMOTj-91dL1vm_ZFhw7NSUJ8jE=.7368c6c6-6126-410e-9fa7-b694122c9bc9@github.com> > Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. Thus the parsing of user input numbers may fail. This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user input numbers can parse. As this is a behavioral change, a corresponding CSR has been drafted. 

Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:

  Address review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26580/files
  - new: https://git.openjdk.org/jdk/pull/26580/files/33e99461..235138d5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=00-01

  Stats: 48 lines in 3 files changed: 18 ins; 19 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/26580.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26580/head:pull/26580

PR: https://git.openjdk.org/jdk/pull/26580

From naoto at openjdk.org Thu Jul 31 22:11:55 2025
From: naoto at openjdk.org (Naoto Sato)
Date: Thu, 31 Jul 2025 22:11:55 GMT
Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 31 Jul 2025 20:55:01 GMT, Francesco Andreuzzi wrote:

>> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Address review comments
>
> src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 1002:
>
>> 1000:
>> 1001: if (loadNumberData(locale) instanceof Object[] d &&
>> 1002: d[0] instanceof String[] numberElements) {
>
> Should the size be validated here, before accessing `d[0]`?

This should be fine, as there would be no situation where empty array would be returned:
https://github.com/openjdk/jdk/blob/724e8c076e1aed05de893ef9366af0e62cc2ac2b/src/java.base/share/classes/sun/util/locale/provider/LocaleResources.java#L223

I modified the `else` case, where the field was not initialized, btw.
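
For illustration only, this is roughly what the defensive variant raised in the review question could look like. It is a sketch, not code from the PR: per the reply above, the real loader never returns an empty array, so the extra check is not needed there. The class `NumberDataSketch` and the method `firstElements` are hypothetical stand-ins.

```java
class NumberDataSketch {
    // Hypothetical helper: returns the first slot of the data array when it
    // is a String[], or an empty array when the shape is not as expected.
    static String[] firstElements(Object data) {
        // instanceof on null is false, and the length check guards d[0].
        if (data instanceof Object[] d
                && d.length > 0
                && d[0] instanceof String[] elements) {
            return elements;
        }
        return new String[0];
    }

    public static void main(String[] args) {
        Object wellFormed = new Object[] { new String[] { ".", ",", "-" } };
        System.out.println(firstElements(wellFormed).length);
        System.out.println(firstElements(new Object[0]).length);
    }
}
```

The pattern variables `d` and `elements` are only in scope where all preceding conditions held, so the length check costs one extra clause and no extra nesting.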

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246478764

From naoto at openjdk.org Thu Jul 31 22:23:36 2025
From: naoto at openjdk.org (Naoto Sato)
Date: Thu, 31 Jul 2025 22:23:36 GMT
Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing [v3]
In-Reply-To:
References:
Message-ID:

> Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. Thus the parsing of user input numbers may fail. This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user input numbers can parse. As this is a behavioral change, a corresponding CSR has been drafted.

Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:

  flipped the size check

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26580/files
  - new: https://git.openjdk.org/jdk/pull/26580/files/235138d5..516403e8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/26580.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26580/head:pull/26580

PR: https://git.openjdk.org/jdk/pull/26580

From naoto at openjdk.org Thu Jul 31 22:30:35 2025
From: naoto at openjdk.org (Naoto Sato)
Date: Thu, 31 Jul 2025 22:30:35 GMT
Subject: RFR: 8363972: Loose matching of dash/minusSign in number parsing [v4]
In-Reply-To:
References:
Message-ID:

> Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen Minus" (U+002D) that users type in from keyboard. Thus the parsing of user input numbers may fail.
This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user input numbers can parse. As this is a behavioral change, a corresponding CSR has been drafted.

Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:

  flipped again, which was correct

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26580/files
  - new: https://git.openjdk.org/jdk/pull/26580/files/516403e8..38f3286f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26580&range=02-03

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/26580.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26580/head:pull/26580

PR: https://git.openjdk.org/jdk/pull/26580

From jlu at openjdk.org Thu Jul 31 23:17:25 2025
From: jlu at openjdk.org (Justin Lu)
Date: Thu, 31 Jul 2025 23:17:25 GMT
Subject: RFR: 8364370: java.text.DecimalFormat specification indentation correction
Message-ID:

Please review this doc only PR.

java.text.DecimalFormat uses an implSpec tag in the middle of the class description. This location was on purpose as the contents related to the surrounding section. However, this has caused slight indentation in the rest of the class description below the tag (as pointed out by @naotoj). Using the implSpec tag at the bottom of the class is preferable for formatting purposes.

There are no contract changes, simply a re-organization of existing contents, thus no CSR is filed.

-------------

Commit messages:
 - init

Changes: https://git.openjdk.org/jdk/pull/26585/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26585&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8364370
  Stats: 13 lines in 1 file changed: 7 ins; 6 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/26585.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26585/head:pull/26585

PR: https://git.openjdk.org/jdk/pull/26585

From liach at openjdk.org Thu Jul 31 23:23:58 2025
From: liach at openjdk.org (Chen Liang)
Date: Thu, 31 Jul 2025 23:23:58 GMT
Subject: RFR: 8364370: java.text.DecimalFormat specification indentation correction
In-Reply-To:
References:
Message-ID:

On Thu, 31 Jul 2025 23:12:56 GMT, Justin Lu wrote:

> Please review this doc only PR.
>
> java.text.DecimalFormat uses an implSpec tag in the middle of the class description. This location was on purpose as the contents related to the surrounding section. However, this has caused slight indentation in the rest of the class description below the tag (as pointed out by @naotoj). Using the implSpec tag at the bottom of the class is preferable for formatting purposes.
>
> There are no contract changes, simply a re-organization of existing contents, thus no CSR is filed.

In fact, per the [specs](https://docs.oracle.com/en/java/javase/24/docs/specs/javadoc/doc-comment-spec.html#block-tags):

> The content of a block tag is any text following the tag up to, but not including, either the next block tag, or the end of the documentation comment.

It may be surprising that block tags have higher precedence over HTML headers. Just a note for the future.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26585#issuecomment-3141617957
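
To make the block-tag rule quoted above concrete, here is a small sketch of a class comment with the `@implSpec` tag placed last, so that no heading or prose ends up inside the tag's content. The class `BlockTagPlacementSketch` and its method are hypothetical and are not taken from `DecimalFormat`. Note also that `@implSpec` is a custom tag that the JDK build registers, so a standalone `javadoc` run is assumed to need a matching `-tag` option to accept it.

```java
/**
 * Rounds values for display.
 *
 * <h2>Rounding</h2>
 * Ties are rounded to the nearest even integer.
 *
 * <h2>Limits</h2>
 * This text is still part of the main description, because no block tag
 * has started yet. Had the tag below been placed above this heading, the
 * heading and this paragraph would have become part of the tag's content.
 *
 * @implSpec This sketch delegates to {@link Math#rint(double)}. Everything
 * from the tag up to the next block tag, or the end of the comment, is the
 * content of the tag.
 */
class BlockTagPlacementSketch {
    // Rounds half-even by delegating to Math.rint, then narrows to long.
    static long roundHalfEven(double value) {
        return (long) Math.rint(value);
    }

    public static void main(String[] args) {
        System.out.println(roundHalfEven(2.5));
        System.out.println(roundHalfEven(3.5));
    }
}
```

Keeping block tags at the end of the comment means the main description can be reorganized under headings freely without changing what any tag contains.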