From erikj at openjdk.org Mon Dec 1 23:30:26 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Mon, 1 Dec 2025 23:30:26 GMT Subject: Integrated: 2610: Mailman 3 support In-Reply-To: <0_8bnwSGe0yaGssu8cS2faqUa9AkpHodGzFf57sG71Y=.ac496be8-f6ed-473d-9dea-a4e40855069b@github.com> References: <0_8bnwSGe0yaGssu8cS2faqUa9AkpHodGzFf57sG71Y=.ac496be8-f6ed-473d-9dea-a4e40855069b@github.com> Message-ID: On Fri, 21 Nov 2025 23:14:50 GMT, Erik Joelsson wrote: > The OpenJDK mail archive will move to Mailman 3 at some point in the future. To prepare for that, Skara needs to be made Mailman 3 compatible, specifically for reading emails from the archive to be able to publish them as comments in PRs. There are two major changes > > ### 1. REST API > The REST API for reading mbox archives is different. It's a lot better in Mailman3. Instead of having to read in fixed monthly chunks, we can request a custom date interval. The data is also returned gzipped instead of in plain text. To retain backwards compatibility, I chose to implement the Mailman 2 and 3 support as subclasses with some shared implementation. This caused a lot of mechanical changes in tests, just updating method names or signatures since the API for creating a Mailman server object changes. It also forced some cleanup in bots where MailmanServer objects were created unnecessarily. I updated some tests to use the Mailman3 implementation where possible to exercise the new code more. > > The new implementation is somewhat taking advantage of the new capabilities in the API, but it could maybe be done more efficiently when polling for updates. > > ### 2. Mbox format > The format of the mail bodies in the returned mbox format is quite different. It's now MIME encoded and can be encoded in a few different ways ("7bit", "8bit" and "quoted-printable" have been observed so far). I've tried to implement support for decoding all the ways I've so far observed. 
> > Some examples of this patch in action can be found in https://github.com/openjdk/playground/pull/246. Note that the earlier comments have problems that eventually got resolved as I ironed out the details. > > An observation I've made is that the new server introduces hard line breaks in emails as they are archived. Those are visible already when browsing the archive. I'm not sure if this is a setting in Mailman itself or not. Those line breaks are not part of the email I received when subscribing to a list, just in the archive, and so also in any comments posted by Skara. > > This is a big patch, so sorry in advance. This pull request has now been integrated. Changeset: db92d060 Author: Erik Joelsson URL: https://git.openjdk.org/skara/commit/db92d06064a5b4dbe9403663669c7dc47560220e Stats: 1451 lines in 23 files changed: 1061 ins; 232 del; 158 mod 2610: Mailman 3 support Reviewed-by: zsong ------------- PR: https://git.openjdk.org/skara/pull/1743 From dnsimon at openjdk.org Tue Dec 2 09:17:07 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 2 Dec 2025 09:17:07 GMT Subject: RFR: 2630: Remove graal-dev list Message-ID: This removes graal-dev as a watcher of various openjdk directories. 
------------- Commit messages: - remove graal mailing list matchers Changes: https://git.openjdk.org/skara/pull/1745/files Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1745&range=00 Issue: https://bugs.openjdk.org/browse/SKARA-2630 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod Patch: https://git.openjdk.org/skara/pull/1745.diff Fetch: git fetch https://git.openjdk.org/skara.git pull/1745/head:pull/1745 PR: https://git.openjdk.org/skara/pull/1745 From zsong at openjdk.org Tue Dec 2 17:31:07 2025 From: zsong at openjdk.org (Zhao Song) Date: Tue, 2 Dec 2025 17:31:07 GMT Subject: RFR: 2630: Remove graal-dev list In-Reply-To: References: Message-ID: On Tue, 2 Dec 2025 09:07:36 GMT, Doug Simon wrote: > This removes graal-dev as a watcher of various openjdk directories. Marked as reviewed by zsong (Reviewer). ------------- PR Review: https://git.openjdk.org/skara/pull/1745#pullrequestreview-3531347283 From iris at openjdk.org Tue Dec 2 17:36:26 2025 From: iris at openjdk.org (Iris Clark) Date: Tue, 2 Dec 2025 17:36:26 GMT Subject: RFR: 2630: Remove graal-dev list In-Reply-To: References: Message-ID: <7ocDyAe8dgDdX3DK0kELdv4j1zEnTYzfpn7-WN8Wzsk=.a8da8b76-c66a-4f2a-8380-4deadf8772a6@github.com> On Tue, 2 Dec 2025 09:07:36 GMT, Doug Simon wrote: > This removes graal-dev as a watcher of various openjdk directories. Marked as reviewed by iris (no project role). ------------- PR Review: https://git.openjdk.org/skara/pull/1745#pullrequestreview-3531366221 From duke at openjdk.org Tue Dec 2 17:46:56 2025 From: duke at openjdk.org (duke) Date: Tue, 2 Dec 2025 17:46:56 GMT Subject: RFR: 2630: Remove graal-dev list In-Reply-To: References: Message-ID: <7uBgyRU9Z6I3oNTgGNcifOvsvq5zPuWWvI9pZR18kOQ=.b7d5701b-f42b-41be-8344-0f4b4f8b33b0@github.com> On Tue, 2 Dec 2025 09:07:36 GMT, Doug Simon wrote: > This removes graal-dev as a watcher of various openjdk directories. 
@dougxc Your change (at version bee8ec992a1ebbb31618c96c5926faacd8260f4f) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/skara/pull/1745#issuecomment-3603253152 From dnsimon at openjdk.org Tue Dec 2 17:54:20 2025 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 2 Dec 2025 17:54:20 GMT Subject: Integrated: 2630: Remove graal-dev list In-Reply-To: References: Message-ID: On Tue, 2 Dec 2025 09:07:36 GMT, Doug Simon wrote: > This removes graal-dev as a watcher of various openjdk directories. This pull request has now been integrated. Changeset: cfb032c4 Author: Doug Simon Committer: Zhao Song URL: https://git.openjdk.org/skara/commit/cfb032c4626d69817053056aaebb77b0c27c9180 Stats: 8 lines in 1 file changed: 0 ins; 8 del; 0 mod 2630: Remove graal-dev list Reviewed-by: zsong, iris ------------- PR: https://git.openjdk.org/skara/pull/1745 From erikj at openjdk.org Fri Dec 12 19:15:57 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Fri, 12 Dec 2025 19:15:57 GMT Subject: RFR: 2636: Mailing list names should be an EmailAddress Message-ID: Further testing of Mailman 3 uncovered a bug in how Skara represents email lists in an email list server. In the configuration it's defined as an email address, but then we cut out the domain part and only store the local part of the email as a String. Later when processing emails, the domain of the mailing list server is implicitly added to form an email address when needed. This has worked so far, because the domain of every mailing list has been the same as the domain of the archive server (openjdk.org), but during testing, we have a mix of mailing list names and test servers where this isn't matching, so we need to handle it correctly. The fix is to change the type and representation of mailing lists to be an actual EmailAddress. It resolves all the problems from what I can tell. It does create a rather big patch however. 
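A minimal sketch of the problem described above, not Skara's actual code. `ListAddress` is a hypothetical stand-in for an email address type, and the list and archive host names are made up. It only illustrates why storing the local part and re-adding the archive server's domain loses information, while keeping the full address does not.

```java
import java.util.Objects;

final class ListAddressDemo {
    // Hypothetical stand-in for an email address value type (not Skara's EmailAddress).
    record ListAddress(String localPart, String domain) {
        ListAddress {
            Objects.requireNonNull(localPart);
            Objects.requireNonNull(domain);
        }

        static ListAddress parse(String address) {
            int at = address.lastIndexOf('@');
            if (at <= 0 || at == address.length() - 1) {
                throw new IllegalArgumentException("Not an email address: " + address);
            }
            return new ListAddress(address.substring(0, at), address.substring(at + 1));
        }

        String address() {
            return localPart + "@" + domain;
        }
    }

    // The lossy approach: keep only the local part, then append the archive server's domain.
    static String rebuildFromLocalPart(String configured, String archiveDomain) {
        String localPart = configured.substring(0, configured.indexOf('@'));
        return localPart + "@" + archiveDomain;
    }

    public static void main(String[] args) {
        String configured = "skara-dev@lists.example.org"; // made-up list address
        String archiveDomain = "archive.example.net";      // made-up archive host

        // Only correct when both domains happen to be the same.
        System.out.println(rebuildFromLocalPart(configured, archiveDomain));
        // Keeping the full address preserves the list's own domain.
        System.out.println(ListAddress.parse(configured).address());
    }
}
```

With identical domains both lines print the same address, which is why the issue stayed hidden until list and archive hosts differed.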
------------- Commit messages: - SKARA-2636 Changes: https://git.openjdk.org/skara/pull/1746/files Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1746&range=00 Issue: https://bugs.openjdk.org/browse/SKARA-2636 Stats: 210 lines in 17 files changed: 11 ins; 12 del; 187 mod Patch: https://git.openjdk.org/skara/pull/1746.diff Fetch: git fetch https://git.openjdk.org/skara.git pull/1746/head:pull/1746 PR: https://git.openjdk.org/skara/pull/1746 From erikj at openjdk.org Fri Dec 12 19:24:41 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Fri, 12 Dec 2025 19:24:41 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken Message-ID: During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. 
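The byte-by-byte approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the patch: the class and method names are invented, the charset is assumed to be UTF-8, and soft line breaks (`=` at end of line) are handled the way the quoted-printable encoding defines them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class QuotedPrintableSketch {
    private static boolean isHex(byte b) {
        return Character.digit(b, 16) >= 0;
    }

    // Translate each "=XX" triplet into one byte, drop soft line breaks, copy everything
    // else unchanged, and let the charset decoder assemble multi-byte characters.
    static String decode(String encoded) {
        byte[] in = encoded.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        int i = 0;
        while (i < in.length) {
            if (in[i] != '=') {
                out.write(in[i++]);
            } else if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2; // soft line break "=\n"
            } else if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3; // soft line break "=\r\n"
            } else if (i + 2 < in.length && isHex(in[i + 1]) && isHex(in[i + 2])) {
                out.write((Character.digit(in[i + 1], 16) << 4) | Character.digit(in[i + 2], 16));
                i += 3;
            } else {
                throw new IllegalArgumentException("Malformed escape at offset " + i);
            }
        }
        // 2, 3 and 4 byte UTF-8 sequences all decode the same way at this point.
        return out.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("caf=C3=A9"));       // 2 byte character
        System.out.println(decode("=E2=82=AC 10"));    // 3 byte character
        System.out.println(decode("=F0=9F=98=80"));    // 4 byte character
    }
}
```

Because the escapes are resolved to raw bytes first, the decoder never needs to know how many bytes a character occupies, which is the simplification the description refers to.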
------------- Commit messages: - comment - SKARA-2637 Changes: https://git.openjdk.org/skara/pull/1747/files Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1747&range=00 Issue: https://bugs.openjdk.org/browse/SKARA-2637 Stats: 61 lines in 2 files changed: 43 ins; 8 del; 10 mod Patch: https://git.openjdk.org/skara/pull/1747.diff Fetch: git fetch https://git.openjdk.org/skara.git pull/1747/head:pull/1747 PR: https://git.openjdk.org/skara/pull/1747 From tbell at openjdk.org Fri Dec 12 20:58:58 2025 From: tbell at openjdk.org (Tim Bell) Date: Fri, 12 Dec 2025 20:58:58 GMT Subject: RFR: 2636: Mailing list names should be an EmailAddress In-Reply-To: References: Message-ID: On Fri, 12 Dec 2025 19:05:38 GMT, Erik Joelsson wrote: > Further testing of Mailman 3 uncovered a bug in how Skara represents email lists in an email list server. In the configuration it's defined as an email address, but then we cut out the domain part and only store the local part of the email as a String. Later when processing emails, the domain of the mailing list server is implicitly added to form an email address when needed. This has worked so far, because the domain of every mailing list has been the same as the domain of the archive server (openjdk.org), but during testing, we have a mix of mailing list names and test servers where this isn't matching, so we need to handle it correctly. > > The fix is to change the type and representation of mailing lists to be an actual EmailAddress. It resolves all the problems from what I can tell. It does create a rather big patch however. Looks good ------------- Marked as reviewed by tbell (Reviewer). 
PR Review: https://git.openjdk.org/skara/pull/1746#pullrequestreview-3573581088 From tbell at openjdk.org Fri Dec 12 21:08:53 2025 From: tbell at openjdk.org (Tim Bell) Date: Fri, 12 Dec 2025 21:08:53 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken In-Reply-To: References: Message-ID: <97K1TNvoQaKlOb1sLLTY-Y6PF-SmFscJDK-O-kLGSAE=.5d07db05-0bd3-4ca1-b433-64efc22e0c01@github.com> On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson wrote: > During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. > > Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. Nice Marked as reviewed by tbell (Reviewer). ------------- Marked as reviewed by tbell (Reviewer). PR Review: https://git.openjdk.org/skara/pull/1747#pullrequestreview-3573602274 PR Review: https://git.openjdk.org/skara/pull/1747#pullrequestreview-3573603485 From zsong at openjdk.org Fri Dec 12 22:43:38 2025 From: zsong at openjdk.org (Zhao Song) Date: Fri, 12 Dec 2025 22:43:38 GMT Subject: RFR: 2636: Mailing list names should be an EmailAddress In-Reply-To: References: Message-ID: On Fri, 12 Dec 2025 19:05:38 GMT, Erik Joelsson wrote: > Further testing of Mailman 3 uncovered a bug in how Skara represents email lists in an email list server. In the configuration it's defined as an email address, but then we cut out the domain part and only store the local part of the email as a String. Later when processing emails, the domain of the mailing list server is implicitly added to form an email address when needed. 
This has worked so far, because the domain of every mailing list has been the same as the domain of the archive server (openjdk.org), but during testing, we have a mix of mailing list names and test servers where this isn't matching, so we need to handle it correctly. > > The fix is to change the type and representation of mailing lists to be an actual EmailAddress. It resolves all the problems from what I can tell. It does create a rather big patch however. Looks good! ------------- Marked as reviewed by zsong (Reviewer). PR Review: https://git.openjdk.org/skara/pull/1746#pullrequestreview-3573824231 From zsong at openjdk.org Fri Dec 12 23:10:03 2025 From: zsong at openjdk.org (Zhao Song) Date: Fri, 12 Dec 2025 23:10:03 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken In-Reply-To: References: Message-ID: On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson wrote: > During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. > > Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. email/src/main/java/org/openjdk/skara/email/Email.java line 148: > 146: } > 147: default : { > 148: out[j++] = (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16); There is no boundary check here, so it always assumes there are two digits following the "=". I don't know if it's possible for mailman server to return malformed data, but if it happens, the bot will endlessly process the malformed input. 
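A small sketch of how the quoted expression behaves in isolation when fewer than two hex digits follow the `=`. The class and method names are invented for illustration, and `decodeEscapeChecked` is a hypothetical defensive variant, not something from the patch. How the surrounding bot code reacts to the exception is outside this sketch.

```java
final class UncheckedEscapeDemo {
    // Same expression shape as the quoted line 148; i points at the first digit after '='.
    static byte decodeEscape(byte[] in, int i) {
        return (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16);
    }

    // Hypothetical variant that reports malformed input with an explicit message.
    static byte decodeEscapeChecked(byte[] in, int i) {
        if (i + 1 >= in.length) {
            throw new IllegalArgumentException("Truncated escape at offset " + i);
        }
        int hi = Character.digit(in[i], 16);
        int lo = Character.digit(in[i + 1], 16);
        if (hi < 0 || lo < 0) {
            throw new IllegalArgumentException("Non-hex escape at offset " + i);
        }
        return (byte) ((hi << 4) | lo);
    }

    public static void main(String[] args) {
        System.out.println(decodeEscape(new byte[] {'C', '3'}, 0) & 0xFF);
        try {
            decodeEscape(new byte[] {'C'}, 0); // input ends after one digit
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("truncated input: " + e.getClass().getSimpleName());
        }
        try {
            decodeEscape(new byte[] {'Z', 'Z'}, 0); // not hex digits
        } catch (NumberFormatException e) {
            System.out.println("non-hex input: " + e.getClass().getSimpleName());
        }
    }
}
```

Both malformed cases end in an unchecked exception rather than a silent wrong byte, so the open question is only where that exception is handled.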
------------- PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2615808445 From zsong at openjdk.org Fri Dec 12 23:13:58 2025 From: zsong at openjdk.org (Zhao Song) Date: Fri, 12 Dec 2025 23:13:58 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken In-Reply-To: References: Message-ID: On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson wrote: > During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. > > Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. Marked as reviewed by zsong (Reviewer). ------------- PR Review: https://git.openjdk.org/skara/pull/1747#pullrequestreview-3573880869 From zsong at openjdk.org Fri Dec 12 23:13:59 2025 From: zsong at openjdk.org (Zhao Song) Date: Fri, 12 Dec 2025 23:13:59 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken In-Reply-To: References: Message-ID: <0DhcOESomiPkl07uJymdFd6n5v7KiT_gZvWjt6aGku4=.b008b983-1e1b-48e4-8746-32355bb5e6c1@github.com> On Fri, 12 Dec 2025 23:07:48 GMT, Zhao Song wrote: >> During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. >> >> Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. 
That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built-in character set decoder.
>
> email/src/main/java/org/openjdk/skara/email/Email.java line 148:
>
>> 146: }
>> 147: default : {
>> 148: out[j++] = (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16);
>
> There is no boundary check here, so it always assumes there are two digits following the "=". I don't know if it's possible for mailman server to return malformed data, but if it happens, the bot will endlessly process the malformed input.

Oh, I was wrong, the exception will be caught at Mbox#splitMbox(), so the bot won't process the malformed data endlessly.

-------------

PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2615815364

From erikj at openjdk.org Fri Dec 12 23:56:24 2025
From: erikj at openjdk.org (Erik Joelsson)
Date: Fri, 12 Dec 2025 23:56:24 GMT
Subject: RFR: 2637: Decoding emails from quoted-printable is broken
In-Reply-To: <0DhcOESomiPkl07uJymdFd6n5v7KiT_gZvWjt6aGku4=.b008b983-1e1b-48e4-8746-32355bb5e6c1@github.com>
References: <0DhcOESomiPkl07uJymdFd6n5v7KiT_gZvWjt6aGku4=.b008b983-1e1b-48e4-8746-32355bb5e6c1@github.com>
Message-ID: <0cnZOQd5pxeNCSfxNkuH7VI9307u403BVAm1ncp0fx0=.612362f2-069a-4ce7-a9d4-56c9489bd7f1@github.com>

On Fri, 12 Dec 2025 23:11:38 GMT, Zhao Song wrote:

>> email/src/main/java/org/openjdk/skara/email/Email.java line 148:
>>
>>> 146: }
>>> 147: default : {
>>> 148: out[j++] = (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16);
>>
>> There is no boundary check here, so it always assumes there are two digits following the "=". I don't know if it's possible for mailman server to return malformed data, but if it happens, the bot will endlessly process the malformed input.
>
> Oh, I was wrong, the exception will be caught at Mbox#splitMbox(), so the bot won't process the malformed data endlessly.

Hm, not sure what's better, ignoring the email or trying our best to handle a malformed encoding. We don't even log the issue. Should probably do that at least in Mbox#splitMbox. ------------- PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2615871120 From erikj at openjdk.org Tue Dec 16 23:04:46 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 16 Dec 2025 23:04:46 GMT Subject: Integrated: 2636: Mailing list names should be an EmailAddress In-Reply-To: References: Message-ID: <6tDcSAfLvgGT7jJb_4qASblgEWFiyb6Tae1q6_tt7AY=.d4c89a8c-6cd5-4ec3-a9d6-bac6602a91e0@github.com> On Fri, 12 Dec 2025 19:05:38 GMT, Erik Joelsson wrote: > Further testing of Mailman 3 uncovered a bug in how Skara represents email lists in an email list server. In the configuration it's defined as an email address, but then we cut out the domain part and only store the local part of the email as a String. Later when processing emails, the domain of the mailing list server is implicitly added to form an email address when needed. This has worked so far, because the domain of every mailing list has been the same as the domain of the archive server (openjdk.org), but during testing, we have a mix of mailing list names and test servers where this isn't matching, so we need to handle it correctly. > > The fix is to change the type and representation of mailing lists to be an actual EmailAddress. It resolves all the problems from what I can tell. It does create a rather big patch however. This pull request has now been integrated. 
Changeset: 80d8e449 Author: Erik Joelsson URL: https://git.openjdk.org/skara/commit/80d8e44994be3fefa008a04ea17cb49961240f04 Stats: 210 lines in 17 files changed: 11 ins; 12 del; 187 mod 2636: Mailing list names should be an EmailAddress Reviewed-by: tbell, zsong ------------- PR: https://git.openjdk.org/skara/pull/1746 From erikj at openjdk.org Tue Dec 16 23:18:17 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Tue, 16 Dec 2025 23:18:17 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken [v2] In-Reply-To: References: Message-ID: > During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. > > Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. 
Erik Joelsson has updated the pull request incrementally with one additional commit since the last revision:

  Added logging of failed email parsing

-------------

Changes:
  - all: https://git.openjdk.org/skara/pull/1747/files
  - new: https://git.openjdk.org/skara/pull/1747/files/6c95afdf..651fa7dd

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=skara&pr=1747&range=01
 - incr: https://webrevs.openjdk.org/?repo=skara&pr=1747&range=00-01

  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/skara/pull/1747.diff
  Fetch: git fetch https://git.openjdk.org/skara.git pull/1747/head:pull/1747

PR: https://git.openjdk.org/skara/pull/1747

From erikj at openjdk.org Tue Dec 16 23:18:17 2025
From: erikj at openjdk.org (Erik Joelsson)
Date: Tue, 16 Dec 2025 23:18:17 GMT
Subject: RFR: 2637: Decoding emails from quoted-printable is broken [v2]
In-Reply-To: <0cnZOQd5pxeNCSfxNkuH7VI9307u403BVAm1ncp0fx0=.612362f2-069a-4ce7-a9d4-56c9489bd7f1@github.com>
References: <0DhcOESomiPkl07uJymdFd6n5v7KiT_gZvWjt6aGku4=.b008b983-1e1b-48e4-8746-32355bb5e6c1@github.com> <0cnZOQd5pxeNCSfxNkuH7VI9307u403BVAm1ncp0fx0=.612362f2-069a-4ce7-a9d4-56c9489bd7f1@github.com>
Message-ID:

On Fri, 12 Dec 2025 23:54:19 GMT, Erik Joelsson wrote:

>> Oh, I was wrong, the exception will be caught at Mbox#splitMbox(), so the bot won't process the malformed data endlessly.
>
> Hm, not sure what's better, ignoring the email or trying our best to handle a malformed encoding. We don't even log the issue. Should probably do that at least in Mbox#splitMbox.

Added logging when catching an exception during email parsing.
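A rough sketch of the "skip the email but log it" behaviour discussed here. It is not the code in Mbox#splitMbox; the class name, the logger name, and the generic `parser` parameter are all assumptions made for the example, and it uses java.util.logging only because that ships with the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LenientParseSketch {
    private static final Logger log = Logger.getLogger("mbox.sketch"); // made-up logger name

    // Parse every raw message; a message that fails to parse is logged and skipped
    // so that one malformed email does not stop the rest from being processed.
    static <T> List<T> parseAll(List<String> rawMessages, Function<String, T> parser) {
        List<T> parsed = new ArrayList<>();
        for (String raw : rawMessages) {
            try {
                parsed.add(parser.apply(raw));
            } catch (RuntimeException e) {
                log.log(Level.WARNING, "Failed to parse email, skipping it", e);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Integer parsing stands in for email parsing; "x" plays the malformed message.
        List<Integer> result = parseAll(List.of("1", "x", "3"), s -> Integer.parseInt(s));
        System.out.println(result);
    }
}
```

The trade-off raised in the thread remains: this drops the malformed email entirely instead of attempting a best-effort decode, but the failure is at least visible in the log.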
------------- PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2625043669 From zsong at openjdk.org Wed Dec 17 01:32:23 2025 From: zsong at openjdk.org (Zhao Song) Date: Wed, 17 Dec 2025 01:32:23 GMT Subject: RFR: 2637: Decoding emails from quoted-printable is broken [v2] In-Reply-To: References: Message-ID: On Tue, 16 Dec 2025 23:18:17 GMT, Erik Joelsson wrote: >> During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. >> >> Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built in character set decoder. > > Erik Joelsson has updated the pull request incrementally with one additional commit since the last revision: > > Added logging of failed email parsing Marked as reviewed by zsong (Reviewer). ------------- PR Review: https://git.openjdk.org/skara/pull/1747#pullrequestreview-3585619708 From erikj at openjdk.org Wed Dec 17 18:49:02 2025 From: erikj at openjdk.org (Erik Joelsson) Date: Wed, 17 Dec 2025 18:49:02 GMT Subject: Integrated: 2637: Decoding emails from quoted-printable is broken In-Reply-To: References: Message-ID: On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson wrote: > During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2 byte encoded UTF-8 characters into account, but we of course need to also handle 3 and 4 byte characters. 
>
> Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler as we just need to translate each encoded triplet (`=XX`) at a time and then just convert the resulting byte array using Java's built-in character set decoder.

This pull request has now been integrated.

Changeset: 28811a90
Author: Erik Joelsson
URL: https://git.openjdk.org/skara/commit/28811a90fe396cf3b2fa2c38fd2ecec86f063b7b
Stats: 64 lines in 3 files changed: 45 ins; 8 del; 11 mod

2637: Decoding emails from quoted-printable is broken

Reviewed-by: zsong, tbell

-------------

PR: https://git.openjdk.org/skara/pull/1747

From erikj at openjdk.org Fri Dec 19 15:16:10 2025
From: erikj at openjdk.org (Erik Joelsson)
Date: Fri, 19 Dec 2025 15:16:10 GMT
Subject: RFR: 2641: Mbox parser fails on headers that start with newline
Message-ID:

It seems that at least for Mailman 3 mbox data, header values will sometimes start on the next line, after a single space, instead of directly after a single space following the `:`. E.g. this:

    Subject:
     foo

Instead of this:

    Subject: foo

This patch adjusts the regexp used to parse mbox email headers to take this into account, by optionally matching this newline. Added some more tests to verify the various cases.
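To illustrate the idea of optionally matching the newline, here is a small sketch. The pattern below is an assumption written for this example and is not the regexp used in Skara; the class and method names are invented as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeaderPatternSketch {
    // Header name, ':', an optional line break, one space, then the value.
    // Illustrative pattern only, not the one from the patch.
    private static final Pattern HEADER = Pattern.compile("([-\\w]+):\\R? (.*)");

    // Returns {name, value}, or null if the text is not a header in one of the two forms.
    static String[] parse(String header) {
        Matcher m = HEADER.matcher(header);
        return m.matches() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) {
        // Value on the same line, and value starting on the next line after one space.
        for (String raw : new String[] {"Subject: foo", "Subject:\n foo"}) {
            String[] h = parse(raw);
            System.out.println(h[0] + " -> " + h[1]);
        }
    }
}
```

Both inputs yield the same name and value, so code further down the parsing chain does not need to know which form the archive produced.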
-------------

Commit messages:
 - SKARA-2641

Changes: https://git.openjdk.org/skara/pull/1748/files
  Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1748&range=00
  Issue: https://bugs.openjdk.org/browse/SKARA-2641
  Stats: 52 lines in 2 files changed: 51 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/skara/pull/1748.diff
  Fetch: git fetch https://git.openjdk.org/skara.git pull/1748/head:pull/1748

PR: https://git.openjdk.org/skara/pull/1748

From zsong at openjdk.org Fri Dec 19 17:25:19 2025
From: zsong at openjdk.org (Zhao Song)
Date: Fri, 19 Dec 2025 17:25:19 GMT
Subject: RFR: 2641: Mbox parser fails on headers that start with newline
In-Reply-To:
References:
Message-ID:

On Fri, 19 Dec 2025 15:06:00 GMT, Erik Joelsson wrote:

> It seems that at least for Mailman 3 mbox data, header values will sometimes start on the next line, after a single space, instead of directly after a single space following the `:`. E.g. this:
>
>     Subject:
>      foo
>
> Instead of this:
>
>     Subject: foo
>
> This patch adjusts the regexp used to parse mbox email headers to take this into account, by optionally matching this newline. Added some more tests to verify the various cases.

Looks good.

-------------

Marked as reviewed by zsong (Reviewer).

PR Review: https://git.openjdk.org/skara/pull/1748#pullrequestreview-3599441946

From erikj at openjdk.org Fri Dec 19 19:23:08 2025
From: erikj at openjdk.org (Erik Joelsson)
Date: Fri, 19 Dec 2025 19:23:08 GMT
Subject: Integrated: 2641: Mbox parser fails on headers that start with newline
In-Reply-To:
References:
Message-ID: <063QEJyQhRKwWpPpeBcIfhHhZu5B6BEKW219USZ5AFQ=.25eff816-97b3-4e25-b83c-40dfdd7bdaa3@github.com>

On Fri, 19 Dec 2025 15:06:00 GMT, Erik Joelsson wrote:

> It seems that at least for Mailman 3 mbox data, header values will sometimes start on the next line, after a single space, instead of directly after a single space following the `:`. E.g. this:
>
>     Subject:
>      foo
>
> Instead of this:
>
>     Subject: foo
>
> This patch adjusts the regexp used to parse mbox email headers to take this into account, by optionally matching this newline. Added some more tests to verify the various cases.

This pull request has now been integrated.

Changeset: a177a13d
Author: Erik Joelsson
URL: https://git.openjdk.org/skara/commit/a177a13df82a078740585683e430acce38596c2e
Stats: 52 lines in 2 files changed: 51 ins; 0 del; 1 mod

2641: Mbox parser fails on headers that start with newline

Reviewed-by: zsong

-------------

PR: https://git.openjdk.org/skara/pull/1748