RFR: 2637: Decoding emails from quoted-printable is broken
Zhao Song
zsong at openjdk.org
Fri Dec 12 23:10:03 UTC 2025
On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson <erikj at openjdk.org> wrote:
> During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation doesn't work well: I only took 2-byte UTF-8 characters into account, but we of course also need to handle 3- and 4-byte characters.
>
> Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler, as we just need to translate one encoded triplet (`=XX`) at a time and then convert the resulting byte array using Java's built-in character set decoder.
email/src/main/java/org/openjdk/skara/email/Email.java line 148:
> 146: }
> 147: default : {
> 148: out[j++] = (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16);
There is no bounds check here, so the code always assumes that two hex digits follow the "=". I don't know whether the Mailman server can return malformed data, but if it does, the bot will endlessly process the malformed input.
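
For illustration only, here is a minimal sketch of what a bounds-checked variant could look like. It is not the code in `Email.java`: the class name `QuotedPrintableSketch` and the helpers `decode` and `hexValue` are hypothetical, and passing a malformed "=" through unchanged is an assumed policy (throwing a dedicated exception would be another option).

```java
import java.util.Arrays;

// Hypothetical sketch, not the actual Skara implementation.
final class QuotedPrintableSketch {

    // Returns the value of an ASCII hex digit, or -1 if b is not one.
    private static int hexValue(byte b) {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        return -1;
    }

    // Decodes quoted-printable bytes; the caller converts the result to a
    // String with the charset named in the message headers.
    static byte[] decode(byte[] in) {
        byte[] out = new byte[in.length]; // decoded output is never longer than the input
        int j = 0;
        for (int i = 0; i < in.length; i++) {
            if (in[i] != '=') {
                out[j++] = in[i];
                continue;
            }
            // Soft line break: "=\n" or "=\r\n" produces no output.
            if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 1;
                continue;
            }
            if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 2;
                continue;
            }
            // Encoded triplet "=XX": only decode when both digits exist and are hex.
            if (i + 2 < in.length) {
                int hi = hexValue(in[i + 1]);
                int lo = hexValue(in[i + 2]);
                if (hi >= 0 && lo >= 0) {
                    out[j++] = (byte) ((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            // Assumed policy: keep a malformed "=" as a literal byte.
            out[j++] = in[i];
        }
        return Arrays.copyOf(out, j);
    }
}
```

With this shape, a trailing "=" or a sequence such as "=ZZ" no longer reads past the end of the array or throws `NumberFormatException`; the bytes are copied as-is and decoding continues.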
-------------
PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2615808445