RFR: 2637: Decoding emails from quoted-printable is broken

Zhao Song zsong at openjdk.org
Fri Dec 12 23:10:03 UTC 2025


On Fri, 12 Dec 2025 19:13:15 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

> During my initial implementation of Mailman 3 support, I made an attempt at decoding quoted-printable encoded email bodies. That implementation isn't working that well. I only took 2-byte encoded UTF-8 characters into account, but we of course also need to handle 3- and 4-byte characters.
> 
> Instead of trying to do this with regular expressions, I bit the bullet and started working on a byte array, byte by byte. That actually makes it a lot simpler, as we just need to translate one encoded triplet (`=XX`) at a time and then convert the resulting byte array using Java's built-in character set decoder.

email/src/main/java/org/openjdk/skara/email/Email.java line 148:

> 146:                     }
> 147:                     default : {
> 148:                         out[j++] = (byte) Integer.parseInt("" + (char) in[i++] + (char) in[i], 16);

There is no bounds check here, so the code always assumes that two hex digits follow the "=". I don't know whether the Mailman server can return malformed data, but if it does, the bot will endlessly process the malformed input.
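
As a rough illustration of the kind of guard I have in mind, here is a minimal, self-contained sketch. The class and method names (`QuotedPrintableSketch`, `decode`, `hex`) are hypothetical and not taken from the PR, and passing a malformed `=` sequence through literally is only one possible policy, assumed here for simplicity. It also ignores the other cases handled by the switch in `Email.java`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code from the PR.
final class QuotedPrintableSketch {

    // Returns the value of an ASCII hex digit, or -1 if the byte is not one.
    static int hex(byte b) {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        return -1;
    }

    // Decodes quoted-printable bytes into a UTF-8 string.
    // Assumption: malformed or truncated "=XX" sequences are copied through
    // unchanged instead of throwing.
    static String decode(byte[] in) {
        byte[] out = new byte[in.length]; // decoded output is never longer than the input
        int j = 0;
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b != '=') {
                out[j++] = b;
                continue;
            }
            // Soft line break: "=\n" or "=\r\n" produces no output.
            if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 1;
                continue;
            }
            if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 2;
                continue;
            }
            // Only decode when both following bytes exist and are hex digits.
            if (i + 2 < in.length) {
                int hi = hex(in[i + 1]);
                int lo = hex(in[i + 2]);
                if (hi >= 0 && lo >= 0) {
                    out[j++] = (byte) ((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            // Malformed sequence: keep the '=' literally and move on.
            out[j++] = b;
        }
        return new String(out, 0, j, StandardCharsets.UTF_8);
    }
}
```

With this shape, a truncated input such as `abc=` or `abc=4` can neither index past the end of the array nor raise a `NumberFormatException`. Whether malformed input should be passed through, skipped, or reported is a separate decision.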

-------------

PR Review Comment: https://git.openjdk.org/skara/pull/1747#discussion_r2615808445

