RFR: 1513: Reduce polling of mailing list archives

Erik Joelsson erikj at openjdk.org
Fri Jul 29 17:35:58 UTC 2022

This patch changes the strategy used by the MailmanListReader for polling the mailman archives. The current implementation relies on the server supporting "etag" in order to trust any cached results. Recent testing has shown that etags aren't supported by mail.openjdk.org, which means no results are ever cached and we just keep spamming the mailman archives for the last 12 months over and over.

My new implementation assumes that new emails will only ever appear in the current and previous month's archives. (If this proves to be wrong, I still think that would be rare enough that it doesn't matter, as the full 12 months will be re-evaluated on bot restart.) So for anything older than the previous month, all successful (200) or non-existent (404) results will be cached and never re-queried.
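
As a rough illustration of the caching rule described above (not the actual MailmanListReader code), here is a minimal sketch. The names MonthlyArchiveCache, Result and the fetcher function are hypothetical and only stand in for whatever performs the HTTP request; the sketch assumes that a result is identified by its HTTP status code and the month it belongs to.

```java
import java.time.YearMonth;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the caching policy; not the real Skara classes.
class MonthlyArchiveCache {
    // Hypothetical result of fetching one month's archive.
    record Result(int status, String body) {}

    private final Map<YearMonth, Result> cache = new HashMap<>();
    private final Function<YearMonth, Result> fetcher;

    MonthlyArchiveCache(Function<YearMonth, Result> fetcher) {
        this.fetcher = fetcher;
    }

    Result get(YearMonth month, YearMonth current) {
        // The current and the previous month may still receive new emails,
        // so they are always re-queried and never cached.
        boolean mayChange = !month.isBefore(current.minusMonths(1));
        if (!mayChange) {
            Result cached = cache.get(month);
            if (cached != null) {
                return cached;
            }
        }
        Result result = fetcher.apply(month);
        // Only definitive answers (200 or 404) for older months are kept;
        // anything else (for example a 500) is retried on the next poll.
        if (!mayChange && (result.status() == 200 || result.status() == 404)) {
            cache.put(month, result);
        }
        return result;
    }
}
```

With this policy, polling a 12 month window costs two requests per cycle once the older months have been cached, and the cache is naturally rebuilt from scratch when the process restarts.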

The reason mlbridge needs to query emails for up to a year back is that it needs to piece together conversations and trace them back to the original post in order to correctly identify the PR link associating the conversation with a certain PR. (It's possible that this could be made more efficient in a separate change.)
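
To illustrate why older archives matter, here is a hedged sketch of walking a reply chain back to its first post. It assumes replies are linked through something like the standard In-Reply-To header; the ThreadTracer and Mail names are hypothetical and this is not how mlbridge is actually structured.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; not the real mlbridge code.
class ThreadTracer {
    // inReplyTo is null for the first post of a conversation.
    record Mail(String id, String inReplyTo) {}

    // Follows the reply chain from startId to the original post.
    // Returns empty if a parent is missing from the loaded archives.
    static Optional<Mail> findRoot(String startId, Map<String, Mail> byId) {
        Set<String> seen = new HashSet<>(); // guards against reply cycles
        Mail current = byId.get(startId);
        while (current != null && seen.add(current.id())) {
            if (current.inReplyTo() == null) {
                return Optional.of(current);
            }
            current = byId.get(current.inReplyTo());
        }
        return Optional.empty();
    }
}
```

If the original post lies in a month that was never loaded, the lookup ends without a root, and the conversation cannot be tied to its PR. That is the case the 12 month window is meant to avoid.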

The change itself is rather small, but in order to test it, I needed to expand functionality in the TestMailmanServer. The existing tests did not verify any calls to archives for months other than the current one, so I needed to add support for actually handling that. I also moved the data to in-memory storage in HashMaps instead of writing to temp files.

My only worry here is that I may have messed up the test so that it will start failing on certain days of the year.


Commit messages:
 - SKARA-1513

Changes: https://git.openjdk.org/skara/pull/1343/files
 Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1343&range=00
  Issue: https://bugs.openjdk.org/browse/SKARA-1513
  Stats: 126 lines in 3 files changed: 110 ins; 6 del; 10 mod
  Patch: https://git.openjdk.org/skara/pull/1343.diff
  Fetch: git fetch https://git.openjdk.org/skara pull/1343/head:pull/1343

PR: https://git.openjdk.org/skara/pull/1343
