RFR: 1532: CSRBot is too inefficient

Erik Joelsson erikj at openjdk.org
Fri Aug 12 22:28:12 UTC 2022

This patch is a major redesign of how the CSRBot polls for work. The motivation is described in the bug description. The patch is quite big, so I will try to break down the major changes.

1. The `CSRBot` has been split into `CSRIssueBot` and `CSRPullRequestBot`. The issue bot polls the issue tracker (JBS) for work, while the pull request bot polls the forge (GitHub). The WorkItem part of the old CSRBot has been moved to the new `PullRequestWorkItem`. It now handles a single PR per WorkItem instance (instead of everything in a repo), but has otherwise been left pretty much intact. All actual updates on PRs are (still) handled by this WorkItem. The new `IssueWorkItem` is created when a change to a CSR Issue has been detected (in JBS). It walks the JBS issue and PR links to identify all PRs that could possibly be related to that CSR, and creates `PullRequestWorkItem`s for them.

2. To be able to find PRs from JBS issues, the CSRBot needs to know how to parse the PR comment links. To better share this knowledge, I moved the creation and deletion logic from the NotifyBot to PullRequestUtils (in the forge module), which is available to all bots that need it. I also added a method for parsing the URL from such a comment.

3. To efficiently query for updated CSR issues from Jira, I added two new query methods on IssueProject. A longer explanation of the new polling mechanism for issues can be found in the first bug comment. I've also tried to reasonably explain things in comments in the code.

4. In `JiraProject`, there was a pretty serious bug in all query methods for Issues. The `JiraIssue` constructor takes a `RestRequest` as a parameter. This request object is supposed to be used by `JiraIssue` for generating all the subqueries for the Jira REST API. The problem was that the query methods in `JiraProject` just sent in whatever `RestRequest` they used to find issues, which broke all methods that tried to make further calls through the returned `JiraIssue` to Jira. I fixed this by always supplying a `RestRequest` object with the correct URL when creating a JiraIssue.

5. To be able to call Jira with absolute timestamps in queries, we need to know the timezone of the user we are calling as. Luckily this information was available with a simple call, so I added this to the `JiraHost` class.

6. Just as care needs to be taken when polling for updates to issues in Jira, I applied a similar solution when polling PRs. This required a new method, `HostedRepository.openPullRequestsAfter`. Unfortunately, it doesn't help much on GitHub, where the API doesn't support such a parameter, but it works for GitLab.
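
To illustrate item 2, here is a minimal sketch of a shared helper that both creates a PR link comment and parses the URL back out of it. The class name, the method names and the comment layout (a line starting with `URL: `) are assumptions made for this example, not the actual code in PullRequestUtils.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the PullRequestUtils code from the patch.
class PullRequestCommentSketch {
    // Assumed layout: the PR link sits on its own line, prefixed with "URL: ".
    private static final Pattern URL_LINE =
            Pattern.compile("^URL: (\\S+)$", Pattern.MULTILINE);

    // Builds the body of the issue comment that links to a PR.
    static String createComment(URI prUrl) {
        return "A pull request was submitted for review.\nURL: " + prUrl;
    }

    // Extracts the PR URL from a comment body, or returns empty if the
    // comment does not look like a PR link comment.
    static Optional<URI> parseUrl(String commentBody) {
        Matcher m = URL_LINE.matcher(commentBody);
        return m.find() ? Optional.of(URI.create(m.group(1))) : Optional.empty();
    }
}
```

Keeping creation and parsing next to each other means that a change to the comment layout only needs to be made in one place.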
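
To illustrate item 5, here is a rough sketch of why the user's timezone matters when building a query with an absolute timestamp. It assumes that Jira interprets a JQL date of the form `yyyy-MM-dd HH:mm` in the timezone of the calling user. The class and method names are made up for this example and are not part of `JiraHost` or `JiraProject`.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not code from the patch.
class JqlTimestampSketch {
    // Assumed JQL date format, with minute resolution.
    private static final DateTimeFormatter JQL_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Converts the timestamp to the user's timezone before formatting, so
    // that the server reads the value as the same instant we meant.
    static String updatedAfterClause(ZonedDateTime updatedAfter, ZoneId userZone) {
        String local = updatedAfter.withZoneSameInstant(userZone).format(JQL_FORMAT);
        return "updated >= \"" + local + "\"";
    }
}
```

For example, 22:28:12 UTC on 2022-08-12 with a user timezone of America/Los_Angeles gives `updated >= "2022-08-12 15:28"`. Formatting drops the seconds, so the query can return issues that were already seen, but it does not skip any.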
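
To illustrate item 6, here is a sketch of what a forge without a server side "updated after" parameter can fall back to: fetch all open PRs and filter them locally. The `PullRequest` class below is a stand-in defined for this example, and the inclusive comparison is my assumption about what a safe filter looks like, not necessarily what the patch does.

```java
import java.time.ZonedDateTime;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch for illustration; not the HostedRepository code from the patch.
class PullRequestPollingSketch {
    // Minimal stand-in for a pull request with an id and a last-updated time.
    static final class PullRequest {
        final String id;
        final ZonedDateTime updatedAt;

        PullRequest(String id, ZonedDateTime updatedAt) {
            this.id = id;
            this.updatedAt = updatedAt;
        }
    }

    // Keeps PRs updated at or after the threshold. The comparison is inclusive
    // so that an update at exactly the threshold instant is not lost.
    static List<PullRequest> updatedAtOrAfter(List<PullRequest> open, ZonedDateTime threshold) {
        return open.stream()
                .filter(pr -> !pr.updatedAt.isBefore(threshold))
                .collect(Collectors.toList());
    }
}
```

This still transfers every open PR on each poll, which is why the new method doesn't help much in that case.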

I have modified the tests that are affected by this change so that they still pass. This does verify that changes to issues in the IssueTracker are detected. Perhaps I should add some new tests to specifically verify the polling logic. I have verified the new queries and complete functionality manually by running the bot against the playground repo and bugs-stage.


Commit messages:
 - SKARA-1532

Changes: https://git.openjdk.org/skara/pull/1357/files
 Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1357&range=00
  Issue: https://bugs.openjdk.org/browse/SKARA-1532
  Stats: 1099 lines in 21 files changed: 798 ins; 256 del; 45 mod
  Patch: https://git.openjdk.org/skara/pull/1357.diff
  Fetch: git fetch https://git.openjdk.org/skara pull/1357/head:pull/1357

PR: https://git.openjdk.org/skara/pull/1357
