RFR: 1532: CSRBot is too inefficient

Guoxiong Li gli at openjdk.org
Mon Aug 15 14:04:30 UTC 2022

On Fri, 12 Aug 2022 22:19:58 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

> This patch is a pretty major redesign of how the CSRBot polls for work. The motivation for this is described in the bug description. It's quite big, so I will try to break down the major changes. 
> 1. The `CSRBot` has been split into `CSRIssueBot` and `CSRPullRequestBot`. The issue bot polls the issue tracker (JBS) for work, while the pull request bot polls the forge (github). The WorkItem part of the old CSRBot has been moved to the new `PullRequestWorkItem`, and except for now only handling one single PR per WorkItem instance (instead of everything in a repo), has been left pretty much intact. All actual updates on PRs are (still) handled by this WorkItem. The new `IssueWorkItem` is created when a change to a CSR Issue has been detected (in JBS). It walks the JBS issue and PR links to identify all PRs that could possibly be related to that CSR, and creates `PullRequestWorkItem`s for them.
> 2. To be able to find PRs from JBS issues, the CSRBot needs to know how to parse the PR comment links. To better share this knowledge, I moved the creation and deletion logic from the NotifyBot to PullRequestUtils (in the forge module), which is available to all bots that need it. I also added a method for parsing the URL from such a comment.
> 3. To efficiently query for updated CSR issues from Jira, I added two new query methods on IssueProject. A longer explanation of the new polling mechanism for issues can be found in the first bug comment. I've also tried to reasonably explain things in comments in the code.
> 4. In `JiraProject`, there was a pretty serious bug in all query methods for Issues. The `JiraIssue` constructor takes a `RestRequest` as a parameter. This request object is supposed to be used by `JiraIssue` for generating all the subqueries for the Jira REST API. The problem was that the query methods in `JiraProject` just sent in whatever `RestRequest` they used to find issues, which broke all methods that tried to make further calls through the returned `JiraIssue` to Jira. I fixed this by always supplying a `RestRequest` object with the correct URL when creating a JiraIssue.
> 5. To be able to call Jira with absolute timestamps in queries, we need to know the timezone of the user we are calling as. Luckily this information was available with a simple call, so I added this to the `JiraHost` class.
> 6. Similar to how care needs to be taken when polling for updates to issues in Jira, I applied a similar solution when polling PRs. This required a new method `HostedRepository.openPullRequestsAfter`. Unfortunately, it doesn't help much on Github, where the API doesn't support such a parameter, but it works for Gitlab.
> I have modified the tests that are affected by this change so that they still pass. This does verify that changes to issues in the IssueTracker are detected. Perhaps I should add some new tests to specifically verify the polling logic. I have verified the new queries and complete functionality manually by running the bot against the playground repo and bugs-stage.

I don't really know whether it is a good idea to has both CSRIssueBot and CSRPullRequestBot. It seems that one CSRBot contains two work items, `IssueWorkItem` and `PullRequestWorkItem`, works well too.

Or we can seperate the `CSRIssueBot` to a new module, like `IssueBot`, to be reused in the future. But it may exceed the range of this patch.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/CSRBotFactory.java line 33:

> 31: import java.util.logging.Logger;
> 32: import org.openjdk.skara.forge.HostedRepository;
> 33: import org.openjdk.skara.issuetracker.IssueProject;

The import statements can be polished. The same/similar package could be put nearly.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/CSRIssueBot.java line 1:

> 1: package org.openjdk.skara.bots.csr;

Need a copyright here.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/CSRIssueBot.java line 56:

> 54:                 issueUpdatedAt.put(issue.id(), issue.updatedAt());
> 55:                 log.fine("Setting lastUpdatedAt from last updated issue " + issue.id() + " updated at " + lastUpdatedAt);
> 56:             } else {

When setting the `lastUpdateAt` to the date of the last updated issue firstly, the method `issueProject.csrIssues` can not get the previous csr issues at the beginning. Is it a intent?

bots/csr/src/main/java/org/openjdk/skara/bots/csr/CSRIssueBot.java line 73:

> 71:             if (issue.updatedAt().isAfter(lastUpdatedAt)) {
> 72:                 lastUpdatedAt = issue.updatedAt();
> 73:             }

A race condition may let the bot ignore some updated issues. Considering the following situation:

step 1: the method `issueProject.csrIssues` get the issue-1 and issue-2 at time 00:00:001
step 2: the issue-3 is updated at 00:00:002 and the issue-2 is updated at 00:00:003
step 3: at time 00:00:004, the statement `lastUpdatedAt = issue.updatedAt();` is run and updates the lastUpdatedAt to 00:00:003, then in the next round, the issue-3 will be ignored.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/CSRPullRequestBot.java line 76:

> 74:             if (lastUpdatedAt == null || pr.updatedAt().isAfter(lastUpdatedAt)) {
> 75:                 lastUpdatedAt = pr.updatedAt();
> 76:             }

The logic is same as `CSRIssueBot`. The same race condition seems also exist.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/IssueWorkItem.java line 1:

> 1: package org.openjdk.skara.bots.csr;

Need a copyright here.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/IssueWorkItem.java line 51:

> 49:                 .filter(l -> l.relationship().isPresent() && "csr of".equals(l.relationship().get())).findAny();
> 50:         var issue = link.flatMap(Link::issue);
> 51:         var mainIssue = issue.flatMap(Backports::findMainIssue);

These two `flatMap` methods seem can be put into the first stream statement because the variables `link` and `issue` never be used twice.

bots/csr/src/main/java/org/openjdk/skara/bots/csr/PullRequestWorkItem.java line 1:

> 1: package org.openjdk.skara.bots.csr;

Need a copyright here.

forge/src/test/java/org/openjdk/skara/forge/PullRequestUtilsTests.java line 1:

> 1: package org.openjdk.skara.forge;

Need a copyright here.


Changes requested by gli (Committer).

PR: https://git.openjdk.org/skara/pull/1357

More information about the skara-dev mailing list