RFR: 1606: PullRequestPoller always processes the last updated MR for GitLab
Erik Joelsson
erikj at openjdk.org
Thu Sep 22 21:29:02 UTC 2022
The new PullRequestPoller from [SKARA-1565](https://bugs.openjdk.org/browse/SKARA-1565) is inefficient in combination with GitLab. In the most common case, when no MRs have been updated, you would expect it to only do one GET query and be done. Instead, that initial query will always return the last updated MR, and then continue to fetch all the metadata for it, which adds up to a lot of time.
This is made even worse by having to fetch comments and reviews for every MR to be able to compare them for potential updates. In GitLab, both comments and reviews are stored in the same set of "notes" for a merge request. Since we are only fetching them for comparing between different snapshots, we can make this faster by just fetching all notes once (which will also increase accuracy as there are other kinds of notes too, which we probably should check for updates too).
The repeated refetching of the last updated MR is basically the same problem as the IssuePoller is already solving. The resolution for timestamp based queries is limited (1s on GitLab) and the query is inclusive. In contrast, the current implementation for GitLabRepository treats the timestamp as exclusive.
This patch does the following:
1. Change all timestamp based pull request and issue queries (including for test implementations) to be inclusive, for consistency.
2. Implement the same kind of "positive" padding to the updatedAfter parameter in PullRequestPoller as is already present in IssuePoller. I'm also modifying tests to actually verify this behavior by exposing some of the internal data of the poller to the test.
3. Add `PullRequest::comparisonSnapshot` which returns an `Object` that represents all data that needs to be considered when evaluating if a snapshot is equal to another. In GitLabMergeRequest, this is cached lazily to avoid repeated remote calls.
While working on this, I've realized that IssuePoller and PullRequestPoller are basically doing the exact same thing now, with pretty small variations. Ideally we should combine these into a common super class, but I'm leaving that for a separate change to make the current changes easier to follow.
I'm going to run this change in "staging" for a bit to see how it behaves.
-------------
Commit messages:
- SKARA-1606
Changes: https://git.openjdk.org/skara/pull/1380/files
Webrev: https://webrevs.openjdk.org/?repo=skara&pr=1380&range=00
Issue: https://bugs.openjdk.org/browse/SKARA-1606
Stats: 320 lines in 17 files changed: 140 ins; 131 del; 49 mod
Patch: https://git.openjdk.org/skara/pull/1380.diff
Fetch: git fetch https://git.openjdk.org/skara pull/1380/head:pull/1380
PR: https://git.openjdk.org/skara/pull/1380
More information about the skara-dev
mailing list