RFR: 1606: PullRequestPoller always processes the last updated MR for GitLab [v4]

Erik Joelsson erikj at openjdk.org
Tue Oct 4 16:32:21 UTC 2022


On Fri, 23 Sep 2022 19:20:11 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

>> The new PullRequestPoller from [SKARA-1565](https://bugs.openjdk.org/browse/SKARA-1565) is inefficient in combination with GitLab. In the most common case, when no MRs have been updated, you would expect it to only do one GET query and be done. Instead, that initial query will always return the last updated MR, and then continue to fetch all the metadata for it, which adds up to a lot of time. 
>> 
>> This is made even worse by having to fetch comments and reviews for every MR to be able to compare them for potential updates. In GitLab, both comments and reviews are stored in the same set of "notes" for a merge request. Since we are only fetching them for comparing between different snapshots, we can make this faster by just fetching all notes once (which will also increase accuracy as there are other kinds of notes too, which we probably should check for updates too).
>> 
>> The repeated refetching of the last updated MR is basically the same problem as the IssuePoller is already solving. The resolution for timestamp based queries is limited (1s on GitLab) and the query is inclusive. In contrast, the current implementation for GitLabRepository treats the timestamp as exclusive.
>> 
>> This patch does the following:
>> 
>> 1. Change all timestamp based pull request and issue queries (including for test implementations) to be inclusive, for consistency.
>> 2. Implement the same kind of "positive" padding to the updatedAfter parameter in PullRequestPoller as is already present in IssuePoller. I'm also modifying tests to actually verify this behavior by exposing some of the internal data of the poller to the test.
>> 3. Add `PullRequest::comparisonSnapshot` which returns an `Object` that represents all data that needs to be considered when evaluating if a snapshot is equal to another. In GitLabMergeRequest, this is cached lazily to avoid repeated remote calls.
>> 
>> While working on this, I've realized that IssuePoller and PullRequestPoller are basically doing the exact same thing now, with pretty small variations. Ideally we should combine these into a common super class, but I'm leaving that for a separate change to make the current changes easier to follow.
>> 
>> I'm going to run this change in "staging" for a bit to see how it behaves.
>
> Erik Joelsson has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Applying the same fixes to IssuePoller

This change is currently blocking deployment of new changes to Skara bots, unless we back out [SKARA-1590](https://bugs.openjdk.org/browse/SKARA-1590). This is because the PullRequestPoller isn't behaving well enough to actually be used yet. It would certainly be better to get this fix in however, so we can move forward with more efficient polling.

-------------

PR: https://git.openjdk.org/skara/pull/1380


More information about the skara-dev mailing list