RFR: 1962: JiraProject#queryIssues may return duplicate issues

Tue Oct 24 21:33:31 UTC 2023

On Tue, 24 Oct 2023 17:46:39 GMT, Zhao Song <zsong at openjdk.org> wrote:

> Since [SKARA-1912](https://bugs.openjdk.org/browse/SKARA-1912) was deployed to production, we have noticed duplicate key exceptions in `IssueProjectPoller#updatedIssues`.
> 
> After investigation, we found that `JiraProject#queryIssues` may return duplicate issues and I think the pagination in this method is the root cause.
> 
> The problem is that issues are sometimes updated during pagination. For example, the default page size is 50, and suppose issue#50 is the last issue on the first page. If another issue that is not on the first page gets updated before the second page is queried, issue#50 will also be the first issue on the second page. The method may therefore return duplicate issues.
> 
> Erik's solution is right. We just need to order by updatedTime DESC and remove the duplicate issues. Ordering by updatedTime DESC ensures that even if some issues are updated during pagination, they will be processed in a future round.

issuetracker/src/main/java/org/openjdk/skara/issuetracker/jira/JiraProject.java line 510:

> 508: 
> 509:     private List<IssueTrackerIssue> queryIssues(String jql) {
> 510:         var ret = new HashMap<String, IssueTrackerIssue>();

I think we should try to preserve the order in which we received the issues. We can do that if we use a `LinkedHashMap` here. It would also make more sense to me to use ascending order. Then we will include any concurrently updated issue in this call instead of waiting for the next call. I also think it makes sense for the bot to process issues in ascending order, so that the issue updated first is processed first.
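
A rough sketch of the `LinkedHashMap` idea, only to illustrate the ordering behavior. The `Issue` record, the `mergePages` helper, and the `TEST-n` ids are hypothetical stand-ins, not Skara code; the real method would fill the map while paging through the Jira results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QueryIssuesSketch {
    // Hypothetical stand-in for IssueTrackerIssue: just an id and a title.
    public record Issue(String id, String title) {}

    // Merges pages in the order received, keeping one entry per issue id.
    public static List<Issue> mergePages(List<List<Issue>> pages) {
        Map<String, Issue> ret = new LinkedHashMap<>();
        for (var page : pages) {
            for (var issue : page) {
                // put() on an existing key replaces the value but keeps the
                // position at which the key was first inserted.
                ret.put(issue.id(), issue);
            }
        }
        return new ArrayList<>(ret.values());
    }

    public static void main(String[] args) {
        var page1 = List.of(new Issue("TEST-1", "a"), new Issue("TEST-2", "b"));
        // TEST-2 appears again because the result set shifted between page requests.
        var page2 = List.of(new Issue("TEST-2", "b"), new Issue("TEST-3", "c"));
        for (var issue : mergePages(List.of(page1, page2))) {
            System.out.println(issue.id());
        }
    }
}
```

With a plain `HashMap` the duplicate would still be dropped, but the iteration order of `values()` would no longer be guaranteed to match the order in which the issues were received.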

-------------

PR Review Comment: https://git.openjdk.org/skara/pull/1575#discussion_r1370838440