RFR: 2340: CommitCommentsWorkItem executing in GitLab takes a lot of time [v2]
Erik Joelsson
erikj at openjdk.org
Mon Aug 12 22:12:29 UTC 2024
On Mon, 12 Aug 2024 20:06:58 GMT, Zhao Song <zsong at openjdk.org> wrote:
>> Currently, if a user comments under a commit named "Merge" in GitLab, the CommitCommentsWorkItem will take about 8 hours to complete.
>> After investigation, I think there is a bug in GitLabRepository#getCommitTitleToCommitsMap
>>
>> For GitLab repos, we need to build a hash map for mapping commit title to commits. The implementation of the map is Map<String, LinkedHashSet<Hash>>. We use LinkedHashSet here because we want the bot to check latest commits first.
>>
>> If the map has already been built, then the bot will query new commits in the repo and adds the commits into the map. However, when adding the new commit hash to the map, the bot add it to the end of the LinkedHashSet. So new commits will be checked very late.
>>
>> For commit titles like "Merge", sometimes, there can be around 50K commits in a repo. If the latest "Merge" commit is added to the tail of the LinkedHashSet, the bot must check all previous commits first, which takes a lot of time.
>
> Zhao Song has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
> SKARA-2340
forge/src/main/java/org/openjdk/skara/forge/gitlab/GitLabRepository.java line 645:
> 643: var hash = new Hash(commit.get("id").asString());
> 644: var title = commit.get("title").asString();
> 645: tempCommitTitleToCommits.computeIfAbsent(title, t -> new LinkedHashSet<>()).add(hash);
I read the bug description and I think I understand the problem now. The initial creation of the map iterates commits from newest to oldest, while this loop iterates from oldest to newest.
Instead of creating a temporary map, couldn't we just call `LinkedHashSet::addFirst` to get the new commits added at the front in the correct order?
-------------
PR Review Comment: https://git.openjdk.org/skara/pull/1682#discussion_r1714418346
More information about the skara-dev
mailing list