RFR: 2065: Update PR labels when new files are touched [v4]

Fri Aug 29 13:45:46 UTC 2025

On Thu, 28 Aug 2025 23:04:05 GMT, Zhao Song <zsong at openjdk.org> wrote:

>> This patch makes the PR bot update PR labels when new files are touched.
>> The main idea is from Erik. For a new PR, the bot first runs a LabelerWorkItem to auto label the PR, and stores the commit hash in a comment.
>> 
>> With this patch, here are some new behaviors of the bot:
>> (1) If the user did not issue a manual label command before auto labeling, LabelerWorkItem does the initial auto labeling and stores the commit hash in the comment.
>> (2) If the user issued a manual label command before auto labeling, the initial auto labeling is skipped, but the bot still posts a comment and stores the hash in it.
>> (3) If the user pushes one or more new commits to a PR that has already been auto labeled, the bot evaluates the diff between the stored hash and the current head, adds new labels or upgrades labels to group labels, and finally updates the stored hash in the comment.
>> (4) If the user force pushes to a PR that has already been auto labeled, the bot evaluates the diff between baseHash and the current head.
>> (5) If the user issues a command to add a label to the PR (or adds the label via the forge UI), the bot checks whether the labels can be upgraded to group labels.
>> 
>> The one side effect of this feature I can imagine is that a user decides a file is not related to a component and removes the label manually, but every time they touch the file again, the label is added back.
>
> Zhao Song has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review comment

What Magnus said got me thinking. Since we are now redoing auto labeling every time the code changes, I think the concept of having the bot class track which PRs have been auto labeled doesn't make much sense. A PR is either up to date with the labeler, or it isn't. There shouldn't be anything special about the first run really. In CheckRun, to check if the PR has been auto labeled or not, it can check for the existence of an auto label comment. In LabelerWorkItem, I think the code paths could be more unified to make it clear that there isn't a big difference.
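
To illustrate the check I have in mind for CheckRun, here is a minimal sketch. It assumes the auto label comment embeds the commit hash in a hidden marker; the marker text, the class name `AutoLabelMarker`, and both method names are hypothetical and not taken from the Skara code base. Comments are modeled as plain strings to keep the example self-contained.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Skara, names and marker format are assumptions.
class AutoLabelMarker {
    // Assumed marker format embedded in the auto label comment.
    private static final Pattern MARKER =
            Pattern.compile("<!-- auto-labeled: ([0-9a-f]{7,40}) -->");

    // Returns the hash stored in the last auto label marker, if any comment has one.
    public static Optional<String> lastLabeledHash(List<String> commentBodies) {
        Optional<String> result = Optional.empty();
        for (String body : commentBodies) {
            Matcher m = MARKER.matcher(body);
            while (m.find()) {
                result = Optional.of(m.group(1));
            }
        }
        return result;
    }

    // A PR is up to date with the labeler only if a marker exists and matches the head.
    public static boolean isUpToDate(List<String> commentBodies, String headHash) {
        return lastLabeledHash(commentBodies).map(headHash::equals).orElse(false);
    }
}
```

With something like this, "never labeled" and "labeled at an older commit" are the same state from the caller's point of view: not up to date.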

I'm thinking that when running the LabelerWorkItem again due to a new commit, if nothing new is found, we just silently update the commit hash in the existing (last) auto label comment. If, however, we find that a new label needs to be added, we may want to create a new comment to make it clearer that a new action was taken, without erasing or hiding the history of the previous comment. That would make it easier to incorporate the kind of messaging Magnus is asking for.
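
Roughly, the unified flow could look like the sketch below. This is only an illustration under assumptions: the marker format, the class `LabelerRun`, and the comment wording are all hypothetical, and the PR's comments and labels are modeled as in-memory collections rather than the real forge API.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not Skara code, names and marker format are assumptions.
class LabelerRun {
    private static final Pattern MARKER = Pattern.compile("<!-- auto-labeled: [0-9a-f]+ -->");

    static String marker(String hash) {
        return "<!-- auto-labeled: " + hash + " -->";
    }

    // Index of the last comment carrying a marker, or -1 if there is none.
    static int lastMarkerIndex(List<String> comments) {
        for (int i = comments.size() - 1; i >= 0; i--) {
            if (MARKER.matcher(comments.get(i)).find()) {
                return i;
            }
        }
        return -1;
    }

    // One code path for the first and all later runs. Returns the labels that were added.
    public static Set<String> run(List<String> comments, Set<String> currentLabels,
                                  Set<String> suggested, String headHash) {
        Set<String> added = new TreeSet<>(suggested);
        added.removeAll(currentLabels);
        int last = lastMarkerIndex(comments);
        if (added.isEmpty() && last >= 0) {
            // Nothing new: silently move the stored hash forward in the existing comment.
            String updated = MARKER.matcher(comments.get(last))
                    .replaceAll(Matcher.quoteReplacement(marker(headHash)));
            comments.set(last, updated);
        } else {
            // New labels (or no marker yet): post a new comment and keep the old ones.
            currentLabels.addAll(added);
            String text = added.isEmpty()
                    ? "No labels were added automatically."
                    : "Added labels: " + String.join(", ", added);
            comments.add(text + "\n" + marker(headHash));
        }
        return added;
    }
}
```

The first run is just the case where no marker exists yet, so it falls into the same branch as "new labels found".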

For the case of running LabelerWorkItem after executing a LabelCommand, just to collapse groups, I'm starting to think that it would be better to have the LabelCommand collapse groups automatically. The implementation might be shared, but by having it happen through the LabelCommand, we don't need to worry about updating the auto label comments, or racing against bot restarts. In that case the LabelCommand reply would include a message about the groups being combined.
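
As a sketch of what the shared collapsing step might look like: the version below assumes a group label replaces its members only when all of them are present, which may not match the actual upgrade rule in the label configuration. The class `GroupCollapser` and the shape of the `groups` map (group label to member labels) are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the collapse rule and names are assumptions, not Skara code.
class GroupCollapser {
    // Replaces the members of every fully present group with the group label.
    public static Set<String> collapse(Set<String> labels, Map<String, Set<String>> groups) {
        Set<String> result = new TreeSet<>(labels);
        for (Map.Entry<String, Set<String>> group : groups.entrySet()) {
            Set<String> members = group.getValue();
            // Check against the original labels so the outcome does not depend on map order.
            if (!members.isEmpty() && labels.containsAll(members)) {
                result.removeAll(members);
                result.add(group.getKey());
            }
        }
        return result;
    }
}
```

Both LabelCommand and LabelerWorkItem could call the same method, and the LabelCommand reply could report the difference between the input and the returned set.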

I might have missed something, but these are my current thoughts.

-------------

PR Comment: https://git.openjdk.org/skara/pull/1735#issuecomment-3237094437