Request for assistance: Verify and update mailing list filter rules for jdk/jdk in preparation for Skara transition
Erik Helin
erik.helin at oracle.com
Fri Aug 28 14:20:08 UTC 2020
Hi all,
Thanks for all the feedback, based on this discussion we have been able
to make a number of improvements in this area!
But to begin with, it would probably have been good to clarify the
intended use of this list of rules a bit more clearly: The intent of
these rules is to create a conservative “minimal” initial set of lists
that quite likely would want to be included when sending out the review
request. A good example of this would be the fairly well known rule that
“all Makefile changes should be reviewed on build-dev” which is readily
encoded in this format as “build-dev”: “make/“. However, files that are
shared between several lists and where you select lists based on what
actually changed should not be on here. You may for example have noticed
the absence of the various ProblemList.txt files - they certainly fit
that criteria. But these are the two clear opposites, there are of
course many places in the jdk source tree where this is much more of a
gray area.
It is also quite likely that the original list that we posted here was
dauntingly verbose - it was our hope that it would be reasonably
straightforward for a subject matter expert to quickly delete a lot of
the “noise”. Based on the change suggestions received so far, it would
appear to not have been that unreasonable: the latest version has
dropped over a hundred lines while still improving accuracy by a fair
bit. Ideally there should probably only be a handful of rules for any
given list, which is already true for most, but the rules for lists like
“core-libs-dev” are certainly not there yet.
We should also mention how the initial list was created - it was *not*
crated manually by us :) Instead we parsed the RFR e-mails for thousands
of commits pushed to the jdk/jdk repository. This allowed us to create a
program that could see the files that were changed by a commit and the
lists that that the RFR e-mail were sent to. We then manually curated
the list, but realized that domain experts would make a better job of
the curation (hence the "request for assitance" e-mail).
We have also validated our approach by comparing the results of the
rules engine for selecting mailing lists with the last 30 days of RFR
e-mails for patches targeting the jdk/jdk repository. So far it seems
like the rules engine perform equally or better than humans at selecting
mailing lists, but it could also be that we are missing some nuances in
RFR e-mails that look like they have forgotten to CC at least one
mailing list.
Still, if you are an experienced contributor, you may not want to have
these suggestions added only to have to replace them manually with the
correct lists. To help with that, we have adjusted the “git pr create”
command-line tool to display the suggestions before the PR is created,
and provide the option to override them. If you create a PR from the
command-line you can now use the --cc flag:
$ git pr create --cc=hotspot-dev,build-dev
Furthermore, the "git pr create" command will now ask the user to verify
the mailing lists before creating the pull request:
$ git pr create
The following mailing lists will be CC:d for the "RFR" e-mail:
- build-dev at openjdk.java.net
- hotspot-dev at openjdk.java.net
Do you want to proceed with this mailing list selection? [Y/n]:
A user can opt-in to always use the inferred mailing lists by speciyfing
--cc=auto. This means that it is now *opt-in* to use the mailing lists
inferred from the patch when creating a PR from the command-line.
When creating a PR from the GitHub web UI, it’s also possible to
override these suggestions by adding a “/cc” command at the end of the
PR body. There is also a window of a few minutes to issue a command to
edit the suggestions after they have been displayed, but before any
email is sent to any list. The semantics of the “/cc” command have also
been changed, to ensure that it always overrides the automatic choices.
We would also like to reply to a few of the additional questions raised:
Q: Shouldn't this file with the rules be placed in the jdk/jdk
repository?
A: Yes, we think so too, but we started out by having it in the skara
repository. The Skara tooling does not care where the file resides,
we can easily update the tooling once/if we move the file into the
jdk/jdk repository.
Q: How will the file be maintained?
A: We imagine that this will be a collective effort once the file has
landed in the jdk/jdk repository. If a contributor makes a large
change that moves a lot of files around, then it seems fitting to
also consider if the rules for choosing mailing lists should perhaps
be updated.
Q: Why not use the CODEOWNERS file format that GitHub already supports?
A: This is something we considered but rejected for several reasons. The
first and primary reason is that the rule file is not meant to
convey ownership, it is merely stating the mailing lists that should
be notified when certain files change. Secondly we will run into
problems with GitHub _also_ sending e-mails if we use the CODEOWNERS
format, something we would not want. Lastly the rules file also
support specifying mailing lists groups that should be coalesced into
a single mailing list.
Q: Doesn't other OpenJDK projects that already have transitioned have
this problem?
A: No, for all other OpenJDK projects there is only one "primary"
mailing list. For example skara-dev for Skara, amber-dev for Amber
and so on.
Q: Do we have too many mailing lists?
A: Perhaps, but it is not our intent to propose any changes to the
number of mailing lists we have.
Q: Will the rules engine will be 100% correct every time?
A: No, but neither are humans when they choose mailing lists. The goal
of the rules engine, as stated at the start of this e-mail, is to
choose a reasonable minimal set of mailing lists that a patch should
be sent to. A contributor can always override the choice via the CLI
or use the `/cc` pull request command.
Q: Is the automatic selection of mailing lists opt-in or opt-out?
A: It is currently opt-in if you create a PR from the command-line and
opt-out if you create a PR via a web browser. We believe that many
experienced contributors will use the CLI tools and there have an
easy way to control the selected mailing lists. In constrast many
newcomers are likely to create pull requests via a web browser. As
stated above, there is a "cooldown" period after a PR has been
created when the PR author has time to fine-tune the selected mailing
lists regardless of how the PR was created.
Q: Shouldn't the information about to which mailing list to send a patch
be on the wiki instead?
A: We strongly prefer to have it in a format and place where the
information can be processed by programs. This way we can use the
information to automate and improve upon our workflows.
Again, thanks everyone for the great feedback and to everyone who helped
out tweaking the rules!
Best regards,
Erik and Robin
On 8/27/20 12:34 PM, Robin Westberg wrote:
> Hi all,
>
> As part of transitioning the jdk/jdk repository to Git with project Skara, we have created a set of rules that determine which mailing list(s) should be the default recipient of a review request, depending on which files are changed. The initial version of these rules was created by looking at historical commits and matching them with existing mailing list review threads. This has produced a reasonable basis, but it can most certainly be made better with some additional manual curating.
>
> Therefore, it would be very helpful if people with good knowledge of the various subsystems and source files that make up the JDK would be willing to take a look at these rules, and also suggest improvements where needed. In addition, lists like [1] would also be very useful insofar they exist.
>
> The current version of these rules is located in a JSON file in the Skara repository at [2]. In order to check the validity of the rules, there is also a CLI tool that can be used to apply it to either a subset of files or existing commits and produce a suggestion [3] [4]. If you are interested in helping out with curating these rules, these are the steps to get started:
>
> 1. Install the Skara CLI tools: https://wiki.openjdk.java.net/display/SKARA/CLI+Tools
> 2. Locate a suitable clone of the jdk/jdk repository (either Mercurial or Git is fine)
> 3. Change (cd) to the root of your jdk/jdk repository
> 3. Run the “debug mlrules” command on your favorite subset of files, for example like this (use the actual location of jdk.json on your system):
>
> $ git skara debug mlrules -v ~/git/skara/config/mailinglist/rules/jdk.json src/hotspot/share/jfr/
> Reading rules file...
> src/hotspot/share/jfr/dcmd: [hotspot-jfr-dev]
> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.cpp: [hotspot-jfr-dev]
> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.hpp: [hotspot-jfr-dev]
> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp: [hotspot-jfr-dev, serviceability-dev]
> …
> Final list suggestion is: [hotspot-jfr-dev, serviceability-dev]
>
> The command accepts multiple folder and/or file names to make it possible to simulate a potential change to a given set of files:
>
> $ git skara debug mlrules -v ../skara/config/mailinglist/rules/jdk.json doc/ide.md src/hotspot/cpu/x86/x86.ad src/hotspot/os/linux/gc/z/zNUMA_linux.cpp
> Reading rules file...
> doc: [build-dev]
> src/hotspot/cpu: [hotspot-compiler-dev]
> src/hotspot/os: [hotspot-runtime-dev, hotspot-gc-dev]
>
> Combined list suggestion: [build-dev, hotspot-compiler-dev, hotspot-gc-dev, hotspot-runtime-dev]
> Final list suggestion is: [build-dev, hotspot-dev]
>
> If the suggestions look fine, all is well. If not, you are welcome to propose a change to the rules, preferably by editing the jdk.json file [6] and creating a pull request towards the Skara project as described in [5]. Coincidentally, this is the same way that future changes to the jdk/jdk repository will be integrated, so this exercise could also serve as a way of getting started with Git / Skara!
>
> Best regards,
> Robin
>
> [1] https://openjdk.java.net/groups/2d/2dawtfiles.html
> [2] https://git.openjdk.java.net/skara/blob/master/config/mailinglist/rules/jdk.json
>
> [3]
> $ git skara debug mlrules -v ~/git/skara/config/mailinglist/rules/jdk.json src/java.desktop/unix/native
> Reading rules file...
> src/java.desktop/unix/native/common: [2d-dev]
> src/java.desktop/unix/native/include: []
> src/java.desktop/unix/native/libawt: [2d-dev]
> src/java.desktop/unix/native/libawt_headless: [awt-dev]
> src/java.desktop/unix/native/libawt_xawt: [awt-dev]
> src/java.desktop/unix/native/libfontmanager: [2d-dev]
> src/java.desktop/unix/native/libjawt: [awt-dev]
> src/java.desktop/unix/native/libsplashscreen: [awt-dev]
>
> Combined list suggestion: [2d-dev, awt-dev]
> Final list suggestion is: [2d-dev, awt-dev]
>
> [4]
> $ git skara debug mlrules -d 30 -v ~/git/skara/config/mailinglist/rules/jdk.json .
> ...
> ✅ [2d-dev, awt-dev, serviceability-dev] c32923e0: 8240487: Cleanup whitespace in .cc, .hh, .m, and .mm files
> ❌ [awt-dev] 7f74c7dd: 8212226: SurfaceManager throws "Invalid Image variant" for MultiResolutionImage (Windows)
> Suggested lists: [2d-dev, awt-dev]
> Rules matching unmentioned lists [2d-dev]:
> src/java.desktop/share/classes/sun/java2d/SunGraphics2D.java - [2d-dev: ^src/java.desktop/share/classes/sun/java2d/]
> …
>
> [5] https://wiki.openjdk.java.net/display/SKARA/Skara#Skara-Workflow
>
> [6]
> The rules are made up of sets of regular expressions for the various mailing lists that are used for reviewing changes going into the JDK. If any rule matches, that mailing list will get a copy of the review request email. For directories containing files that belong to different subsystems, it’s usually a good idea to write the rules in a complementary fashion if possible, so that anything not explicitly mentioned gets a reasonable default. As an example, see these rules for a subset of awt / 2d / i18n files:
>
> “awt-dev”: "src/java.desktop/share/classes/sun/awt/(?!font|sunhints|color/|font/|geom/|im/|image/|print/)”
> “2d-dev”: "src/java.desktop/share/classes/sun/awt/(font|sunhints|color/|font/|geom/|image/|print/)"
> “I18n-dev”: "src/java.desktop/share/classes/sun/awt/im/“
>
> In this example, anything not explicitly indicated as belonging to either 2d-dev or i18-dev will be matched by awt-dev.
>
More information about the jdk-dev
mailing list