Request for assistance: Verify and update mailing list filter rules for jdk/jdk in preparation for Skara transition
David Holmes
david.holmes at oracle.com
Mon Aug 31 05:17:45 UTC 2020
Hi Erik,
This updated process seems far better to me - thanks.
David
On 29/08/2020 12:20 am, Erik Helin wrote:
> Hi all,
>
> Thanks for all the feedback, based on this discussion we have been able
> to make a number of improvements in this area!
>
> But to begin with, it would probably have been good to clarify the
> intended use of this list of rules a bit more clearly: The intent of
> these rules is to create a conservative “minimal” initial set of lists
> that quite likely would want to be included when sending out the review
> request. A good example of this would be the fairly well known rule that
> “all Makefile changes should be reviewed on build-dev” which is readily
> encoded in this format as “build-dev”: “make/“. However, files that are
> shared between several lists and where you select lists based on what
> actually changed should not be on here. You may for example have noticed
> the absence of the various ProblemList.txt files - they certainly fit
> that criteria. But these are the two clear opposites, there are of
> course many places in the jdk source tree where this is much more of a
> gray area.
>
> It is also quite likely that the original list that we posted here was
> dauntingly verbose - it was our hope that it would be reasonably
> straightforward for a subject matter expert to quickly delete a lot of
> the “noise”. Based on the change suggestions received so far, it would
> appear to not have been that unreasonable: the latest version has
> dropped over a hundred lines while still improving accuracy by a fair
> bit. Ideally there should probably only be a handful of rules for any
> given list, which is already true for most, but the rules for lists like
> “core-libs-dev” are certainly not there yet.
>
> We should also mention how the initial list was created - it was *not*
> crated manually by us :) Instead we parsed the RFR e-mails for thousands
> of commits pushed to the jdk/jdk repository. This allowed us to create a
> program that could see the files that were changed by a commit and the
> lists that that the RFR e-mail were sent to. We then manually curated
> the list, but realized that domain experts would make a better job of
> the curation (hence the "request for assitance" e-mail).
>
> We have also validated our approach by comparing the results of the
> rules engine for selecting mailing lists with the last 30 days of RFR
> e-mails for patches targeting the jdk/jdk repository. So far it seems
> like the rules engine perform equally or better than humans at selecting
> mailing lists, but it could also be that we are missing some nuances in
> RFR e-mails that look like they have forgotten to CC at least one
> mailing list.
>
> Still, if you are an experienced contributor, you may not want to have
> these suggestions added only to have to replace them manually with the
> correct lists. To help with that, we have adjusted the “git pr create”
> command-line tool to display the suggestions before the PR is created,
> and provide the option to override them. If you create a PR from the
> command-line you can now use the --cc flag:
>
> $ git pr create --cc=hotspot-dev,build-dev
>
> Furthermore, the "git pr create" command will now ask the user to verify
> the mailing lists before creating the pull request:
>
> $ git pr create
> The following mailing lists will be CC:d for the "RFR" e-mail:
> - build-dev at openjdk.java.net
> - hotspot-dev at openjdk.java.net
>
> Do you want to proceed with this mailing list selection? [Y/n]:
>
> A user can opt-in to always use the inferred mailing lists by speciyfing
> --cc=auto. This means that it is now *opt-in* to use the mailing lists
> inferred from the patch when creating a PR from the command-line.
>
> When creating a PR from the GitHub web UI, it’s also possible to
> override these suggestions by adding a “/cc” command at the end of the
> PR body. There is also a window of a few minutes to issue a command to
> edit the suggestions after they have been displayed, but before any
> email is sent to any list. The semantics of the “/cc” command have also
> been changed, to ensure that it always overrides the automatic choices.
>
> We would also like to reply to a few of the additional questions raised:
>
> Q: Shouldn't this file with the rules be placed in the jdk/jdk
> repository?
> A: Yes, we think so too, but we started out by having it in the skara
> repository. The Skara tooling does not care where the file resides,
> we can easily update the tooling once/if we move the file into the
> jdk/jdk repository.
>
> Q: How will the file be maintained?
> A: We imagine that this will be a collective effort once the file has
> landed in the jdk/jdk repository. If a contributor makes a large
> change that moves a lot of files around, then it seems fitting to
> also consider if the rules for choosing mailing lists should perhaps
> be updated.
>
> Q: Why not use the CODEOWNERS file format that GitHub already supports?
> A: This is something we considered but rejected for several reasons. The
> first and primary reason is that the rule file is not meant to
> convey ownership, it is merely stating the mailing lists that should
> be notified when certain files change. Secondly we will run into
> problems with GitHub _also_ sending e-mails if we use the CODEOWNERS
> format, something we would not want. Lastly the rules file also
> support specifying mailing lists groups that should be coalesced into
> a single mailing list.
>
> Q: Doesn't other OpenJDK projects that already have transitioned have
> this problem?
> A: No, for all other OpenJDK projects there is only one "primary"
> mailing list. For example skara-dev for Skara, amber-dev for Amber
> and so on.
>
> Q: Do we have too many mailing lists?
> A: Perhaps, but it is not our intent to propose any changes to the
> number of mailing lists we have.
>
> Q: Will the rules engine will be 100% correct every time?
> A: No, but neither are humans when they choose mailing lists. The goal
> of the rules engine, as stated at the start of this e-mail, is to
> choose a reasonable minimal set of mailing lists that a patch should
> be sent to. A contributor can always override the choice via the CLI
> or use the `/cc` pull request command.
>
> Q: Is the automatic selection of mailing lists opt-in or opt-out?
> A: It is currently opt-in if you create a PR from the command-line and
> opt-out if you create a PR via a web browser. We believe that many
> experienced contributors will use the CLI tools and there have an
> easy way to control the selected mailing lists. In constrast many
> newcomers are likely to create pull requests via a web browser. As
> stated above, there is a "cooldown" period after a PR has been
> created when the PR author has time to fine-tune the selected mailing
> lists regardless of how the PR was created.
>
> Q: Shouldn't the information about to which mailing list to send a patch
> be on the wiki instead?
> A: We strongly prefer to have it in a format and place where the
> information can be processed by programs. This way we can use the
> information to automate and improve upon our workflows.
>
> Again, thanks everyone for the great feedback and to everyone who helped
> out tweaking the rules!
>
> Best regards,
> Erik and Robin
>
> On 8/27/20 12:34 PM, Robin Westberg wrote:
>> Hi all,
>>
>> As part of transitioning the jdk/jdk repository to Git with project
>> Skara, we have created a set of rules that determine which mailing
>> list(s) should be the default recipient of a review request, depending
>> on which files are changed. The initial version of these rules was
>> created by looking at historical commits and matching them with
>> existing mailing list review threads. This has produced a reasonable
>> basis, but it can most certainly be made better with some additional
>> manual curating.
>>
>> Therefore, it would be very helpful if people with good knowledge of
>> the various subsystems and source files that make up the JDK would be
>> willing to take a look at these rules, and also suggest improvements
>> where needed. In addition, lists like [1] would also be very useful
>> insofar they exist.
>>
>> The current version of these rules is located in a JSON file in the
>> Skara repository at [2]. In order to check the validity of the rules,
>> there is also a CLI tool that can be used to apply it to either a
>> subset of files or existing commits and produce a suggestion [3] [4].
>> If you are interested in helping out with curating these rules, these
>> are the steps to get started:
>>
>> 1. Install the Skara CLI tools:
>> https://wiki.openjdk.java.net/display/SKARA/CLI+Tools
>> 2. Locate a suitable clone of the jdk/jdk repository (either Mercurial
>> or Git is fine)
>> 3. Change (cd) to the root of your jdk/jdk repository
>> 3. Run the “debug mlrules” command on your favorite subset of files,
>> for example like this (use the actual location of jdk.json on your
>> system):
>>
>> $ git skara debug mlrules -v
>> ~/git/skara/config/mailinglist/rules/jdk.json src/hotspot/share/jfr/
>> Reading rules file...
>> src/hotspot/share/jfr/dcmd: [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.cpp:
>> [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.hpp:
>> [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp:
>> [hotspot-jfr-dev, serviceability-dev]
>> …
>> Final list suggestion is: [hotspot-jfr-dev, serviceability-dev]
>>
>> The command accepts multiple folder and/or file names to make it
>> possible to simulate a potential change to a given set of files:
>>
>> $ git skara debug mlrules -v
>> ../skara/config/mailinglist/rules/jdk.json doc/ide.md
>> src/hotspot/cpu/x86/x86.ad src/hotspot/os/linux/gc/z/zNUMA_linux.cpp
>> Reading rules file...
>> doc: [build-dev]
>> src/hotspot/cpu: [hotspot-compiler-dev]
>> src/hotspot/os: [hotspot-runtime-dev, hotspot-gc-dev]
>>
>> Combined list suggestion: [build-dev, hotspot-compiler-dev,
>> hotspot-gc-dev, hotspot-runtime-dev]
>> Final list suggestion is: [build-dev, hotspot-dev]
>>
>> If the suggestions look fine, all is well. If not, you are welcome to
>> propose a change to the rules, preferably by editing the jdk.json file
>> [6] and creating a pull request towards the Skara project as described
>> in [5]. Coincidentally, this is the same way that future changes to
>> the jdk/jdk repository will be integrated, so this exercise could also
>> serve as a way of getting started with Git / Skara!
>>
>> Best regards,
>> Robin
>>
>> [1] https://openjdk.java.net/groups/2d/2dawtfiles.html
>> [2]
>> https://git.openjdk.java.net/skara/blob/master/config/mailinglist/rules/jdk.json
>>
>>
>> [3]
>> $ git skara debug mlrules -v
>> ~/git/skara/config/mailinglist/rules/jdk.json
>> src/java.desktop/unix/native
>> Reading rules file...
>> src/java.desktop/unix/native/common: [2d-dev]
>> src/java.desktop/unix/native/include: []
>> src/java.desktop/unix/native/libawt: [2d-dev]
>> src/java.desktop/unix/native/libawt_headless: [awt-dev]
>> src/java.desktop/unix/native/libawt_xawt: [awt-dev]
>> src/java.desktop/unix/native/libfontmanager: [2d-dev]
>> src/java.desktop/unix/native/libjawt: [awt-dev]
>> src/java.desktop/unix/native/libsplashscreen: [awt-dev]
>>
>> Combined list suggestion: [2d-dev, awt-dev]
>> Final list suggestion is: [2d-dev, awt-dev]
>>
>> [4]
>> $ git skara debug mlrules -d 30 -v
>> ~/git/skara/config/mailinglist/rules/jdk.json .
>> ...
>> ✅ [2d-dev, awt-dev, serviceability-dev] c32923e0: 8240487: Cleanup
>> whitespace in .cc, .hh, .m, and .mm files
>> ❌ [awt-dev] 7f74c7dd: 8212226: SurfaceManager throws "Invalid Image
>> variant" for MultiResolutionImage (Windows)
>> Suggested lists: [2d-dev, awt-dev]
>> Rules matching unmentioned lists [2d-dev]:
>> src/java.desktop/share/classes/sun/java2d/SunGraphics2D.java -
>> [2d-dev: ^src/java.desktop/share/classes/sun/java2d/]
>> …
>>
>> [5] https://wiki.openjdk.java.net/display/SKARA/Skara#Skara-Workflow
>>
>> [6]
>> The rules are made up of sets of regular expressions for the various
>> mailing lists that are used for reviewing changes going into the JDK.
>> If any rule matches, that mailing list will get a copy of the review
>> request email. For directories containing files that belong to
>> different subsystems, it’s usually a good idea to write the rules in a
>> complementary fashion if possible, so that anything not explicitly
>> mentioned gets a reasonable default. As an example, see these rules
>> for a subset of awt / 2d / i18n files:
>>
>> “awt-dev”:
>> "src/java.desktop/share/classes/sun/awt/(?!font|sunhints|color/|font/|geom/|im/|image/|print/)”
>>
>> “2d-dev”:
>> "src/java.desktop/share/classes/sun/awt/(font|sunhints|color/|font/|geom/|image/|print/)"
>>
>> “I18n-dev”: "src/java.desktop/share/classes/sun/awt/im/“
>>
>> In this example, anything not explicitly indicated as belonging to
>> either 2d-dev or i18-dev will be matched by awt-dev.
>>
More information about the jdk-dev
mailing list