Request for assistance: Verify and update mailing list filter rules for jdk/jdk in preparation for Skara transition

David Holmes david.holmes at oracle.com
Mon Aug 31 05:17:45 UTC 2020


Hi Erik,

This updated process seems far better to me - thanks.

David

On 29/08/2020 12:20 am, Erik Helin wrote:
> Hi all,
> 
> Thanks for all the feedback, based on this discussion we have been able 
> to make a number of improvements in this area!
> 
> But to begin with, it would probably have been good to clarify the 
> intended use of this list of rules a bit more clearly: The intent of 
> these rules is to create a conservative “minimal” initial set of lists 
> that quite likely would want to be included when sending out the review 
> request. A good example of this would be the fairly well known rule that 
> “all Makefile changes should be reviewed on build-dev” which is readily 
> encoded in this format as “build-dev”: “make/“. However, files that are 
> shared between several lists and where you select lists based on what 
> actually changed should not be on here. You may for example have noticed 
> the absence of the various ProblemList.txt files - they certainly fit 
> that criteria. But these are the two clear opposites, there are of 
> course many places in the jdk source tree where this is much more of a 
> gray area.
> 
> It is also quite likely that the original list that we posted here was 
> dauntingly verbose - it was our hope that it would be reasonably 
> straightforward for a subject matter expert to quickly delete a lot of 
> the “noise”. Based on the change suggestions received so far, it would 
> appear to not have been that unreasonable: the latest version has 
> dropped over a hundred lines while still improving accuracy by a fair 
> bit. Ideally there should probably only be a handful of rules for any 
> given list, which is already true for most, but the rules for lists like 
> “core-libs-dev” are certainly not there yet.
> 
> We should also mention how the initial list was created - it was *not* 
> crated manually by us :) Instead we parsed the RFR e-mails for thousands 
> of commits pushed to the jdk/jdk repository. This allowed us to create a 
> program that could see the files that were changed by a commit and the 
> lists that that the RFR e-mail were sent to. We then manually curated 
> the list, but realized that domain experts would make a better job of 
> the curation (hence the "request for assitance" e-mail).
> 
> We have also validated our approach by comparing the results of the 
> rules engine for selecting mailing lists with the last 30 days of RFR 
> e-mails for patches targeting the jdk/jdk repository. So far it seems 
> like the rules engine perform equally or better than humans at selecting 
> mailing lists, but it could also be that we are missing some nuances in 
> RFR e-mails that look like they have forgotten to CC at least one 
> mailing list.
> 
> Still, if you are an experienced contributor, you may not want to have 
> these suggestions added only to have to replace them manually with the 
> correct lists. To help with that, we have adjusted the “git pr create” 
> command-line tool to display the suggestions before the PR is created,
> and provide the option to override them. If you create a PR from the 
> command-line you can now use the --cc flag:
> 
>    $ git pr create --cc=hotspot-dev,build-dev
> 
> Furthermore, the "git pr create" command will now ask the user to verify 
> the mailing lists before creating the pull request:
> 
>    $ git pr create
>    The following mailing lists will be CC:d for the "RFR" e-mail:
>    - build-dev at openjdk.java.net
>    - hotspot-dev at openjdk.java.net
> 
>    Do you want to proceed with this mailing list selection? [Y/n]:
> 
> A user can opt-in to always use the inferred mailing lists by speciyfing
> --cc=auto. This means that it is now *opt-in* to use the mailing lists 
> inferred from the patch when creating a PR from the command-line.
> 
> When creating a PR from the GitHub web UI, it’s also possible to 
> override these suggestions by adding a “/cc” command at the end of the 
> PR body. There is also a window of a few minutes to issue a command to 
> edit the suggestions after they have been displayed, but before any 
> email is sent to any list. The semantics of the “/cc” command have also 
> been changed, to ensure that it always overrides the automatic choices.
> 
> We would also like to reply to a few of the additional questions raised:
> 
> Q: Shouldn't this file with the rules be placed in the jdk/jdk
>     repository?
> A: Yes, we think so too, but we started out by having it in the skara
>     repository. The Skara tooling does not care where the file resides,
>     we can easily update the tooling once/if we move the file into the
>     jdk/jdk repository.
> 
> Q: How will the file be maintained?
> A: We imagine that this will be a collective effort once the file has
>     landed in the jdk/jdk repository. If a contributor makes a large
>     change that moves a lot of files around, then it seems fitting to
>     also consider if the rules for choosing mailing lists should perhaps
>     be updated.
> 
> Q: Why not use the CODEOWNERS file format that GitHub already supports?
> A: This is something we considered but rejected for several reasons. The
>     first and primary reason is that the rule file is not meant to
>     convey ownership, it is merely stating the mailing lists that should
>     be notified when certain files change. Secondly we will run into
>     problems with GitHub _also_ sending e-mails if we use the CODEOWNERS
>     format, something we would not want. Lastly the rules file also
>     support specifying mailing lists groups that should be coalesced into
>     a single mailing list.
> 
> Q: Doesn't other OpenJDK projects that already have transitioned have
>     this problem?
> A: No, for all other OpenJDK projects there is only one "primary"
>     mailing list. For example skara-dev for Skara, amber-dev for Amber
>     and so on.
> 
> Q: Do we have too many mailing lists?
> A: Perhaps, but it is not our intent to propose any changes to the
>     number of mailing lists we have.
> 
> Q: Will the rules engine will be 100% correct every time?
> A: No, but neither are humans when they choose mailing lists. The goal
>     of the rules engine, as stated at the start of this e-mail, is to
>     choose a reasonable minimal set of mailing lists that a patch should
>     be sent to. A contributor can always override the choice via the CLI
>     or use the `/cc` pull request command.
> 
> Q: Is the automatic selection of mailing lists opt-in or opt-out?
> A: It is currently opt-in if you create a PR from the command-line and
>     opt-out if you create a PR via a web browser. We believe that many
>     experienced contributors will use the CLI tools and there have an
>     easy way to control the selected mailing lists. In constrast many
>     newcomers are likely to create pull requests via a web browser. As
>     stated above, there is a "cooldown" period after a PR has been
>     created when the PR author has time to fine-tune the selected mailing
>     lists regardless of how the PR was created.
> 
> Q: Shouldn't the information about to which mailing list to send a patch
>     be on the wiki instead?
> A: We strongly prefer to have it in a format and place where the
>     information can be processed by programs. This way we can use the
>     information to automate and improve upon our workflows.
> 
> Again, thanks everyone for the great feedback and to everyone who helped 
> out tweaking the rules!
> 
> Best regards,
> Erik and Robin
> 
> On 8/27/20 12:34 PM, Robin Westberg wrote:
>> Hi all,
>>
>> As part of transitioning the jdk/jdk repository to Git with project 
>> Skara, we have created a set of rules that determine which mailing 
>> list(s) should be the default recipient of a review request, depending 
>> on which files are changed. The initial version of these rules was 
>> created by looking at historical commits and matching them with 
>> existing mailing list review threads. This has produced a reasonable 
>> basis, but it can most certainly be made better with some additional 
>> manual curating.
>>
>> Therefore, it would be very helpful if people with good knowledge of 
>> the various subsystems and source files that make up the JDK would be 
>> willing to take a look at these rules, and also suggest improvements 
>> where needed. In addition, lists like [1] would also be very useful 
>> insofar they exist.
>>
>> The current version of these rules is located in a JSON file in the 
>> Skara repository at [2]. In order to check the validity of the rules, 
>> there is also a CLI tool that can be used to apply it to either a 
>> subset of files or existing commits and produce a suggestion [3] [4]. 
>> If you are interested in helping out with curating these rules, these 
>> are the steps to get started:
>>
>> 1. Install the Skara CLI tools: 
>> https://wiki.openjdk.java.net/display/SKARA/CLI+Tools
>> 2. Locate a suitable clone of the jdk/jdk repository (either Mercurial 
>> or Git is fine)
>> 3. Change (cd) to the root of your jdk/jdk repository
>> 3. Run the “debug mlrules” command on your favorite subset of files, 
>> for example like this (use the actual location of jdk.json on your 
>> system):
>>
>> $ git skara debug mlrules -v 
>> ~/git/skara/config/mailinglist/rules/jdk.json src/hotspot/share/jfr/
>> Reading rules file...
>> src/hotspot/share/jfr/dcmd: [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.cpp: 
>> [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrEventClassTransformer.hpp: 
>> [hotspot-jfr-dev]
>> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp: 
>> [hotspot-jfr-dev, serviceability-dev]
>>>> Final list suggestion is: [hotspot-jfr-dev, serviceability-dev]
>>
>> The command accepts multiple folder and/or file names to make it 
>> possible to simulate a potential change to a given set of files:
>>
>> $ git skara debug mlrules -v 
>> ../skara/config/mailinglist/rules/jdk.json doc/ide.md 
>> src/hotspot/cpu/x86/x86.ad src/hotspot/os/linux/gc/z/zNUMA_linux.cpp
>> Reading rules file...
>> doc: [build-dev]
>> src/hotspot/cpu: [hotspot-compiler-dev]
>> src/hotspot/os: [hotspot-runtime-dev, hotspot-gc-dev]
>>
>> Combined list suggestion: [build-dev, hotspot-compiler-dev, 
>> hotspot-gc-dev, hotspot-runtime-dev]
>> Final list suggestion is: [build-dev, hotspot-dev]
>>
>> If the suggestions look fine, all is well. If not, you are welcome to 
>> propose a change to the rules, preferably by editing the jdk.json file 
>> [6] and creating a pull request towards the Skara project as described 
>> in [5]. Coincidentally, this is the same way that future changes to 
>> the jdk/jdk repository will be integrated, so this exercise could also 
>> serve as a way of getting started with Git / Skara!
>>
>> Best regards,
>> Robin
>>
>> [1] https://openjdk.java.net/groups/2d/2dawtfiles.html
>> [2] 
>> https://git.openjdk.java.net/skara/blob/master/config/mailinglist/rules/jdk.json 
>>
>>
>> [3]
>> $ git skara debug mlrules -v 
>> ~/git/skara/config/mailinglist/rules/jdk.json 
>> src/java.desktop/unix/native
>> Reading rules file...
>> src/java.desktop/unix/native/common: [2d-dev]
>> src/java.desktop/unix/native/include: []
>> src/java.desktop/unix/native/libawt: [2d-dev]
>> src/java.desktop/unix/native/libawt_headless: [awt-dev]
>> src/java.desktop/unix/native/libawt_xawt: [awt-dev]
>> src/java.desktop/unix/native/libfontmanager: [2d-dev]
>> src/java.desktop/unix/native/libjawt: [awt-dev]
>> src/java.desktop/unix/native/libsplashscreen: [awt-dev]
>>
>> Combined list suggestion: [2d-dev, awt-dev]
>> Final list suggestion is: [2d-dev, awt-dev]
>>
>> [4]
>> $ git skara debug mlrules -d 30 -v 
>> ~/git/skara/config/mailinglist/rules/jdk.json .
>> ...
>> ✅ [2d-dev, awt-dev, serviceability-dev] c32923e0: 8240487: Cleanup 
>> whitespace in .cc, .hh, .m, and .mm files
>> ❌ [awt-dev] 7f74c7dd: 8212226: SurfaceManager throws "Invalid Image 
>> variant" for MultiResolutionImage (Windows)
>>      Suggested lists: [2d-dev, awt-dev]
>>      Rules matching unmentioned lists [2d-dev]:
>>        src/java.desktop/share/classes/sun/java2d/SunGraphics2D.java - 
>> [2d-dev: ^src/java.desktop/share/classes/sun/java2d/]
>>>>
>> [5] https://wiki.openjdk.java.net/display/SKARA/Skara#Skara-Workflow
>>
>> [6]
>> The rules are made up of sets of regular expressions for the various 
>> mailing lists that are used for reviewing changes going into the JDK. 
>> If any rule matches, that mailing list will get a copy of the review 
>> request email. For directories containing files that belong to 
>> different subsystems, it’s usually a good idea to write the rules in a 
>> complementary fashion if possible, so that anything not explicitly 
>> mentioned gets a reasonable default. As an example, see these rules 
>> for a subset of awt / 2d / i18n files:
>>
>> “awt-dev”: 
>> "src/java.desktop/share/classes/sun/awt/(?!font|sunhints|color/|font/|geom/|im/|image/|print/)” 
>>
>> “2d-dev”: 
>> "src/java.desktop/share/classes/sun/awt/(font|sunhints|color/|font/|geom/|image/|print/)" 
>>
>> “I18n-dev”: "src/java.desktop/share/classes/sun/awt/im/“
>>
>> In this example, anything not explicitly indicated as belonging to 
>> either 2d-dev or i18-dev will be matched by awt-dev.
>>


More information about the jdk-dev mailing list