RFR: Add tool for importing webrev patches

Fri Jul 5 15:38:41 UTC 2019

On Fri, 5 Jul 2019 13:20:51 GMT, Erik Duveblad via github.com <duke at openjdk.java.net> wrote:

> On Fri, 5 Jul 2019 10:01:57 GMT, JornVernee via github.com <duke at openjdk.java.net> wrote:
> 
>> As discussed on the mailing list: https://mail.openjdk.java.net/pipermail/skara-dev/2019-July/000176.html
>> 
>> This PR adds a tool for importing webrev `.patch` files, either from a webrev url or local file path into a repository.
>> 
>> I've named the tool `wimport` as short for `webrev-import` (I thought the full name sounded a bit too much like a sub-command of `webrev`). This will first checkout a new branch, then apply the patch, and then create a commit for the patch with the message `Imported patch 'NAME'`, similar to what is done by mq. Afterwards a user can just delete the new branch when they're done with e.g. reviewing the webrev. Although, there is also an option to directly apply the webrev without creating a branch or commit.
>> 
>> Currently  the following options are supported:
>> 
>>     usage: git wimport [options] <webrev url|file path>
>>         -n, --name NAME  Use NAME as a name for the patch (default is patch file name)
>>         -f, --file       Input is a file path
>>         -k, --keep       Keep downloaded patch file in current directory
>>         -d, --direct     Directly apply patch, without creating a branch or commit
>> 
>> ----------------
>> 
>> Commits:
>>  - 7b20ce75:	Wimport prototype
>> 
>> Pull request:
>> http://git.openjdk.java.net/skara/pull/32
>> 
>> Webrev:
>> https://openjdk.github.io/cr/skara/32/webrev.00
>> 
>> Patch:
>> http://git.openjdk.java.net/skara/pull/32.diff
>> 
>> Fetch command:
>> git fetch https://github.com/openjdk/skara.git pull/32/head:pull/32
> 
> Thanks Jorn for you contribution!
> 
> I just had time to take quick look at the code, but wanted to share some high-level thoughts before diving in deeper. I would suggest splitting this tool into two:
> 1. git-webrev-apply
> 2. git-webrev-fetch
> 
> The motivation would be to two piggyback on already known git commands apply and fetch and try
> to match the semantics as good as possible. The first tool, git-webrev-apply is very similar to what you have done in this PR _except_ that it would not do a commit. The second tool, git-webrev-fetch, would be more similar to what you have done here, but here I have few more considerations to make the tool as safe as possible:
> - we want git-webrev-fetch to apply directly to index (with `git apply --cached`) so we avoid potentially changing files in the user's working tree.
> - if the user already has a dirty index (`git diff --cached` is not empty) then I would consider two options:
>   - abort and notify the user that the index is dirty (this differs from how `git fetch` behaves)
>   - temporarily store the user's dirty index with `git write-tree`, apply the patch with `git apply --cached`, commit (more on how to commit later), restore the user's index with `git read-tree`
> - I think we can find out a bit more about the webrev. For author and committer we can definitely use the username, `https://cr.openjdk.java.net/~(.*)/`. You could potentially use the `census` module if you want to get the real name for the user (to be able to use "Real Name <username at openjdk.org>" form).
> - For the commit message I would either go with just the URL of the webrev or try to apply some heuristic to find out if the webrev has corresponding [JBS](https://bugs.openjdk.java.net/) issue.
> - For committing the dirty index I'm not fully sure yet of the best way. Ideally we want `origin/master` to be the parent since `master` might have advanced due to a user's local commits. One way to achieve this could be to first commit and then rebase the resulting commit onto `origin/master`. Another, perhaps more correct, but also slower, would be to use `git worktree` to create a new temporary worktree from `origin/master` in /tmp, apply the patch, and then commit.
> - I think we can automatically deduce the name of the ref, for a webrev from a URL <https://cr.openjdk.java.net/~ehelin/JDK-01234567/00> I think the ref `webrevs/ehelin/JDK-01234567/00` is fairly natural. This is somewhat similar to how `pulls/` is used for refs related to pull requests.
> 
> Regarding the verbose naming of the tools, I actually intended for them to look like webrev subcommands :smiley: The reason for this is if these two tools turn out well then I think we can repurpose `git-webrev` into having subcommands (similar to `git-pr`):
> - git webrev generate
> - git webrev apply <URL>
> - git webrev fetch <URL>
> 
> `generate` would be the default command so that `git webrev` stays backwards compatible (just typing `git webrev` would mean `git webrev generate`). But before embarking on this refactoring I wanted us to start with implementing `git webrev-apply` and `git webrev-fetch`.
> 
> I would suggest slightly re-purposing this PR into becoming the PR for `git-webrev-apply` and then tackle `git-webrev-fetch` in another PR.
> 
> What do you think?
> 
> PR: http://git.openjdk.java.net/skara/pull/32

Response inline...

> I just had time to take quick look at the code, but wanted to share some high-level thoughts before diving in deeper. I would suggest splitting this tool into two:
> 
>     1. git-webrev-apply
> 
>     2. git-webrev-fetch
> 
> 
> The motivation would be to two piggyback on already known git commands apply and fetch and try
> to match the semantics as good as possible. The first tool, git-webrev-apply is very similar to what you have done in this PR _except_ that it would not do a commit.

Okay, so it would just be like `git apply`, but with a webrev url as an argument? I suppose it's reasonable to drop support for local `.patch` files in that case, since that behaviour can be achieved by just using `git apply`.

> The second tool, git-webrev-fetch, would be more similar to what you have done here, but here I have few more considerations to make the tool as safe as possible:
> 
>     * we want git-webrev-fetch to apply directly to index (with `git apply --cached`) so we avoid potentially changing files in the user's working tree.

Oh yeah, good point!

>     * if the user already has a dirty index (`git diff --cached` is not empty) then I would consider two options:
>       
>       * abort and notify the user that the index is dirty (this differs from how `git fetch` behaves)
>       * temporarily store the user's dirty index with `git write-tree`, apply the patch with `git apply --cached`, commit (more on how to commit later), restore the user's index with `git read-tree`

Can't the `git read-tree` step create conflicts? I'm a little more in favour of the former option of aborting, then the user can decided either to reset the tree or do a `write-tree` (or something else).

>     * I think we can find out a bit more about the webrev. For author and committer we can definitely use the username, `https://cr.openjdk.java.net/~(.*)/`. You could potentially use the `census` module if you want to get the real name for the user (to be able to use "Real Name [username at openjdk.org](mailto:username at openjdk.org)" form).

If were going with deriving more information from the webrev, I'd rather try to find it in the webrev body, we can possibly get:

 * the author
 * the revision the webrev was generated on
 * the branch the webrev is targeting
 * the associated JBS bug link
 * the patch file link (already doing this for this PR)

>     * For the commit message I would either go with just the URL of the webrev or try to apply some heuristic to find out if the webrev has corresponding [JBS](https://bugs.openjdk.java.net/) issue.

I think some heuristic is good, but we should have a good fallback as well, since we're not guaranteed to find the wanted information. Using the webrev URL as a commit message seems like a good one!

>     * For committing the dirty index I'm not fully sure yet of the best way. Ideally we want `origin/master` to be the parent since `master` might have advanced due to a user's local commits. One way to achieve this could be to first commit and then rebase the resulting commit onto `origin/master`. Another, perhaps more correct, but also slower, would be to use `git worktree` to create a new temporary worktree from `origin/master` in /tmp, apply the patch, and then commit.

Well, not all webrevs target the master branch, but I get your point. An easy way to sidestep this problem is to use the current HEAD as the parent (and whichever branch this might be).

>     * I think we can automatically deduce the name of the ref, for a webrev from a URL https://cr.openjdk.java.net/~ehelin/JDK-01234567/00 I think the ref `webrevs/ehelin/JDK-01234567/00` is fairly natural. This is somewhat similar to how `pulls/` is used for refs related to pull requests.

Sounds good! Overall I think we can use several heuristics to create a `WebrevMetaData` object that we can then generate this kind of information from that.

> Regarding the verbose naming of the tools, I actually intended for them to look like webrev subcommands ðŸ˜ƒ The reason for this is if these two tools turn out well then I think we can repurpose `git-webrev` into having subcommands (similar to `git-pr`):
> 
>     * git webrev generate
> 
>     * git webrev apply
> 
>     * git webrev fetch
> 
> 
> `generate` would be the default command so that `git webrev` stays backwards compatible (just typing `git webrev` would mean `git webrev generate`). But before embarking on this refactoring I wanted us to start with implementing `git webrev-apply` and `git webrev-fetch`.

Okay, sounds great to me! (just having the usual case of new-contributor-syndrome, not trying to step on anyone's toes ðŸ˜„)

> I would suggest slightly re-purposing this PR into becoming the PR for `git-webrev-apply` and then tackle `git-webrev-fetch` in another PR.
> 
> What do you think?

Sounds good.

Also, I wanted to ask what you want to do for tests? I noticed there are no other test in the `cli` module yet, so I've tested this PR manually, but maybe now is a good time to start adding them?

FWIW, not all the tests pass on Windows currently, particularly some tests that rely on unix tools/path-separators, and quite a bit of the `vcs` tests file (I haven't looked into it yet). In the meantime it might be nice to set up some CI on GitHub, e.g. Travis?

PR: http://git.openjdk.java.net/skara/pull/32