New candidate JEP: 369: Migrate to GitHub
Erik Helin
erik.helin at oracle.com
Fri Nov 15 11:56:53 UTC 2019
On 11/14/19 10:04 PM, Per Bothner wrote:
> On 11/13/19 10:31 AM, Andrew John Hughes wrote:
>> Each developer has a fork of the actual repo as it stands now; this is
>> the very nature of using a distributed version control system (DVCS).
>> What differs with the pull request model is that fork is made public
>> rather than patches being generated from it and posted. The advantage of
>> that is that other developers can check out the published repository
>> rather than attempting to recreate it by applying the patch to their own
>> fork.
>
> The straight-forward way to "check out the published repository"
> unfortunately requires a lot of duplicated (wasted) bandwidth and disk
> space, for every time you want to test a patch. Probably there is tooling
> that can mitigate that.
Anytime I clone a git repository I use the `--reference-if-able` flag
[0] if I already have a local clone of the repository. For example:
$ time git clone https://git.openjdk.java.net/jdk
Cloning into 'jdk'...
warning: redirecting to https://github.com/openjdk/jdk/
remote: Enumerating objects: 338, done.
remote: Counting objects: 100% (338/338), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 997493 (delta 98), reused 219 (delta 79), pack-reused 997155
Receiving objects: 100% (997493/997493), 354.40 MiB | 24.43 MiB/s, done.
Resolving deltas: 100% (744950/744950), done.
Updating files: 100% (67977/67977), done.
real 0m38.987s
user 0m51.312s
sys 0m4.512s
$ time git clone --reference-if-able jdk \
https://git.openjdk.java.net/loom
Cloning into 'loom'...
warning: redirecting to https://github.com/openjdk/loom/
remote: Enumerating objects: 4975, done.
remote: Counting objects: 100% (4975/4975), done.
remote: Compressing objects: 100% (237/237), done.
remote: Total 12878 (delta 4717), reused 4885 (delta 4695), pack-reused 7903
Receiving objects: 100% (12878/12878), 5.83 MiB | 9.15 MiB/s, done.
Resolving deltas: 100% (10097/10097), completed with 1019 local objects.
Updating files: 100% (68114/68114), done.
real 0m12.520s
user 0m5.312s
sys 0m1.917s
In the example above, the first clone (of the jdk repository) took 38.987
seconds and transferred 354.40 MiB. The second clone (of the loom
repository) took only 12.520 seconds and transferred 5.83 MiB, because git
could reuse almost all the repository data from my *local* jdk clone.
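The reuse works through git's "alternates" mechanism: a clone made with
`--reference-if-able` records the path of the reference repository's
object store in `.git/objects/info/alternates` and borrows objects from
there instead of downloading them. Below is a minimal sketch that needs no
network access. The directory names `upstream`, `local-jdk` and `second`
are hypothetical stand-ins, not the real repositories.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for an upstream repository.
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial commit'

# First clone: plays the role of the already existing local jdk clone.
git clone -q upstream local-jdk

# Second clone: borrows objects from local-jdk where it can.
git clone -q --reference-if-able local-jdk upstream second

# The borrowed object store is recorded in the alternates file.
cat second/.git/objects/info/alternates
```

One caveat: a clone created this way keeps depending on the reference
repository, so deleting the reference breaks it. Adding `--dissociate` to
the clone command copies the borrowed objects and removes that dependency.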
> A related annoyance with the pull-request model is that every contributor
> needs to have a public fork on GitHub, and create a fresh branch for each
> change. That seems a bit of a hassle compared to older ways of working.
>
> "The JEP proposes to support multiple workflows" - but do they all require
> a contributor to explicitly create a branch in a personal fork on GitHub?
> If not, how do they avoid that?
While you can create a pull request from the "master" branch of your
personal fork, it is not recommended due to issues that can arise when
you later want to sync the changes from the upstream repository's
"master" branch to your personal fork's "master" branch. Hence the
recommendation that contributors create a branch in their personal fork
for the work they want to contribute. This might perhaps sound harder
than it actually is in practice. With the Skara tooling [1] installed I
typically do:
$ cd /path/to/local/clone/of/personal/fork
$ git switch --create bugfix # create new branch
$ vim # do the actual work
$ git commit -m 'Fixed a bug' # create commit
$ git publish # publish branch on GitHub
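Without the Skara tooling, roughly the same flow can be sketched with plain
git. This sketch assumes that `git publish` amounts to pushing the current
branch to the fork and setting its upstream. The bare repository `fork.git`
is a hypothetical local stand-in for a personal fork on GitHub, and
`fix.txt` is a placeholder for the actual work.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical bare repository standing in for the personal fork.
git init -q --bare fork.git
git clone -q fork.git clone
cd clone
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m 'Initial commit'
git push -q origin HEAD

git switch --create bugfix                # create new branch
echo 'fix' > fix.txt                      # do the actual work
git add fix.txt
git commit -q -m 'Fixed a bug'            # create commit
git push -q --set-upstream origin bugfix  # publish branch on the fork
```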
There are of course many other tools one can use instead of the git
command-line interface; I just personally have a very CLI-heavy workflow.
Taking a little step back here, my observation has been that almost all
frequent OpenJDK contributors use some tooling to handle concurrent
work, for example:
- Mercurial branches
- Mercurial bookmarks
- Mercurial topics
- Mercurial anonymous heads
- Mercurial patch queues (MQ)
- A fresh local clone for every change
Given that it often takes a couple of days to go from the initial "RFR"
email to pushing the finished commit, all frequent contributors need
_some_ way to model concurrent work (or else they would be stalled while
waiting for reviewers' feedback). All of the above techniques have
different pros and cons, but I don't think git branches incur more
overhead than any of them.
> One idea: a separate repository openjdk-prs that would be a clone of and
> automatically track the main openjdk. In addition, automated tools that
> process a patch would create an automatic branch with the change, and the
> automatically-generated pull request uses that branch. The branch can be
> automatically deleted when the PR is closed.
>
> (Anyway - just a crazy idea. I'm no longer an OpenJDK contributor,
> though who knows what might happen in the future.)
Crazy ideas are always welcome :)
What you propose is certainly doable and is in fact in spirit somewhat
similar to the jdk/sandbox repository [2]. We just haven't gotten around
to setting up a sandbox repository on GitHub, but that is certainly planned.
OpenJDK Committers [3] would then be able to push to a branch in the
sandbox repository and subsequently create a pull request from that
branch. My guess is that most contributors would prefer to have a
personal fork (which is similar to having a personal sandbox), but we
can certainly set up a shared sandbox repository for those that would
prefer it.
We already have plenty of tooling to support forwarding commits between
repositories; for example, we continuously and automatically merge the
jdk repository's "master" branch into the mobile repository's "master"
branch.
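Forwarding of this kind boils down to a fetch followed by a merge. The
sketch below is only an illustration with hypothetical local repositories
named `jdk` and `mobile`. It is not the actual automation, which is not
described here.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-ins for the jdk and mobile repositories.
git init -q jdk
git -C jdk -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'jdk: initial commit'
branch=$(git -C jdk symbolic-ref --short HEAD)  # default branch name

git clone -q jdk mobile
git -C mobile config user.name demo
git -C mobile config user.email demo@example.com
git -C mobile commit -q --allow-empty -m 'mobile: project-specific change'

# A new commit lands upstream in jdk.
git -C jdk -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'jdk: new change'

# Forward it: fetch from jdk, then merge its branch into mobile's branch.
git -C mobile fetch -q origin
git -C mobile merge -q --no-edit "origin/$branch"
git -C mobile --no-pager log --oneline --graph
```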
Thanks,
Erik
[0]: https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---reference-if-ableltrepositorygt
[1]: https://wiki.openjdk.java.net/display/skara#Skara-GettingStarted
[2]: https://hg.openjdk.java.net/jdk/sandbox/
[3]: http://openjdk.java.net/bylaws#committer