New candidate JEP: 369: Migrate to GitHub

Erik Helin erik.helin at oracle.com
Fri Nov 22 12:09:43 UTC 2019


On 11/21/19 12:18 PM, Andrew Dinn wrote:
> Hi Erik,

Hi Andrew,

On 11/21/19 12:18 PM, Andrew Dinn wrote:
> Thanks for pointing me at the Skara project and for explaining how you
> have extended the basic GitHub PR workflow to support important elements
> of our current process. The tooling does indeed appear to improve the
> usability and add back necessary operations to a great degree. I am very
> impressed with the efforts you have made. So, thank you (and others, Joe
> especially) for your efforts.

I would also like to take this opportunity to thank you for your very 
insightful and thoughtful feedback. You have really taken the time to 
read through our work, dig into it and tried to see things from our 
perspective, something which I'm very thankful for. Getting this kind of 
feedback is a rare thing nowadays, so again, thank you Andrew.

> I /am/ still concerned about two points.
> 
> 1) It seem a GitHub account is required for anyone who wishes to
> actually commit a change. I use GitHub or another project so I have an
> account and (clearly) have no special scruples about relying on GitHub,
> at least no more so than much of the other communal infrastructure I am
> obliged -- perhaps, sometimes expediently -- to use. However, I can
> understand see that some people (even, perhaps, some reviewers) might
> have reasonable ethical concerns about having to rely on GitHub -- not
> just for all the tooling it provides but even for simply hosting the
> code. I am glad that you have offered a path for them still to
> contribute (ending in contribution via a proxy sponsor) and I hope that
> will be adequate for anyone with ethical concerns.

That is our hope as well.

So far we haven't gotten any indications that manual creation of pull 
requests for contributions coming in via the mailing lists won't be 
enough, but we have ideas on how to scale that process if necessary. One 
could for example imagine a bot that scans the mailing lists for initial 
RFR emails, extracts the cr.openjdk.java.net link to the webrev from the 
email body and automatically creates a pull request. This is similar in 
spirit to how the bots today find mailing list emails replying to 
comments on pull requests.

> 2) I am pleased to hear that you very much wish for it to remain
> possible to initiate and proceed with RFRs using the existing process of
> creating and publishing a webrev and then negotiating its progress via
> email only.

Yes, the mailing lists are *not* going away. Just to be clear here: as I 
wrote in my last reply, the patches discussed in such an RFR email 
thread will eventually have to become one or more pull requests. However 
if all participants in such an RFR email thread agree that the patches 
are good, then those pull requests would just be a formality to 
integrate the changes.

> Also, it is very welcome that that RFRs initiated by a
> GitHub PR will be automatically provided with an initial webrev at PR
> open and upated webrevs as new commits are pushed, so making it very
> easy 3rd parties (especially reviewers) to contribute in their usual way
> without having to proceed via GitHub. However, I am still not confident
> that the mixed use of PRs and email will work.
> 
> I still fear that this mixed usage is not going to result in as coherent
> a dialogue (or, perhaps, polylogue might be a better term to use) as
> pure email although I accept your arguments about how it may well suit
> some people better than others. The other thing I still rank as
> problematic is an expectation that those who use the PR-based process
> are not going to be aware of or fully involved in any discussion that
> starts outside of GitHub. Once again, this is derived from my Graal
> project experience where mailing to the Graal list was very much like
> shouting into a void. Of course, I accept that this is not the Graal
> project and that you are trying very hard to ensure that we retain a
> workflow that Graal never had in the first place. So, I understand that
> this expectation may turn out to be misplaced.

It would disappoint me if OpenJDK contributors would stop paying 
attention to the mailing lists. As I have stated elsewhere in my replies 
to others, the OpenJDK mailing lists are used for much more than RFR emails:
- design discussions
- call for votes
- announcements
- JEP discussions
- occasional bug reports
- etc.

So, again, it is fully expected that frequent OpenJDK contributors 
follow the relevant mailing lists for the areas they contribute to. You 
can therefore reasonably expect that a frequent OpenJDK contributor is 
aware of discussions that have taken place solely on a relevant mailing 
list.

(I say frequent here because I don't expect a person that in total 
contribute one or two patches to keep up with the mailing lists.)

> I am afraid I'd still prefer not to make what I see as a risky and
> unnecessary move. I accept that it's not possible to prove or disprove
> your or my expectations without actually making the move and seeing what
> happens. If we do go down this path then I'd much prefer to do it slowly
> and piecemeal, with time for assessment of its impact on the project. I
> believe that is also your intention so I'll leave my objections to rest
> here and listen to what others have to say on the matter.

I hope that we have shown by now that project Skara is not rushing 
things. The OpenJDK project Skara was announced July, 2018. Progress has 
since then been reported and discussed at three consecutive OpenJDK 
Committers Workshops (Santa Clara, Aug 2018; Brussels, Feb 2019; Santa 
Clara, Aug 2019). The source code powering Skara has been open source 
since June, 2019. We have let a few OpenJDK projects of different sizes 
and with different policies to trial GitHub + Skara and we are 
continuously improving things based on their feedback.

This iterative strategy has proven to successful so far, so I don't see 
any reason to change it. As I've stated elsewhere in this thread, it has 
always been my intent to transition a few projects at a time and learn 
from their experiences _before_ the main JDK repository would be 
transitioned.

Thanks,
Erik

> regards,
> 
> 
> Andrew Dinn
> -----------
> 
> On 14/11/2019 11:34, Erik Helin wrote:
>> On 11/13/19 6:39 PM, Andrew Dinn wrote:
>>> On 13/11/2019 11:42, Erik Helin wrote:
>>>> Thanks Andrew for reading through it all and providing feedback! Please
>>>> see my replies inline.
>>>
>>> You are very welcome. It definitely merits a considered and fair review.
>>>
>>>>> My experience of GitHub use in the Graal project suggest that this is
>>>>> not entirely the full picture. My view derived from that experience is
>>>>> that the GitHub PR is very much an inferior medium for review
>>>>> discussion. In particular, it definitely fails to be "structurally
>>>>> similar to the existing e-mail and webrev based workflows".
>>>>
>>>> I think the key sentence here is "My experience of GitHub use in the
>>>> Graal project...". Having worked with project on GitHub using Skara
>>>> tooling and projects on GitHub _not_ using Skara tooling, my view is
>>>> that the experiences differs quite a bit.
>>>
>>> I am happy to hear that you will be supplementing the standard GitHub
>>> experience with extra tooling. I also must apologize for not looking at
>>> Skara before writing a critique that perhaps does not do it justice. I
>>> will take a look and post my thoughts.
>>>
>>>> This does not paint the whole picture. You are probably thinking of the
>>>> "Conversation" tab in a GitHub pull request which is meant for *general*
>>>> discussion of the patch that is not attached to any particular change in
>>>> the patch. GitHub's own help documentation [0] states that the
>>>> "Conversation" tab is meant to be used for:
>>>>
>>>>       You can comment on a pull request's Conversation tab to leave
>>>>       general comments, questions, or props.
>>>>
>>>> You are right that comments in the "Conversation" tab are linearized and
>>>> does not support a "tree style" view of comments.
>>>>
>>>> The good news are that the _other_ form of comments available on a
>>>> GitHub pull request, a so-called "review comment" or "file comment" [1],
>>>> does support replies in the form of a tree. These comments are filed
>>>> under the "Files changed" tab in a pull request.
>>>
>>> I'm afraid that makes it sound worse not better. This bifurcates the
>>> review process into 'general comments' vs 'critique on the code per se',
>>> with the former forced into a  linear frame while the latter can benefit
>>> from divide and conquer distinctions. I fear that's an artificial
>>> division of concerns that will make it harder still to reconcile general
>>> points with the evidence base for deriving them.
>>>
>>>> Personally I think of the comments in the "Conversation" tab as replies
>>>> to the "00" email in a classic patch series email and the "review
>>>> comments" in the "Files changed" tab as replies to the "0X" emails
>>>> actually containing patches. Do you follow?
>>>
>>> Well, I agree that sometimes that will work ok. However, I may be wrong
>>> here but I believe that the code comments are tied to a specific point
>>> in the file/change set. That's ok when a comment only relates to one
>>> change. When you wish comment on the combined effect of two or more
>>> changes at different points in the file the requirement to attach a
>>> comment to one specific change doesn't really work. Punting such
>>> comments to the 00 thread doesn't do it either. Quoting the relevant
>>> lines in a follow-up note does bring the relevant changes into the scope
>>> of preceding and subsequent replies.
>>>
>>> The root point here is that the GitHub PR presentation model is going to
>>> impose constraints on the way we structure and link our review comments
>>> because it needs them to conform to it's information structure that is
>>> essentially driven by it's web page layout. email is inherently much
>>> more flexible because it is just a hierarchically linked set of texts.
>>> Of course, GitHub provides aids to simplify the task of creating and
>>> linking information in its format. Whereas with email one has to
>>> explicitly structure the information by editing it and placing it in a
>>> reply sequence. I can see GitHub making some things a lot easier.
>>> However, my concern is that it may well make some important things that
>>> we do difficult or maybe even impossible (at the least impractical).
>>
>> Yes, email is more free form (and thus more flexible) than essentially
>> any other medium. The interesting question to ponder here is if this
>> flexibility helps or hurts the majority of reviews being conducted in
>> OpenJDK on a daily basis?
>>
>> I think we all have experienced where the flexibility of mailing lists
>> and free-form emails have resulted in less than stellar experiences as
>> well:
>> - a thread gets forked to two different mailing lists because one
>>    participant forgets to press "Reply all" and instead presses
>>    "Reply List"
>> - a reviewer joins a "RFR" thread after the patch has been pushed since
>>    there is no way see on a mailing list whether the patch has been
>>    pushed or not
>> - a contributor forgets to mentions one more important facts in a RFR
>>    email such as links to JBS issue(s), which commit the patch should
>>    be applied upon or does not supply an incremental webrev
>>
>> Looking at the jdk/jdk repository [0] we see that over 95% of all
>> commits have 3 or less reviewers [1]. Manually skimming the last weeks
>> of emails on core-libs-dev and hotspot-dev seems (to me) to confirm
>> this: the majority of the conversations have 1 - 4 participants (one
>> author and up to three reviewers). Looking at the conversations it also
>> seems (again, to me) that a majority of them take place on a singe list
>> and do not move/transfer to other mailing lists.
>>
>> With the above in mind I do believe that flexibility and expressiveness
>> of pull requests combined with Skara's mailing list interoperability is
>> powerful enough. The experience from OpenJFX transitioning to GitHub +
>> Skara seem to confirm this, but this of course my interpretation based
>> on their feedback.
>>
>> Now, as I have pointed in some other replies, we do *not* lose the
>> mailing lists just because we migrate to GitHub + Skara. If you have a
>> really large patch where you are expecting 5+ reviewers spanning 3+
>> mailing lists, then it is more than fine to send an RFC email with a
>> webrev, just like we do today. The patch will eventually have to become
>> one more pull requests and subject to the review process, but there is
>> nothing stopping you from using the mailing lists for the initial feedback.
>>
>>>> Now the interesting question is of course how the Skara tooling
>>>> translates these kinds of comments back-and-forth between mailing lists
>>>> and GitHub. Here I'm happy to report that the Skara tooling does proper
>>>> replying, which will cause the "review comments" created under the
>>>> "Files changed" tab on a pull request to result in a tree-view (in email
>>>> clients that support such views).
>>>>
>>>> You can see of some of this work on the openjfx-dev mailing list [2].
>>>> Now keep in mind if you are looking at Pipermail rendering of a mailing
>>>> list, which lacks some features that most email clients supports (such
>>>> as folding quoted text and nested replies beyond three levels). A
>>>> mailing list version of a pull request will in general have the
>>>> following structure:
>>>>
>>>> - RFR: Fix a bug <-- Email describing patch
>>>>     - Re: RFR: Fix a bug <-- This is a general comment
>>>>       - Re: RFR: Fix a bug <-- This is a reply to the general comment
>>>>     - Re: [Rev 01]: RFR: Fix a bug <-- The author updated the patch
>>>>       - Re: [Rev 01]: RFR: Fix a bug <-- comment from reviewer A on 01
>>>>         - Re: [Rev 01]: RFR: Fix a bug <-- reply to comment from A on 01
>>>>       - Re: [Rev 01]: RFR: Fix a bug <-- comment from reviewer B on 01
>>>>         - Re: [Rev 01]: RFR: Fix a bug <-- reply to comment from B on 01
>>>>     - Re: [Rev 02]: RFR: Fix a bug <-- The author updated again
>>>>     - Re: [Approved] RFR: Fix a bug <-- Reviewer A approved the patch
>>>>     - Re: [Approved] RFR: Fix a bug <-- Reviewer B approved the patch
>>>>     - Re: [Integrated] RFR: Fix a bug <-- Author integrated the patch
>>>>
>>>> We are tweaking this structure as we get more and more experience with
>>>> it, but at the moment I'm fairly satisfied with threading and the tree
>>>> layout. If you have any feedback on this structure/layout, then we are
>>>> happy to hear it.
>>>
>>> Well, I will of course look into this further. Thanks you for pointing
>>> me at the relevant matter to consider.
>>
>> Thanks for digging into this, we are longing for feedback :)
>>
>>>> Going in the other direction, mailing list -> GitHub, we also try to
>>>> preserve the mailing list structure as much as possible. This is
>>>> actually a harder challenge, since an email is less structured compared
>>>> to a comment on GitHub. An example of this direction can be found in
>>>> pull request 231 for the Skara repository [3] where you can see the
>>>> thread (which is a tree) from skara-dev at openjdk.java.net [4] being
>>>> "flattened" and uses quotation to try to preserve the flow.
>>>
>>> Yes, of course. However, rather than express that as 'email is less
>>> structured' one might equally state it as 'email is much more flexible'
>>> or 'GitHub is much more rigid' ;-)
>>
>> :) Please see my reply above for my thoughts on this matter.
>>
>>>> Here we are currently working on the /cc command that can be used in a
>>>> comment on pull request, for example:
>>>>
>>>>       /cc build-dev hotspot-dev
>>>
>>> Thanks for clarifying
>>
>> No problem at all.
>>
>>>> I hope that is does not come across that we are taking mailing list
>>>> interopability as a minor detail? I think it is fair to say that this is
>>>> the Skara feature we have spent the most time working on and are
>>>> constantly improving in order to provide a good experience.
>>>
>>> No, I was not addressing that at you -- apologies if ti came across like
>>> that as I very much appreciate that you do take this seriously -- rather
>>> at all other OpenJDK project members to try to raise awareness of the
>>> importance of getting a change like this right.
>>
>> No feelings hurt, I just wanted to stress how deeply we care about this
>> matter. And for the record, yes, I'm well aware of the importance of
>> getting this change right. I have worked in this community for more than
>> 7 years now and have a deep respect for it and its members.
>>
>> Thanks,
>> Erik
>>
>> [0]: https://git.openjdk.java.net/jdk
>> [1]: https://gist.github.com/edvbld/a03dec6c84bad054d7529461a38efdf5
>>
>>> regards,
>>>
>>>
>>> Andrew Dinn
>>> -----------
>>> Senior Principal Software Engineer
>>> Red Hat UK Ltd
>>> Registered in England and Wales under Company Registration No. 03798903
>>> Directors: Michael Cunningham, Michael ("Mike") O'Neill
>>>
>>
> 



More information about the discuss mailing list