From cnewland at chrisnewland.com Mon Jul 16 22:47:28 2018
From: cnewland at chrisnewland.com (Chris Newland)
Date: Mon, 16 Jul 2018 23:47:28 +0100
Subject: hsdis / UPL
Message-ID: <38f90e7e845fb642e609daa8f7ea2669.squirrel@excalibur.xssl.net>

Hi all,

I recently learned [1] that Graal Enterprise Edition, (UPL-licensed), contains a hsdis binary that matches the architecture of the Graal download.

I looked into ways of distributing hsdis with JITWatch a few years back [2] but couldn't cut the GPLv2/v3 Gordian knot.

The OpenJDK hsdis sources in hg still have the GPLv2 license so I'm just wondering if someone can provide clarity on whether hsdis binaries can be included in other software now?

Many thanks,

Chris

[1] https://twitter.com/DonaldOJDK/status/1017814619463802883
[2] http://mail.openjdk.java.net/pipermail/adoption-discuss/2015-May/000833.html

From doug.simon at oracle.com Tue Jul 17 07:25:43 2018
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 17 Jul 2018 09:25:43 +0200
Subject: hsdis / UPL
In-Reply-To: <38f90e7e845fb642e609daa8f7ea2669.squirrel@excalibur.xssl.net>
References: <38f90e7e845fb642e609daa8f7ea2669.squirrel@excalibur.xssl.net>
Message-ID: <3984B1BB-64AE-4F84-9EF5-50F834EEEE84@oracle.com>

Hi Chris,

We're seeking clarity on this ourselves internally and I'll get back to you once I have a better answer.

-Doug

From volker.simonis at gmail.com Tue Jul 17 08:09:45 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 17 Jul 2018 10:09:45 +0200
Subject: hsdis / UPL
In-Reply-To: <38f90e7e845fb642e609daa8f7ea2669.squirrel@excalibur.xssl.net>
References: <38f90e7e845fb642e609daa8f7ea2669.squirrel@excalibur.xssl.net>
Message-ID:

That's really strange because the hsdis binaries are available for download from the Graal GitHub project [1] but the corresponding source bundles [2] don't seem to contain the hsdis sources. So I wonder how the hsdis libraries have been built. As the downloadable hsdis libraries are clearly statically linked against binutils/bfd, the corresponding sources should be available if the download site doesn't want to violate the GPL license.

[1] https://github.com/oracle/graal/releases/tag/hsdis-20180108
[2] https://github.com/oracle/graal/archive/hsdis-20180108.zip

From fanf42 at gmail.com Mon Jul 23 12:55:58 2018
From: fanf42 at gmail.com (Francois)
Date: Mon, 23 Jul 2018 14:55:58 +0200
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
Message-ID: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>

Hello,

I'm not at all sure this is the correct mailing list for this, so sorry if it is not, and please let me know where I should post my concerns.

So, in JDK 11, the long-deprecated Thread.stop() is removed (https://bugs.openjdk.java.net/browse/JDK-8204243), and I have a use case that I don't know how to deal with afterwards.

We use Nashorn to provide users with an embedded scripting language on our platform. Only simple things are accessible, and we try to forbid dangerous things like killing the host process, deleting the underlying filesystem, or infinite while loops that never pause to check for Thread.interrupt. The first two kinds of problem are easily managed with a custom security manager and a thread factory that uses it for starting the sandboxed process (i.e. nashorn.eval).

But I don't know how to manage the while(true){} loop. In that case, the Nashorn thread does not respond to Thread.interrupt (it is a wide source of surprise and wonder on the internet, see for example: https://stackoverflow.com/questions/1601246/java-scripting-api-how-to-stop-the-evaluation/1601465#1601465 and the problem is more generally that the Java ScriptEngine API does not provide a way to interrupt the underlying process once started). So I used to have a timeout counter and a Thread.stop() on it. I do understand that it could still raise problems, even with the custom thread factory and the very specific use case, but it was the only solution I knew of at hand.

But now, I don't have any way at all to prevent a user from starting an unstoppable thread (by malice or by error).

Any help would be much appreciated on that topic.

--
Francois ARMAND - @fanf42
https://github.com/Normation/rudder
http://www.normation.com

From Alan.Bateman at oracle.com Mon Jul 23 13:26:04 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 23 Jul 2018 14:26:04 +0100
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>
References: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>
Message-ID:

On 23/07/2018 13:55, Francois wrote:
> Hello,
>
> I'm not at all sure this is the correct mailing list for this, so sorry if it is not, and please let me know where I should post my concerns.
>
> So, in JDK 11, the long-deprecated Thread.stop() is removed (https://bugs.openjdk.java.net/browse/JDK-8204243), and I have a use case that I don't know how to deal with afterwards.

Have you been using stop(Throwable) or the no-arg stop()? The method that is removed is stop(Throwable). This method was re-specified to throw UnsupportedOperationException in Java SE 8 so it seems unlikely that there is code running on JDK 8 or newer that relies on this method.

I realize this isn't answering your question on how to deal with looping threads but I think it's important to understand that the removed method has been crippled for several releases and removing it in 11 should not be disruptive.

-Alan

From fanf42 at gmail.com Mon Jul 23 13:40:36 2018
From: fanf42 at gmail.com (Francois)
Date: Mon, 23 Jul 2018 15:40:36 +0200
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To:
References: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>
Message-ID:

Oh, thank you! I misread the ticket and missed the fact that it was only stop(Throwable) that was removed. I don't use that one, but .stop() (no arg). So it's a relief: even if I don't have an answer now, at least I don't have a ticking bomb to deal with :)

Thank you very much!

(And I'm still interested in a general, clean and supported solution to deal with the described problem.)

--
Francois ARMAND - @fanf42
https://github.com/Normation/rudder
http://www.normation.com

From roger.riggs at oracle.com Mon Jul 23 13:43:07 2018
From: roger.riggs at oracle.com (Roger Riggs)
Date: Mon, 23 Jul 2018 09:43:07 -0400
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>
References: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com>
Message-ID: <8eab2357-b5bf-9512-16d5-1f9e07db1026@oracle.com>

Hi Francois,

[The core-libs-dev at openjdk.java.net would also be a good mail list for the discussion.]

There is no simple solution to stopping runaway computations and retaining the integrity of the remaining system but there are mitigations that can incrementally reduce the impact.

If you control the scripting language that is exposed, you can build in checkpoints for runaway cpu consumption. Lightweight checks such as elapsed clock time (System.nanoTime) can be used to periodically check for accumulated cpu time, and other excessive resource usage.
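
The checkpoint idea can be sketched in Java roughly as follows. This is only a minimal illustration, assuming the host controls the interpreter loop and can inject a call at each loop back-edge, which is not the case for a stock Nashorn engine. The class `DeadlineCheckpoint` and its methods are hypothetical names, not part of the JDK or of Nashorn, and the sketch measures elapsed wall-clock time with System.nanoTime rather than CPU time.

```java
/** Hypothetical helper: a time budget that cooperative code polls. */
class DeadlineCheckpoint {
    private final long deadlineNanos;

    /** Starts the clock now; budgetMillis is the allowed wall-clock time. */
    DeadlineCheckpoint(long budgetMillis) {
        this.deadlineNanos = System.nanoTime() + budgetMillis * 1_000_000L;
    }

    /** True once the budget is used up (overflow-safe nanoTime comparison). */
    boolean expired() {
        return System.nanoTime() - deadlineNanos >= 0;
    }

    /** Meant to be called at loop back-edges or statement boundaries. */
    void check() {
        if (expired()) {
            throw new IllegalStateException("script exceeded its time budget");
        }
    }

    public static void main(String[] args) {
        DeadlineCheckpoint checkpoint = new DeadlineCheckpoint(50);
        long iterations = 0;
        try {
            while (true) {          // stands in for a runaway script loop
                iterations++;
                checkpoint.check(); // the cooperative checkpoint
            }
        } catch (IllegalStateException e) {
            System.out.println("stopped after " + iterations + " iterations: " + e.getMessage());
        }
    }
}
```

The loop only terminates because it calls check() itself; a script body that the host cannot instrument, such as a bare while(true){} evaluated by an unmodified engine, would not be stopped by this approach.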

Impact on other processes can be reduced by throttling the computation by introducing wait times to avoid exceeding some threshold of the ratio between CPU time and clock time. Depending on what's in the loop, throwing InterruptedException may sufficiently disrupt the loop so it terminates. If the script is really while(true){}; then perhaps the interpreter can flag that as a potential bug.

Regards, Roger

From christian.humer at oracle.com Mon Jul 23 16:37:21 2018
From: christian.humer at oracle.com (Christian Humer)
Date: Mon, 23 Jul 2018 18:37:21 +0200
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To:
References: <8eab2357-b5bf-9512-16d5-1f9e07db1026@oracle.com>
Message-ID:

Hi Francois,

Graal.js, the JavaScript implementation on top of GraalVM, supports reliable timeouts of *any* JavaScript code, including endless loops without bodies.

*How to use it:*
http://www.graalvm.org/docs/graalvm-as-a-platform/embed/#reliable-timeouts-for-malicious-code

*How it works:*
We use HotSpot's safepoint mechanism to trigger a deoptimization of the optimized machine code generated for while(true). Safepoints are also used, for example, whenever the garbage collector needs to intercept the execution and do its work. After the deoptimization, we use the Truffle instrumentation framework to insert a forced stack unwind on the next statement executed. A forced stack unwind is implemented using an exception thrown by the instrumentation. The forced stack unwind will skip any finally blocks of the language, therefore the context needs to be closed, as the application state is invalid after that. It does *not* rely on Thread#stop for its implementation.

More info on how to migrate from Nashorn: https://medium.com/graalvm/oracle-graalvm-announces-support-for-nashorn-migration-c04810d75c1f

Hope this is helpful,

- Christian Humer

From fanf42 at gmail.com Mon Jul 23 19:13:33 2018
From: fanf42 at gmail.com (Francois)
Date: Mon, 23 Jul 2018 21:13:33 +0200
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To:
References: <8eab2357-b5bf-9512-16d5-1f9e07db1026@oracle.com>
Message-ID:

Thank you, that seems to be exactly what I want for a long-term solution. I am definitely keeping it at hand for when I do the migration.

--
Francois ARMAND - @fanf42
https://github.com/Normation/rudder
http://www.normation.com

From fanf42 at gmail.com Mon Jul 23 19:16:41 2018
From: fanf42 at gmail.com (Francois)
Date: Mon, 23 Jul 2018 21:16:41 +0200
Subject: Thread.strop removed: how to deal with a rogue ScriptEngine thread?
In-Reply-To: <8eab2357-b5bf-9512-16d5-1f9e07db1026@oracle.com>
References: <0cf25395-efed-5cd8-8fa0-a54e0d2c4e1a@gmail.com> <8eab2357-b5bf-9512-16d5-1f9e07db1026@oracle.com>
Message-ID: <1d89626c-d754-61d7-7264-4ad06352c3b4@gmail.com>

Thanks,

Unfortunately I don't have control over the underlying scripting engine / language (it's just Nashorn/JS). But I can certainly add a sanity check, at least for trivial loops. I will keep the idea, thanks!

--
Francois ARMAND - @fanf42
https://github.com/Normation/rudder
http://www.normation.com

From joe.darcy at oracle.com Fri Jul 27 03:42:12 2018
From: joe.darcy at oracle.com (joe darcy)
Date: Thu, 26 Jul 2018 20:42:12 -0700
Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources
Message-ID:

Hello,

The source code management (SCM) system of a software project is a fundamental piece of its infrastructure and workflows. Starting in February 2008, the source code of different JDK releases and supporting projects has been hosted in Mercurial repositories under http://hg.openjdk.java.net/. Code reviews of JDK changes are typically conducted as discussions in mailing lists over small patches sent to one or more lists or over webrevs hosted on cr.openjdk.java.net. Since 2008, many open source projects have successfully adopted more efficient SCM and review tooling, in some cases provided by third parties.

In order to help OpenJDK contributors be more productive, both seasoned committers and relative newcomers, the Skara project proposes to investigate alternative SCM and code review options for the JDK source code, including options based upon Git rather than Mercurial, and including options hosted by third parties.

The Skara project intends to build prototypes of hosting the JDK 12 sources under different providers.

The evaluation criteria to consider include but are not limited to:

    * Performance: time for clone operations from master repos, time of local operations, etc.
    * Space efficiency
    * Usability in different geographies
    * Support for common development environments such as Linux, Mac, and Windows
    * Able to easily host the entire history of the JDK and the projected growth of its history over the next decade
    * Support for general JDK code review practices
    * Programmatic APIs to enable process assistance and automation of review and processes

If one or more prototypes indicate a different SCM arrangement offers substantial improvements over the current situation, the Skara project will shepherd a JEP to change the SCM for the JDK.

I propose to lead the project with the initial reviewers including but not limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson (erikj), Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg (rwestberg).

We suggest the build group sponsor this work.

Changing the bug tracking system is out of scope for this project and is *not* under consideration.

Comments?

Cheers,

-Joe

From martijnverburg at gmail.com Fri Jul 27 08:31:28 2018
From: martijnverburg at gmail.com (Martijn Verburg)
Date: Fri, 27 Jul 2018 09:31:28 +0100
Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources
In-Reply-To:
References:
Message-ID:

Hi Joe,

The Adoption group has been having a lot of "fun"
with importing hg into git (one of the options that you'll likely explore) for our git mirrors.

We've got a bunch of war stories and, most importantly, working scripts that capture history and tags and all that good stuff. They're Apache 2 licensed at the moment but I'm sure the authors (all being Adoption members) will be happy to relicense or dual license.

Other useful knowledge we can share is around how we manage repos, projects, issues, work in progress and code reviews at the Adopt build farm. We've learned some valuable lessons about what works and what doesn't at a reasonable scale of number of contributors around a git-based workflow.

Assuming this project kicks off, we're all very happy to help out!

Cheers,
Martijn

--
Cheers, Martijn (Sent from Gmail Mobile)

From neugens at redhat.com Fri Jul 27 09:26:26 2018
From: neugens at redhat.com (Mario Torre)
Date: Fri, 27 Jul 2018 11:26:26 +0200
Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources
In-Reply-To:
References:
Message-ID:

Hi Martijn,

How many contributions from developers in those git mirrors came into OpenJDK (or even, how many contributions happened on those mirrors outside of OpenJDK development)?

I think the point about performance is sound [1], but I would be very careful about introducing a new SCM: lots of developers are used to Mercurial now, and even if Git is probably just a small learning step away, I would argue that this is unnecessary for the people who are already contributing.

What are the actual benefits of this change? I mean, yes, the list of evaluation criteria, but in practice, why do we need to change?

I don't think the remote possibility of attracting a few more people is an answer; the current workflow is simple enough, while the biggest barrier is the process, not the SCM. I doubt we want to change the process (and I'm not advocating this either!), so the benefit of going to git (unless you mean we're exploring svn or cvs!!) seems limited compared to the overhead.

I may be just too old school, but I think we should be investing our energies somewhere else.

Cheers,
Mario

[1] It really is terrible now with a single repo, but is it really a problem of Mercurial? Git also carries all the history in the clone; did somebody do some testing on this, and I mean, on the same servers and network?

--
Mario Torre
Associate Manager, Software Engineering
Red Hat GmbH
9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898

From martijnverburg at gmail.com Fri Jul 27 10:50:15 2018
From: martijnverburg at gmail.com (Martijn Verburg)
Date: Fri, 27 Jul 2018 11:50:15 +0100
Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources
In-Reply-To:
References:
Message-ID:

Hi Mario,

AdoptOpenJDK is not intended to be a place for source code development of OpenJDK, so we deliberately don't accept patches there with the exception of one patch to support a particularly esoteric platform which OpenJDK did not want to support (completely understandable). We want all source code development to happen at OpenJDK.

We've had a fair number of requests to allow patches because using Git/GitHub has less friction for those folks (in particular new developers to OpenJDK). More importantly, they would like to see their patches go through our build and test pipeline before submitting to OpenJDK (upstream). We've still got some work to do on the build farm (e.g. adding flexibility to build and test the various branches for amber and panama etc) before we'd discuss / consider doing this, but all of the Infrastructure as Code that we have could be utilised to do so (and my understanding is that a few vendors are trying just that).
-- As a side note SCM workflow hubs BitBucket and GitHub have powerful hooks where you could also list and/or block a Pull Request / Issue on patches, for example: * Have you discussed this change on X mailing list? * Have you signed your CLA? * Have you written / extended a jtreg test for this? They also has the built in mechanisms to support reviewers and a patch lifecycle (authored, builds OK on all platforms, passes test pipeline, ready for review, reviewed etc). I fairly regularly see exchanges on the mailing list where folks are asking for a reviewer to review a patch, get sent to a different mailing list(s), find out post commit that their patch breaks something like the Zero compiler etc, etc. I think there's an awful lot that a shared SCM workflow hub / CI pipeline can do for OpenJDK developers in terms of managing their patch lifecycle, freeing them up to work on the more important stuff. Cheers, Martijn On Fri, 27 Jul 2018 at 10:27, Mario Torre wrote: > Hi Martijn, > > How many contributions from developers in those git mirrors came into > OpenJDK (or even, how many contributions happened on those mirrors > outside of OpenJDK development?). > > I think the point about performance is sound [1], but I would be very > careful to introduce a new SCM, lots of developers are used with > mercurial now, and even if git is probably just a small learning step > away, I would argue that this is unnecessary to the people who are > already contributing. > > What are the actual benefit for this change? I mean, yes, the list of > evaluation criteria, but in practice, why do we need to change? > > I don't think the remote possibility of attracting a few more people > is an answer, the current work flow is simple enough, while the > biggest barrier is the process, not the SCM. I doubt we want to change > the process (and I'm not advocating this either!), so the benefit to > go git (unless you mean we're exploring svn or cvs!!) seems limited > compared to the overhead. 
> > I may be just too old school, but I think we should be investing the > energies somewhere else. > > Cheers, > Mario > > [1] It really is terrible now with a single repo, but is it a problem > of mercurial really? Git also carries all the history in the clone, > did somebody do some testing on this, and I mean, on the same servers > and network? > > > On Fri, Jul 27, 2018 at 10:31 AM, Martijn Verburg > wrote: > > Hi Joe, > > > > The Adoption group has been having a lot of "fun" with importing hg into > > git (one of the options that you'll likely explore) for our git mirrors. > > > > We've got a bunch of war stories and most importantly working scripts > that > > capture history and tags and all that good stuff. They're Apache 2 > > licensed at the moment but I'm sure the authors (all being Adoption > > members) will be happy to re license or dual license. > > > > Other useful knowledge we can share is around how we manage repos, > > projects, issues, work in progress and code reviews at the Adopt build > > farm. We've learned some valuable lessons of what works and what doesn't > at > > a reasonable scale of number of contributors around a git based workflow. > > > > Assuming this project kicks off We're all very happy to help out! > > > > Cheers, > > Martijn > > > > On Fri, 27 Jul 2018 at 05:13, joe darcy wrote: > > > >> Hello, > >> > >> The source code management (SCM) system of a software project is a > >> fundamental piece of its infrastructure and workflows. Starting in > >> February 2008, the source code of different JDK releases and supporting > >> projects has been hosted in Mercurial repositories under > >> http://hg.openjdk.java.net/. Code reviews of JDK changes are typically > >> conducted as discussions in mailing lists over small patches sent to one > >> or more lists or over webrevs hosted on cr.openjdk.java.net.
Since > 2008, > >> many open source projects have successfully adopted more efficient SCM > >> and review tooling, in some cases provided by third parties. > >> > >> In order to help OpenJDK contributors be more productive, both seasoned > >> committers and relative newcomers, the Skara project proposes to > >> investigate alternative SCM and code review options for the JDK source > >> code, including options based upon Git rather than Mercurial, and > >> including options hosted by third parties. > >> > >> The Skara project intends to build prototypes of hosting the JDK 12 > >> sources under different providers. > >> > >> The evaluation criteria to consider include but are not limited to: > >> > >> * Performance: time for clone operations from master repos, time of > >> local operations, etc. > >> > >> * Space efficiency > >> > >> * Usability in different geographies > >> > >> * Support for common development environments such as Linux, Mac, > >> and Windows > >> > >> * Able to easily host the entire history of the JDK and the > >> projected growth of its history over the next decade > >> > >> * Support for general JDK code review practices > >> > >> * Programmatic APIs to enable process assistance and automation of > >> review and processes > >> > >> If one or more prototypes indicate a different SCM arrangement offers > >> substantial improvements over the current situation, the Skara project > >> will shepherd a JEP to change the SCM for the JDK. > >> > >> I propose to lead the project with the initial reviewers including but > >> not limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson > >> (erikj), Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg > >> (rwestberg). > >> > >> We suggest the build group sponsor this work. > >> > >> Changing the bug tracking system is out of scope for this project and is > >> *not* under consideration. > >> > >> Comments? 
> >> > >> Cheers, > >> > >> -Joe > >> > >> -- > > Cheers, Martijn (Sent from Gmail Mobile) > > > > -- > Mario Torre > Associate Manager, Software Engineering > Red Hat GmbH > 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 > From neugens at redhat.com Fri Jul 27 11:11:58 2018 From: neugens at redhat.com (Mario Torre) Date: Fri, 27 Jul 2018 13:11:58 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: On Fri, Jul 27, 2018 at 12:50 PM, Martijn Verburg wrote: > Hi Mario, > > AdoptOpenJDK is not intended to be a place for source code development of > OpenJDK, so we deliberately don't accept patches there with the exception of > one patch to support a particularly esoteric platform which OpenJDK did not > want to support (completely understandable). We want all source code > development to happen at OpenJDK. > > We've had a fair number of requests to allow patches because using > Git/GitHub has less friction for those folks (in particular new developers > to OpenJDK). More importantly they would like to see their patches go > through our build and test pipeline before submitting to OpenJDK (upstream). Right, but that is not solved if we move to GitHub, because the process of OpenJDK isn't compatible to this process, and that has nothing to do with Git vs Mercurial I think, GitHub is just another place to host the code, the OpenJDK process would the same, so the only difference would be using GitHub for patch review rather than webrev, but if any, I would find that more dispersive for the reviewer, this is likely due to my own experience with GitHub, where I found the need to have eyes open everywhere to review code, I don't think it's made for such large projects as OpenJDK (and the Linux Kernel people seem to agree on this). 
> As a side note SCM workflow hubs BitBucket and GitHub have powerful hooks > where you could also list and/or block a Pull Request / Issue on patches, > for example: > > * Have you discussed this change on X mailing list? > * Have you signed your CLA? > * Have you written / extended a jtreg test for this? > > They also has the built in mechanisms to support reviewers and a patch > lifecycle (authored, builds OK on all platforms, passes test pipeline, ready > for review, reviewed etc). > > I fairly regularly see exchanges on the mailing list where folks are asking > for a reviewer to review a patch, get sent to a different mailing list(s), > find out post commit that their patch breaks something like the Zero > compiler etc, etc. I think there's an awful lot that a shared SCM workflow > hub / CI pipeline can do for OpenJDK developers in terms of managing their > patch lifecycle, freeing them up to work on the more important stuff. But is that really necessary? Also, I think the GitHub process moves the burden to the reviewers, instead of making life easier for them, and we lack this resource more than anything. If there were a massive return in patches from the community that would be interesting to explore, but I don't see this level of involvement from outside. Instead, I fear it would make life harder for people who already contribute. After all, it's not that hard to become a committer really, and the barrier being the same with or without GitHub, I can't understand what's difficult about webrev that GitHub would solve without introducing unneccessary overhead. We've been there before, when we discussed helping people compile OpenJDK, my objection was that if someone isn't able to invoke make I'm not sure I want his patch in ;) Same argument here, if somebody claims that he can't contribute because it's too hard to run "webrev" then perhaps I don't miss that contribution. All that, of course, assuming we're not touching the process. 
If we talk about the process that's a different story, and again, the benefit of GitHub is really on its process work flow (if you happen to like it, which I don't), but this is not compatible with our current one. Git vs Mercurial as SCM... well I don't see that big of a case here either but well... In any case, if we want to change, I think the *existing* community of OpenJDK should be allowed to properly vote on this. Cheers, Mario -- Mario Torre Associate Manager, Software Engineering Red Hat GmbH 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 From brian.goetz at oracle.com Fri Jul 27 15:45:27 2018 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 27 Jul 2018 11:45:27 -0400 Subject: Method and Field Literals In-Reply-To: <876194083.862388.1529433658434.JavaMail.zimbra@u-pem.fr> References: <876194083.862388.1529433658434.JavaMail.zimbra@u-pem.fr> Message-ID: <38876efa-0dc7-3ff7-9ae7-04ff647e029a@oracle.com> > The non easy way is to make the method reference Serializable, serialize it and extract the Serialization proxy which contains the class, the method etc. In addition to non-easy, this is also non-safe, non-cost-free, etc. Let's put this in the category of "dangerous workarounds" :) From martijnverburg at gmail.com Fri Jul 27 16:59:33 2018 From: martijnverburg at gmail.com (Martijn Verburg) Date: Fri, 27 Jul 2018 17:59:33 +0100 Subject: Method and Field Literals In-Reply-To: <38876efa-0dc7-3ff7-9ae7-04ff647e029a@oracle.com> References: <876194083.862388.1529433658434.JavaMail.zimbra@u-pem.fr> <38876efa-0dc7-3ff7-9ae7-04ff647e029a@oracle.com> Message-ID: @HereBeDragons - I'm almost tempted to raise a JEP. On Fri, 27 Jul 2018 at 16:45, Brian Goetz wrote: > > The non easy way is to make the method reference Serializable, serialize > it and extract the Serialization proxy which contains the class, the method > etc. > > In addition to non-easy, this is also non-safe, non-cost-free, etc.
> Let's put this in the category of "dangerous workarounds" :) > > > -- Cheers, Martijn (Sent from Gmail Mobile) From joe.darcy at oracle.com Fri Jul 27 20:10:20 2018 From: joe.darcy at oracle.com (joe darcy) Date: Fri, 27 Jul 2018 13:10:20 -0700 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: Hello Mario, On 7/27/2018 2:26 AM, Mario Torre wrote: > Hi Martijn, > > How many contributions from developers in those git mirrors came into > OpenJDK (or even, how many contributions happened on those mirrors > outside of OpenJDK development?). > > I think the point about performance is sound [1], but I would be very > careful to introduce a new SCM, lots of developers are used with > mercurial now, and even if git is probably just a small learning step > away, I would argue that this is unnecessary to the people who are > already contributing. > [snip] > > [1] It really is terrible now with a single repo, but is it a problem > of mercurial really? Git also carries all the history in the clone, > did somebody do some testing on this, and I mean, on the same servers > and network? > In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation. The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB. Given the number of changesets in JDK 10, extrapolating from the good linear fit between number of changesets and size in the JDK 7 and 8 update releases, one would expect JDK 10 in hg to take in the neighborhood of 450 MB - 500 MB. Therefore, the file moves are certainly bulking up the repo size, contributing to the increased download times. 
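The size figures above can be checked with a little arithmetic. The following is only a sketch using the numbers quoted in this message (412 MB, ~808 MB, ~1553 MB, and the extrapolated 450 MB - 500 MB); the variable and function names are illustrative and do not come from any JDK or Mercurial tooling.

```python
# Repository sizes in MB, as quoted in the message above.
hg_size_mb = {"JDK 8 GA": 412, "JDK 9 GA": 808, "JDK 10 GA": 1553}

# Size range one would expect for JDK 10 from the linear fit mentioned above.
expected_jdk10_mb = (450, 500)


def growth_factors(sizes):
    """Return the size ratio between each release and the one before it."""
    names = list(sizes)
    return {f"{a} -> {b}": sizes[b] / sizes[a] for a, b in zip(names, names[1:])}


def excess_range(actual_mb, expected_mb):
    """Return (min, max) MB by which the actual size exceeds the expected range."""
    low, high = expected_mb
    return actual_mb - high, actual_mb - low


if __name__ == "__main__":
    for step, factor in growth_factors(hg_size_mb).items():
        print(f"{step}: x{factor:.2f}")
    low, high = excess_range(hg_size_mb["JDK 10 GA"], expected_jdk10_mb)
    print(f"JDK 10 excess over extrapolation: {low} to {high} MB")
```

On these figures the repository roughly doubles with each of the two releases that moved many files, and JDK 10 sits about 1 GB above the extrapolated size.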
While a simple import of the JDK sources into git can lead to a larger representation, if the git repo is repacked [1], it will result in a much, much smaller representation. Basically a repack is requesting git use forward and backward differencing with a large window to look for a more compact representation; this will remove the excess size introduced by the file moves. In particular, by running git repack -a -d --depth=250 --window=250 -f on some git imports of the JDK we've done internally, we ended with a git repo size of recent JDK sources of around 300 MB, roughly 5X smaller. That 300 MB includes all the JDK changeset history and tags, etc. In some experiments with hosting providers, cloning such a repacked git repo can be completed within 1 to 3 minutes, which is considerably faster than the clone times we see now from hg.openjdk.java.net. HTH, -Joe [1] https://metalinguist.wordpress.com/2007/12/06/the-woes-of-git-gc-aggressive-and-how-git-deltas-work/ From david at davidherron.com Fri Jul 27 21:07:00 2018 From: david at davidherron.com (David Herron) Date: Fri, 27 Jul 2018 14:07:00 -0700 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: Mario, I did not see in Joe's message a suggestion to "Move to GitHub". I'm sure you're aware there are other possibilities for hosting Git repositories. Gitlab, Gitbucket, Gogs, etc. Those three all attempt a version of the GitHub experience without being GitHub. I personally use Gogs at home for personal projects. What I'd be concerned about is that moving to a web-ui-for-Git doesn't add anything unless you're also going to change processes for managing OpenJDK contributions. Joe's email said nothing about changing processes and explicitly said it would not be about changing the bug tracker. So...
all that would be gained is switching from Mercurial to Git as the repository technology and one would not be gaining any of the benefits of the UI offered by GitHub et al. There may be some gains in performance and the size of the repository. Joe says there is, and I'm sure Joe has some proof to back that up. But ... is the work involved to switch SCM's worth only that gain? The project seems to not be seeking any of the other possible gains from switching to Git. Okay, Joes message also did not explicitly say "switch from Mercurial to Git" ... but in the SCM landscape, aren't those the remaining two choices? And isn't it Git that has the huge marketshare/mindshare advantage? + David Herron On Fri, Jul 27, 2018 at 4:11 AM, Mario Torre wrote: > On Fri, Jul 27, 2018 at 12:50 PM, Martijn Verburg > wrote: > > Hi Mario, > > > > AdoptOpenJDK is not intended to be a place for source code development of > > OpenJDK, so we deliberately don't accept patches there with the > exception of > > one patch to support a particularly esoteric platform which OpenJDK did > not > > want to support (completely understandable). We want all source code > > development to happen at OpenJDK. > > > > We've had a fair number of requests to allow patches because using > > Git/GitHub has less friction for those folks (in particular new > developers > > to OpenJDK). More importantly they would like to see their patches go > > through our build and test pipeline before submitting to OpenJDK > (upstream). 
> > Right, but that is not solved if we move to GitHub, because the > process of OpenJDK isn't compatible to this process, and that has > nothing to do with Git vs Mercurial I think, GitHub is just another > place to host the code, the OpenJDK process would the same, so the > only difference would be using GitHub for patch review rather than > webrev, but if any, I would find that more dispersive for the > reviewer, this is likely due to my own experience with GitHub, where I > found the need to have eyes open everywhere to review code, I don't > think it's made for such large projects as OpenJDK (and the Linux > Kernel people seem to agree on this). > > > As a side note SCM workflow hubs BitBucket and GitHub have powerful hooks > > where you could also list and/or block a Pull Request / Issue on patches, > > for example: > > > > * Have you discussed this change on X mailing list? > > * Have you signed your CLA? > > * Have you written / extended a jtreg test for this? > > > > They also has the built in mechanisms to support reviewers and a patch > > lifecycle (authored, builds OK on all platforms, passes test pipeline, > ready > > for review, reviewed etc). > > > > I fairly regularly see exchanges on the mailing list where folks are > asking > > for a reviewer to review a patch, get sent to a different mailing > list(s), > > find out post commit that their patch breaks something like the Zero > > compiler etc, etc. I think there's an awful lot that a shared SCM > workflow > > hub / CI pipeline can do for OpenJDK developers in terms of managing > their > > patch lifecycle, freeing them up to work on the more important stuff. > > But is that really necessary? > > Also, I think the GitHub process moves the burden to the reviewers, > instead of making life easier for them, and we lack this resource more > than anything. 
If there were a massive return in patches from the > community that would be interesting to explore, but I don't see this > level of involvement from outside. Instead, I fear it would make life > harder for people who already contribute. > > After all, it's not that hard to become a committer really, and the > barrier being the same with or without GitHub, I can't understand > what's difficult about webrev that GitHub would solve without > introducing unneccessary overhead. We've been there before, when we > discussed helping people compile OpenJDK, my objection was that if > someone isn't able to invoke make I'm not sure I want his patch in ;) > > Same argument here, if somebody claims that he can't contribute > because it's too hard to run "webrev" then perhaps I don't miss that > contribution. > > All that, of course, assuming we're not touching the process. If we > talk about the process that's a different story, and again, the > benefit of GitHub is really on its process work flow (if you happen to > like it, which I don't), but this is not compatible with our current > one. Git vs Mercurial as SCM... well I don't see that big of a case > here either but well... > > In any case, if we want to change, I think the *existing* community of > OpenJDK should be allowed to properly vote on this. 
> > Cheers, > Mario > > -- > Mario Torre > Associate Manager, Software Engineering > Red Hat GmbH > 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 > From joe.darcy at oracle.com Sat Jul 28 03:28:33 2018 From: joe.darcy at oracle.com (joe darcy) Date: Fri, 27 Jul 2018 20:28:33 -0700 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: Hello, On 7/27/2018 3:50 AM, Martijn Verburg wrote: > Hi Mario, > > AdoptOpenJDK is not intended to be a place for source code development > of OpenJDK, so we deliberately don't accept patches there with the > exception of one patch to support a particularly esoteric platform > which OpenJDK did not want to support (completely understandable). We > want all source code development to happen at OpenJDK. > > We've had a fair number of requests to allow patches because using > Git/GitHub has less friction for those folks (in particular new > developers to OpenJDK). More importantly they would like to see their > patches go through our build and test pipeline before submitting to > OpenJDK (upstream). > > We've still go some work to go on the build farm (e.g. adding > flexibility to build and test the various branches for amber and > panama etc) before we'd discuss / consider doing this, but all of > Infrastructure as Code that we have could be utilised to do so (and my > understanding is that a few vendors are trying just that). > > -- > > As a side note SCM workflow hubs BitBucket and GitHub have powerful > hooks where you could also list and/or block a Pull Request / Issue on > patches, for example: > > * Have you discussed this change on X mailing list? > * Have you signed your CLA? > * Have you written / extended a jtreg test for this? > > They also has the built in mechanisms to support reviewers and a patch > lifecycle (authored, builds OK on all platforms, passes test pipeline, > ready for review, reviewed etc).
> > I fairly regularly see exchanges on the mailing list where folks are > asking for a reviewer to review a patch, get sent to a different > mailing list(s), find out post commit that their patch breaks > something like the Zero compiler etc, etc. I think there's an awful > lot that a shared SCM workflow hub / CI pipeline can do for OpenJDK > developers in terms of managing their patch lifecycle, freeing them up > to work on the more important stuff. > > Cheers, > Martijn > One of my professors was fond of saying "The essence of civilization is being able to benefit from someone else's experience without having to relive it." In that sense, I think we should also strive to have any SCM transition for the JDK to be a civilized one. For example, the Python community migrated from Mercurial to Git on Github: PEP 512 -- Migrating from hg.python.org to GitHub https://www.python.org/dev/peps/pep-0512/ They put in place automation for contributor agreement checking, etc. I understand the Graal project on Github has OCA checking as well. I think a change like an SCM transition is an opportunity to reassess existing processes and consider better aligning them with the preferred workflow of the new tool set. For example, a git migration for the JDK could involve adjustments to things like syntax of the commit messages and I would expect use of bots working with a hosting provider to automate other checking of various kinds too. For example, besides a contributor agreement bot it would be possible to have a jcheck bot validate a pull request. A CI bot could also notice a pull request, run tests on behalf of the author, and report back a comment summarizing the resulting build and test status.
Cheers, -Joe From scolebourne at joda.org Sat Jul 28 06:16:20 2018 From: scolebourne at joda.org (Stephen Colebourne) Date: Sat, 28 Jul 2018 07:16:20 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: I support the creation of this project. I think the list of tasks/considerations is reasonable, although I believe that git is the only realistic alternative. As a data point, I've used git for everything I do for years. I now pretty much refuse to use anything else. I disliked it strongly initially, but now find it invaluable - there is a learning curve, and as a developer you have to invest time to get over that. Personally, I found Mercurial almost unusable when I had to use it for JSR-310. I have had no desire to fight Mercurial again since JSR-310 was complete. As such a number of bugs have lingered that I might otherwise have tackled (as only Oracle employees are tackling them). Given this, I strongly suspect that use of git (preferably via GitHub) would greatly increase my ability to contribute. thanks Stephen On 27 July 2018 at 04:42, joe darcy wrote: > Hello, > > The source code management (SCM) system of a software project is a > fundamental piece of its infrastructure and workflows. Starting in February > 2008, the source code of different JDK releases and supporting projects has > been hosted in Mercurial repositories under http://hg.openjdk.java.net/. > Code reviews of JDK changes are typically conducted as discussions in > mailing lists over small patches sent to one or more lists or over webrevs > hosted on cr.openjdk.java.net. Since 2008, many open source projects have > successfully adopted more efficient SCM and review tooling, in some cases > provided by third parties. 
> > In order to help OpenJDK contributors be more productive, both seasoned > committers and relative newcomers, the Skara project proposes to investigate > alternative SCM and code review options for the JDK source code, including > options based upon Git rather than Mercurial, and including options hosted > by third parties. > > The Skara project intends to build prototypes of hosting the JDK 12 sources > under different providers. > > The evaluation criteria to consider include but are not limited to: > > * Performance: time for clone operations from master repos, time of > local operations, etc. > > * Space efficiency > > * Usability in different geographies > > * Support for common development environments such as Linux, Mac, and > Windows > > * Able to easily host the entire history of the JDK and the projected > growth of its history over the next decade > > * Support for general JDK code review practices > > * Programmatic APIs to enable process assistance and automation of > review and processes > > If one or more prototypes indicate a different SCM arrangement offers > substantial improvements over the current situation, the Skara project will > shepherd a JEP to change the SCM for the JDK. > > I propose to lead the project with the initial reviewers including but not > limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson (erikj), > Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg (rwestberg). > > We suggest the build group sponsor this work. > > Changing the bug tracking system is out of scope for this project and is > *not* under consideration. > > Comments? 
> > Cheers, > > -Joe > From thomas.stuefe at gmail.com Sat Jul 28 06:31:13 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sat, 28 Jul 2018 08:31:13 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: Hallo, If this is just a move from mercural to git with all else staying the same, I am indifferent. I like mercurial but git is pretty similar. Moving to git may make life easier for all those people who manage downstream repos in git. Git may also (need to check that) run faster under Windows Cygwin, which would be a nice bonus. However, I am apprehensive about a move away from the current review process (mailing lists). The proposal mentioned "different providers" which I assume would mean GitHub? For me, the review discussions on the mailing lists - with all their combined knowledge, wisdom and civility - are a huge wealth in itself. Close in value, to me, to the source code itself. I am afraid that moving to a different review platform would endanger all that. The fact that past discussions are preserved and searchable for all eternity is extremely valuable. Old school mailing lists can be archived by anyone, by definition, the content is democratically shared by all subscribers. If we move to e.g. GitHub, who owns that content? How easy would it be to archive discussions from github? Then, I can read review old school mails with whatever reader I please. With the review process closed up in a providers website, I am effectively forced to that one review interface he offers. That interface may also change the way reviews are done. Currently, review mails are often long, involved, carefully worded, with lots of condensed knowledge. Which is good. A new interface may negatively effect the quality of the review answers. 
I found discussions on Github never quite up to par with what I know from our mailing lists, but I could be biased. Slack or IRC is even worse. This may just me being old school. But for me, this is a bit like electronic voting machines: sometimes old school techniques still have their place. Best Regards, Thomas On Fri, Jul 27, 2018 at 5:42 AM, joe darcy wrote: > Hello, > > The source code management (SCM) system of a software project is a > fundamental piece of its infrastructure and workflows. Starting in February > 2008, the source code of different JDK releases and supporting projects has > been hosted in Mercurial repositories under http://hg.openjdk.java.net/. > Code reviews of JDK changes are typically conducted as discussions in > mailing lists over small patches sent to one or more lists or over webrevs > hosted on cr.openjdk.java.net. Since 2008, many open source projects have > successfully adopted more efficient SCM and review tooling, in some cases > provided by third parties. > > In order to help OpenJDK contributors be more productive, both seasoned > committers and relative newcomers, the Skara project proposes to investigate > alternative SCM and code review options for the JDK source code, including > options based upon Git rather than Mercurial, and including options hosted > by third parties. > > The Skara project intends to build prototypes of hosting the JDK 12 sources > under different providers. > > The evaluation criteria to consider include but are not limited to: > > * Performance: time for clone operations from master repos, time of > local operations, etc. 
> > * Space efficiency > > * Usability in different geographies > > * Support for common development environments such as Linux, Mac, and > Windows > > * Able to easily host the entire history of the JDK and the projected > growth of its history over the next decade > > * Support for general JDK code review practices > > * Programmatic APIs to enable process assistance and automation of > review and processes > > If one or more prototypes indicate a different SCM arrangement offers > substantial improvements over the current situation, the Skara project will > shepherd a JEP to change the SCM for the JDK. > > I propose to lead the project with the initial reviewers including but not > limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson (erikj), > Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg (rwestberg). > > We suggest the build group sponsor this work. > > Changing the bug tracking system is out of scope for this project and is > *not* under consideration. > > Comments? > > Cheers, > > -Joe > From peter.lawrey at gmail.com Sat Jul 28 08:41:59 2018 From: peter.lawrey at gmail.com (Peter Lawrey) Date: Sat, 28 Jul 2018 09:41:59 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: On 27 July 2018 at 12:11, Mario Torre wrote: > > After all, it's not that hard to become a committer really, and the > barrier being the same with or without GitHub, My impression is that, even for some of the most technical Java Champions, it is impossible to become a committer. I have asked Java Champions who are actively working in the JVM or contributing directly but they found it easier to find/hire someone else to do the committing than become a committer.
I have: - the most answers for Java and JVM on StackOverflow, - two high performance open source libraries with over 1,000 stars on Github, - another library which got downloaded 6 million times last month - built a self funded business which made £2.5m last year. I am eager to contribute, can code and understand business needs, but I have no idea how to become a committer without being hired into the JVM development team. What works for me is that some committers read my blog so things I write about get fixed in the next version indirectly. I would agree that the choice of technology is not the biggest barrier to entry. What we would need is a change of process if we are going to open up OpenJDK. A new platform might bring a change of mind set. BTW I don't think it should be easy to be a committer. Regards, Peter. From patrick at reini.net Sat Jul 28 19:11:17 2018 From: patrick at reini.net (Patrick Reinhart) Date: Sat, 28 Jul 2018 21:11:17 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> Hi Mario, I've now contributed a couple of changes to the JDK as a part time contributor. It would have been much easier for me to use git instead of mercurial, due to the fact that I use git on a daily basis, and its commands for creating different feature branches and handling patches would be easier for me. Nevertheless, it is important in my opinion that git and the Github process are two separate things. I would fear to have the JDK on Github just due to the fact you might get flooded with a huge amount of pull requests for things not actually discussed on any mailing list before and the existing reviewers would not be able to keep up. I would like to see an improvement and better support in handling contributions and the review process in general compared to using the webrev tool as it is now.
In that regard the review process on github using a separate feature branch on a clone seems a good start... -Patrick > On 27.07.2018 at 11:26, Mario Torre wrote: > > Hi Martijn, > > How many contributions from developers in those git mirrors came into > OpenJDK (or even, how many contributions happened on those mirrors > outside of OpenJDK development?). > > I think the point about performance is sound [1], but I would be very > careful to introduce a new SCM, lots of developers are used to > mercurial now, and even if git is probably just a small learning step > away, I would argue that this is unnecessary to the people who are > already contributing. > > What is the actual benefit of this change? I mean, yes, the list of > evaluation criteria, but in practice, why do we need to change? > > I don't think the remote possibility of attracting a few more people > is an answer, the current work flow is simple enough, while the > biggest barrier is the process, not the SCM. I doubt we want to change > the process (and I'm not advocating this either!), so the benefit to > go git (unless you mean we're exploring svn or cvs!!) seems limited > compared to the overhead. > > I may be just too old school, but I think we should be investing the > energies somewhere else. > > Cheers, > Mario > > [1] It really is terrible now with a single repo, but is it a problem > of mercurial really? Git also carries all the history in the clone, > did somebody do some testing on this, and I mean, on the same servers > and network? > > > On Fri, Jul 27, 2018 at 10:31 AM, Martijn Verburg > wrote: >> Hi Joe, >> >> The Adoption group has been having a lot of "fun" with importing hg into >> git (one of the options that you'll likely explore) for our git mirrors. >> >> We've got a bunch of war stories and most importantly working scripts that >> capture history and tags and all that good stuff.
They're Apache 2 >> licensed at the moment but I'm sure the authors (all being Adoption >> members) will be happy to re license or dual license. >> >> Other useful knowledge we can share is around how we manage repos, >> projects, issues, work in progress and code reviews at the Adopt build >> farm. We've learned some valuable lessons of what works and what doesn't at >> a reasonable scale of number of contributors around a git based workflow. >> >> Assuming this project kicks off we're all very happy to help out! >> >> Cheers, >> Martijn >> >> On Fri, 27 Jul 2018 at 05:13, joe darcy wrote: >> >>> Hello, >>> >>> The source code management (SCM) system of a software project is a >>> fundamental piece of its infrastructure and workflows. Starting in >>> February 2008, the source code of different JDK releases and supporting >>> projects has been hosted in Mercurial repositories under >>> http://hg.openjdk.java.net/. Code reviews of JDK changes are typically >>> conducted as discussions in mailing lists over small patches sent to one >>> or more lists or over webrevs hosted on cr.openjdk.java.net. Since 2008, >>> many open source projects have successfully adopted more efficient SCM >>> and review tooling, in some cases provided by third parties. >>> >>> In order to help OpenJDK contributors be more productive, both seasoned >>> committers and relative newcomers, the Skara project proposes to >>> investigate alternative SCM and code review options for the JDK source >>> code, including options based upon Git rather than Mercurial, and >>> including options hosted by third parties. >>> >>> The Skara project intends to build prototypes of hosting the JDK 12 >>> sources under different providers. >>> >>> The evaluation criteria to consider include but are not limited to: >>> >>> * Performance: time for clone operations from master repos, time of >>> local operations, etc.
>>> >>> * Space efficiency >>> >>> * Usability in different geographies >>> >>> * Support for common development environments such as Linux, Mac, >>> and Windows >>> >>> * Able to easily host the entire history of the JDK and the >>> projected growth of its history over the next decade >>> >>> * Support for general JDK code review practices >>> >>> * Programmatic APIs to enable process assistance and automation of >>> review and processes >>> >>> If one or more prototypes indicate a different SCM arrangement offers >>> substantial improvements over the current situation, the Skara project >>> will shepherd a JEP to change the SCM for the JDK. >>> >>> I propose to lead the project with the initial reviewers including but >>> not limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson >>> (erikj), Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg >>> (rwestberg). >>> >>> We suggest the build group sponsor this work. >>> >>> Changing the bug tracking system is out of scope for this project and is >>> *not* under consideration. >>> >>> Comments? >>> >>> Cheers, >>> >>> -Joe >>> >>> -- >> Cheers, Martijn (Sent from Gmail Mobile) > > > > -- > Mario Torre > Associate Manager, Software Engineering > Red Hat GmbH > 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 From Ryan.LaMothe at pnnl.gov Sun Jul 29 19:21:42 2018 From: Ryan.LaMothe at pnnl.gov (LaMothe, Ryan R) Date: Sun, 29 Jul 2018 19:21:42 +0000 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: <8A9968A2-169B-4545-856C-C450641993C5@pnnl.gov> I also agree with a change from internally hosted Mercurial to externally hosted Github. Internally and externally, all of our work has migrated to either Gitlab/Stash or Github, respectively. That includes our Subversion and Mercurial code bases. 
Some of the primary benefits of this migration have been increased tooling support, increased vendor support and significantly easier patch/feature/etc. submissions by staff. For example, staff can now simply fork a git repo, apply their changes and submit pull requests via JIRA. Reviews of pull requests can be performed in multiple different CLI or GUI tool suites. No more DIFF files attached to issue tickets! Additional data point: while git itself (at least the command line) has been more complicated to use and understand than svn, requiring additional training and testing, the move from mercurial to git was relatively straightforward. ____________________________________________ Ryan LaMothe On 7/26/18, 9:17 PM, "discuss on behalf of joe darcy" wrote: Hello, The source code management (SCM) system of a software project is a fundamental piece of its infrastructure and workflows. Starting in February 2008, the source code of different JDK releases and supporting projects has been hosted in Mercurial repositories under http://hg.openjdk.java.net/. Code reviews of JDK changes are typically conducted as discussions in mailing lists over small patches sent to one or more lists or over webrevs hosted on cr.openjdk.java.net. Since 2008, many open source projects have successfully adopted more efficient SCM and review tooling, in some cases provided by third parties. In order to help OpenJDK contributors be more productive, both seasoned committers and relative newcomers, the Skara project proposes to investigate alternative SCM and code review options for the JDK source code, including options based upon Git rather than Mercurial, and including options hosted by third parties. The Skara project intends to build prototypes of hosting the JDK 12 sources under different providers. The evaluation criteria to consider include but are not limited to: * Performance: time for clone operations from master repos, time of local operations, etc.
* Space efficiency * Usability in different geographies * Support for common development environments such as Linux, Mac, and Windows * Able to easily host the entire history of the JDK and the projected growth of its history over the next decade * Support for general JDK code review practices * Programmatic APIs to enable process assistance and automation of review and processes If one or more prototypes indicate a different SCM arrangement offers substantial improvements over the current situation, the Skara project will shepherd a JEP to change the SCM for the JDK. I propose to lead the project with the initial reviewers including but not limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson (erikj), Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg (rwestberg). We suggest the build group sponsor this work. Changing the bug tracking system is out of scope for this project and is *not* under consideration. Comments? Cheers, -Joe From david.holmes at oracle.com Mon Jul 30 06:02:42 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 30 Jul 2018 16:02:42 +1000 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: On 28/07/2018 4:31 PM, Thomas Stüfe wrote: > Hallo, > > If this is just a move from mercurial to git with all else staying the > same, I am indifferent. I like mercurial but git is pretty similar. > Moving to git may make life easier for all those people who manage > downstream repos in git. Git may also (need to check that) run faster > under Windows Cygwin, which would be a nice bonus. > > However, I am apprehensive about a move away from the current review > process (mailing lists). The proposal mentioned "different providers" > which I assume would mean GitHub? > > For me, the review discussions on the mailing lists - with all their > combined knowledge, wisdom and civility - are a huge wealth in itself.
> Close in value, to me, to the source code itself. I am afraid that > moving to a different review platform would endanger all that. +1 on that. With a simple email-based review process (plus webrevs hosted on cr.o.j.n) I can easily scan dozens of incoming changes to see if they may be something I need to dive more deeply into. That is all lost if reviews happen inside some other system - even if a notification email is generated when such reviews are initiated. Changes to the review processes/tools should be kept as separate as possible from the selection of an underlying SCM. Cheers, David > The fact that past discussions are preserved and searchable for all > eternity is extremely valuable. Old school mailing lists can be > archived by anyone, by definition, the content is democratically > shared by all subscribers. If we move to e.g. GitHub, who owns that > content? How easy would it be to archive discussions from github? > > Then, I can read review old school mails with whatever reader I > please. With the review process closed up in a providers website, I am > effectively forced to that one review interface he offers. > > That interface may also change the way reviews are done. Currently, > review mails are often long, involved, carefully worded, with lots of > condensed knowledge. Which is good. A new interface may negatively > affect the quality of the review answers. I found discussions on > Github never quite up to par with what I know from our mailing lists, > but I could be biased. Slack or IRC is even worse. > > This may just be me being old school. But for me, this is a bit like > electronic voting machines: sometimes old school techniques still have > their place. > > Best Regards, Thomas > > > > > On Fri, Jul 27, 2018 at 5:42 AM, joe darcy wrote: >> Hello, >> >> The source code management (SCM) system of a software project is a >> fundamental piece of its infrastructure and workflows.
Starting in February >> 2008, the source code of different JDK releases and supporting projects has >> been hosted in Mercurial repositories under http://hg.openjdk.java.net/. >> Code reviews of JDK changes are typically conducted as discussions in >> mailing lists over small patches sent to one or more lists or over webrevs >> hosted on cr.openjdk.java.net. Since 2008, many open source projects have >> successfully adopted more efficient SCM and review tooling, in some cases >> provided by third parties. >> >> In order to help OpenJDK contributors be more productive, both seasoned >> committers and relative newcomers, the Skara project proposes to investigate >> alternative SCM and code review options for the JDK source code, including >> options based upon Git rather than Mercurial, and including options hosted >> by third parties. >> >> The Skara project intends to build prototypes of hosting the JDK 12 sources >> under different providers. >> >> The evaluation criteria to consider include but are not limited to: >> >> * Performance: time for clone operations from master repos, time of >> local operations, etc. >> >> * Space efficiency >> >> * Usability in different geographies >> >> * Support for common development environments such as Linux, Mac, and >> Windows >> >> * Able to easily host the entire history of the JDK and the projected >> growth of its history over the next decade >> >> * Support for general JDK code review practices >> >> * Programmatic APIs to enable process assistance and automation of >> review and processes >> >> If one or more prototypes indicate a different SCM arrangement offers >> substantial improvements over the current situation, the Skara project will >> shepherd a JEP to change the SCM for the JDK. >> >> I propose to lead the project with the initial reviewers including but not >> limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson (erikj), >> Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg (rwestberg). 
>> >> We suggest the build group sponsor this work. >> >> Changing the bug tracking system is out of scope for this project and is >> *not* under consideration. >> >> Comments? >> >> Cheers, >> >> -Joe >> From adinn at redhat.com Mon Jul 30 08:22:51 2018 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 30 Jul 2018 09:22:51 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: <579734f6-3a5a-d2d4-afb6-0a653c51e7db@redhat.com> On 28/07/18 09:41, Peter Lawrey wrote: > My impressions is, even for some of the most technical Java Champions it is > impossible to become a committer. Java Champions in what area of Java? Java is a big church and the requirements to implement Java applications, implement Java JDK runtime code or implement JVM code are /widely/ divergent. 'technical' is a highly questionable way of talking about this. Is someone whose 'technical' expertise lies in the implementation of transactions going to be able to help build a file system or a compiler? > I have asked Java Champions who are actively working in the JVM or > contributing directly but they found it easier to find/hire someone else to > do the committing than become a committer. That sentence doesn't parse very well as it seems to suggest these are Java Champions who are active committers but then are not, instead delegating to real committers. Your point has been lost, I am afraid. > I have; > - the most answers for Java and JVM on StackOverflow, > - two high performance open source libraries with over 1,000 stars on > Github, > - another library which got downloaded 6 million times last month > - built a self funded business which made £2.5m last year. I am not at all sure why points on StackOverflow qualify you to commit to the JVM? Coding a language virtual machine runtime is not a beauty contest.
Nor, indeed, do the other qualifications appear particularly to qualify you for this task. If you had previously been involved in VM or compiler design/implementation or come up with a novel GC algorithm then that experience would be highly relevant. The things you cite in no way guarantee that you will know which end of an OpenJDK JIRA to pick up and what to do about it. They don't disqualify you, of course, but there is a better, much more relevant standard (see below). > I am eager to contribute, can code and understand business needs, but I > have no idea how to become a committer without being hired into the JVM > development team. Well, there is a simple way to make this happen. You need to read the code, find something which is broken or incomplete, read the code again, raise a JIRA pointing out the problem, read the code again, create and post a patch with a request for review, read the feedback (and the code again), see the patch through review (no doubt, reading the code again several times along the way) then rinse and repeat. Do that a few times and you will become an official committer. Do it enough with a suitable display of expertise and you will become a reviewer. No one here is going to stop you understanding what needs fixing and then fixing it. Indeed, if you are unsure how something works and ask an intelligent question on any of the lists you /will/ be helped to arrive at the understanding needed to fix problems. We even have a project group (AdoptOpenJDK) whose sole purpose is to help people perform this task, providing guidance on where to look in the code, how to raise an issue, where to post problems etc. > What works for me is that some committers read my blog so things I write > about get fixed in the next version indirectly.
Well, if you know enough to blog about real problems and you also know how to code then the only thing stopping you from contributing fixes is learning enough about the code to come up with your own fix and then submitting it for inclusion. OpenJDK is an open source project and, as we say in open source, show me the code. > I would agree that the choice of technology is not the biggest barrier to > entry. > What we would need is a change of process if we are going to open up > OpenJDK. > A new platform might bring a change of mind set. I'm sorry but I think the conclusion that we need a change of process looks to me to be a complete non sequitur when your complaints above don't even appear to recognise the existing process (as the farmer said, you can't get there from here). I'm left very unsure as to whose mind set it is that is actually misaligned. > BTW I don't think it should be easy to be a committer. It isn't. But that's not because there is no process to follow. It's because the code base takes a long time to understand and improve. That's not in the least bit surprising given what it does. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Mon Jul 30 08:47:22 2018 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 30 Jul 2018 09:47:22 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> Message-ID: On 28/07/18 20:11, Patrick Reinhart wrote: > I've now contributed a couple of changes to the JDK now and for me as > a part time contributor. It would be much easier for me, to have used > git instead of mercurial, due the fact that I use git on a daily > basis.
and its commands for creating different feature branches and > handling patches would be easier for me. Well, I am in the fortunate position of using both hg and git regularly and I don't actually think there is much of an edge to either in terms of usability or, indeed, functionality. For the things you need to do every day both work reasonably well and are fairly easy to learn and retain. I do actually have a preference but I don't think it matters much (any decision to move really must not be reduced to a beauty contest). If there is a reason to move then I think it is neither function nor usability but the issue Joe raised -- performance i.e. the way they both scale. The hg repo for OpenJDK has become very unwieldy. Without shipilev.net it would already be a big headache (thank you, Aleksey!). It appears from Joe's research that git is currently, and will continue to be, much more manageable in this regard. Whether that is enough to justify a change is a hard question to answer but I can understand his argument and sympathise with it. > Nevertheless I it is important in my opinion that git and the Github > process are two separate things. I would fear to have the JDK on > Github just due to the fact you might get flooded with a huge amount > of pull requests for things not actually discussed on any mailing > list before and the existing reviewers would not be able to keep up. There are many virtues to our current multiple mailing list-based review model which need to be thought about carefully before we make any change. Moving to something like Github (even if it is not Github itself) is not something we ought to do lightly. However, that is a completely different order of change to switching the SCM system we use. > I would like to see a improvement and better support in handling > contributions and the review process in general as using the webrev > tool as it is now.
In that regard the review process on github using > a separate feature branch on a clone seems a good start... Maybe. I am not sure from my experience on Graal that this is necessarily the best way to do things. One of the benefits of our use of email lists is that you /have/ to subscribe and get sent a copy of everything that is happening in the area you wish to contribute to (sometimes even several lists in several areas). That does mean the initial experience is like trying to drink from a firehose but that is a good thing as well as a bad one. It's important to know what is going on in the project if you are going to be contributing to it to any significant degree. So, while the volume of traffic makes it hard for people to get started it means that those who do start are aware of what they will need to keep up with in order to continue. I don't believe the project is actually going to benefit from bringing in contributors who quickly drop out again. Yes, we might get a lot of small low-priority fixes but at what overhead? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From shade at redhat.com Mon Jul 30 09:09:20 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 11:09:20 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> Message-ID: On 07/30/2018 10:47 AM, Andrew Dinn wrote: > If there is a reason to move then I think it is neither function nor > usability but the issue Joe raised -- performance i.e. the way they both > scale. The hg repo for OpenJDK has become very unwieldy. Without > shipilev.net it would already be a big headache (thank you, Aleksey!). 
> It appears from Joe's research that git is currently, and will continue > to be, much more manageable in this regard. Whether that is enough to > justify a change is a hard question to answer but I can understand his > argument and sympathise with it. I have to note that we can make Mercurial clones much less painful if we adopt what Mozilla does for their large repositories: providing the bundles that are cached on CDN for either automatic or manual clones [1]. What we have at my build server [2] is basically a stop-gap, and can be solved on server side without considering switching to Git. Would be awesome if Skara explored if that is a viable way for OpenJDK. -Aleksey [1] https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Source_Code/Mercurial/Bundles [2] https://builds.shipilev.net/workspaces/ From roman at kennke.org Mon Jul 30 09:17:38 2018 From: roman at kennke.org (Roman Kennke) Date: Mon, 30 Jul 2018 11:17:38 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> Message-ID: <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> On 30 July 2018 11:09:20 CEST, Aleksey Shipilev wrote: > On 07/30/2018 10:47 AM, Andrew Dinn wrote: >> If there is a reason to move then I think it is neither function nor >> usability but the issue Joe raised -- performance i.e. the way they both >> scale. The hg repo for OpenJDK has become very unwieldy. Without >> shipilev.net it would already be a big headache (thank you, Aleksey!). >> It appears from Joe's research that git is currently, and will continue >> to be, much more manageable in this regard. Whether that is enough to >> justify a change is a hard question to answer but I can understand his >> argument and sympathise with it.
> > I have to note that we can make Mercurial clones much less painful if > we adopt what Mozilla does for their large repositories: providing the bundles that are cached on CDN > for either automatic or manual clones [1]. What we have at my build server [2] is basically a > stop-gap, and can be solved on server side without considering switching to Git. Would be awesome if Skara > explored if that is a viable way for OpenJDK. Yes. And we can make partial clones. Nobody really ever needs all of the history. Roman > > -Aleksey > > [1] https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Source_Code/Mercurial/Bundles > [2] https://builds.shipilev.net/workspaces/ -- Sent from my Android device with K-9 Mail. From weijun.wang at oracle.com Mon Jul 30 09:47:31 2018 From: weijun.wang at oracle.com (Weijun Wang) Date: Mon, 30 Jul 2018 17:47:31 +0800 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> Message-ID: <3B321C69-2DF8-4844-A4B3-507B2F71450D@oracle.com> > On Jul 30, 2018, at 5:17 PM, Roman Kennke wrote: > > Yes. And we can make partial clones. Nobody really ever needs all of the history. While I don't need the history of every file, I often find myself looking at the full history of the files I am working on, to find out how they evolved into what they look like today. Does partial clone support this? Also, do you happen to know a solution to make the repo smaller after so many renames? Aleksey's page [2] shows the xz file for jdk8u is 232M but the one for jdk/jdk is 752M. Thanks Max > > Roman > >> [2] https://builds.shipilev.net/workspaces/ > > -- > Sent from my Android device with K-9 Mail.
From shade at redhat.com Mon Jul 30 09:55:35 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 11:55:35 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <3B321C69-2DF8-4844-A4B3-507B2F71450D@oracle.com> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> <3B321C69-2DF8-4844-A4B3-507B2F71450D@oracle.com> Message-ID: On 07/30/2018 11:47 AM, Weijun Wang wrote: > Also, do you happen to know a solution to make the repo smaller after so many renames? Aleksey's > page [2] shows the xz file for jdk8u is 232M but the one for jdk/jdk is 752M. This seems to be the classic space-time tradeoff. I have been experimenting with publishing the Mozilla-style bundles, and jdk/jdk hg bundle compressed with "xz -9" is only 266M, which seems to be due to bundle being much more compact, compared to .hg in the filesystem. The downside is that unpacking that bundle takes much more time than un-xzing the full .hg snapshot. So, if you have more bandwidth, .hg snapshot is quicker to bring up. If you have more time, bundles seem to be the way to go. builds.shipilev.net publishes .hg snapshots, because bandwidth is cheaper than time so far. -Aleksey From adinn at redhat.com Mon Jul 30 10:05:34 2018 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 30 Jul 2018 11:05:34 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> Message-ID: <56959793-353b-a149-9ea3-320009b6716b@redhat.com> On 30/07/18 10:17, Roman Kennke wrote: > Yes. And we can make partial clones. Nobody really ever needs all of the history. No no no! 
There have been quite a few occasions in the last year when I really needed to search all of the history (yes, even taking me back to jdk7 in some cases). Indeed, for AArch64 -- which was upstreamed into jdk9 in one great big lump -- I have returned to the downstream jdk8 repo to find out when and how something was inserted into that history. The full history is fairly obviously a major concern while we still need to backport security fixes. So, yes, it is critical that we can continue to identify what went into jdk7 when for several years to come. However, that's not the only case. My experience has been that full history is occasionally vital to understanding how something arose when I want to work out what to do about it now. It is all very well asking the relevant old-timers about why things happened but the repo has a better memory for important details (as my experience with the AArch64 code made very clear to me). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From erik.osterlund at oracle.com Mon Jul 30 10:20:11 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Mon, 30 Jul 2018 12:20:11 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> Message-ID: <5CF1E0A5-72FF-4884-BA7E-736B83B5BEB9@oracle.com> Hi, For me the main advantage of switching to git is tooling support. I already switched to using git locally with an hg remote (that I got from our Skara friends), so that I can use magit in Emacs, and it has made my life much easier compared to using mq patch queues with poor Emacs integration that forced me to make my own extensions to make me less sad. 
These days I am much happier and more productive using git, despite having to move things over to hg before pushing. It is worth it for me to be able to use magit. Considering every hotspot engineer seems to have their own unique quirky development environment, it might be that better and wider tooling in general matters to more people than myself. It certainly matters to me anyway. /Erik > On 30 Jul 2018, at 10:47, Andrew Dinn wrote: > >> On 28/07/18 20:11, Patrick Reinhart wrote: >> I've now contributed a couple of changes to the JDK, and for me as >> a part-time contributor it would be much easier to have used >> git instead of mercurial, due to the fact that I use git on a daily >> basis, and its commands for creating different feature branches and >> handling patches would be easier for me. > > Well, I am in the fortunate position of using both hg and git regularly > and I don't actually think there is much of an edge to either in terms > of usability or, indeed, functionality. For the things you need to do > every day both work reasonably well and are fairly easy to learn and > retain. I do actually have a preference but I don't think it matters > much (any decision to move really must not be reduced to a beauty contest). > > If there is a reason to move then I think it is neither function nor > usability but the issue Joe raised -- performance i.e. the way they both > scale. The hg repo for OpenJDK has become very unwieldy. Without > shipilev.net it would already be a big headache (thank you, Aleksey!). > It appears from Joe's research that git is currently, and will continue > to be, much more manageable in this regard. Whether that is enough to > justify a change is a hard question to answer but I can understand his > argument and sympathise with it. > >> Nevertheless it is important in my opinion that git and the Github >> process are two separate things.
I would fear to have the JDK on >> Github just due to the fact you might get flooded with a huge amount >> of pull requests for things not actually discussed on any mailing >> list before and the existing reviewers would not be able to keep up. > > There are many virtues to our current multiple mailing list-based review > model which need to be thought about carefully before we make any > change. Moving to something like Github (even if it is not Github > itself) is not something we ought to do lightly. However, that is a > completely different order of change to switching the SCM system we use. > >> I would like to see a improvement and better support in handling >> contributions and the review process in general as using the webrev >> tool as it is now. In that regard the review process on github using >> a separate feature branch on a clone seems a good start... > Maybe. I am not sure from my experience on Graal that this is > necessarily the best way to do things. One of the benefits of our use of > email lists is that you /have/ to subscribe and get sent a copy of > everything that is happening in the area you wish to contribute to > (sometimes even several lists in several areas). That does mean the > initial experience is like trying to drink from a firehose but that is a > good thing as well as a bad one. It's important to know what is going on > in the project if you are going to be contributing to it to any > significant degree. So, while the volume of traffic makes it hard for > people to get started it means that those who do start are aware of what > they will need to keep up with in order to continue. I don't believe the > project is actually going to benefit from bringing in contributors who > quickly drop out again. Yes, we might get a lot of small low-priority > fixes but at what overhead? 
> > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From neugens at redhat.com Mon Jul 30 10:22:50 2018 From: neugens at redhat.com (Mario Torre) Date: Mon, 30 Jul 2018 12:22:50 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <56959793-353b-a149-9ea3-320009b6716b@redhat.com> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> <56959793-353b-a149-9ea3-320009b6716b@redhat.com> Message-ID: On Mon, Jul 30, 2018 at 12:05 PM, Andrew Dinn wrote: > On 30/07/18 10:17, Roman Kennke wrote: > >> Yes. And we can make partial clones. Nobody really ever needs all of the history. > No no no! > > There have been quite a few occasions in the last year when I really > needed to search all of the history (yes, even taking me back to jdk7 in > some cases). Indeed, for AArch64 -- which was upstreamed into jdk9 in > one great big lump -- I have returned to the downstream jdk8 repo to > find out when and how something was inserted into that history. > > The full history is fairly obviously a major concern while we still need > to backport security fixes. So, yes, it is critical that we can continue > to identify what went into jdk7 when for several years to come. However, > that's not the only case. My experience has been that full history is > occasionally vital to understanding how something arose when I want to > work out what to do about it now. It is all very well asking the > relevant old-timers about why things happened but the repo has a better > memory for important details (as my experience with the AArch64 code > made very clear to me). Yeah, but I don't think the two things are mutually exclusive, are they? 
You can make a full clone with history when you need it or live with shallow copies: https://bitbucket.org/facebook/hg-experimental/src/e9cd2a76c49207ecae6b64b2de804b858a00dc36/remotefilelog/README.md?at=default&fileviewer=file-view-default There are other extensions over there that seem interesting, however most only work with much more recent versions of mercurial. Cheers, Mario -- Mario Torre Associate Manager, Software Engineering Red Hat GmbH 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 From neugens at redhat.com Mon Jul 30 10:26:41 2018 From: neugens at redhat.com (Mario Torre) Date: Mon, 30 Jul 2018 12:26:41 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: On Mon, Jul 30, 2018 at 8:02 AM, David Holmes wrote: > On 28/07/2018 4:31 PM, Thomas Stüfe wrote: >> >> Hallo, >> >> If this is just a move from mercurial to git with all else staying the >> same, I am indifferent. I like mercurial but git is pretty similar. >> Moving to git may make life easier for all those people who manage >> downstream repos in git. Git may also (need to check that) run faster >> under Windows Cygwin, which would be a nice bonus. >> >> However, I am apprehensive about a move away from the current review >> process (mailing lists). The proposal mentioned "different providers" >> which I assume would mean GitHub? >> >> For me, the review discussions on the mailing lists - with all their >> combined knowledge, wisdom and civility - are a huge wealth in itself. >> Close in value, to me, to the source code itself. I am afraid that >> moving to a different review platform would endanger all that. > > > +1 on that. With a simple email-based review process (plus webrevs hosted on > cr.o.j.n) I can easily scan dozens of incoming changes to see if they may be > something I need to dive more deeply into. That is all lost if reviews
That is all lost if reviews > happens inside some other system - even if a notification email is generated > when such reviews are initiated. > > Changes to the review processes/tools should be kept a separate as possible > from the selection of an underlying SCM. Yes, my point exactly, I can live with git if that's the consensus, but not with a GitHub style process. Cheers, Mario -- Mario Torre Associate Manager, Software Engineering Red Hat GmbH 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 From thomas.stuefe at gmail.com Mon Jul 30 10:52:15 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 30 Jul 2018 12:52:15 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <56959793-353b-a149-9ea3-320009b6716b@redhat.com> References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> <56959793-353b-a149-9ea3-320009b6716b@redhat.com> Message-ID: On Mon, Jul 30, 2018 at 12:05 PM, Andrew Dinn wrote: > On 30/07/18 10:17, Roman Kennke wrote: > >> Yes. And we can make partial clones. Nobody really ever needs all of the history. > No no no! > > There have been quite a few occasions in the last year when I really > needed to search all of the history (yes, even taking me back to jdk7 in > some cases). Indeed, for AArch64 -- which was upstreamed into jdk9 in > one great big lump -- I have returned to the downstream jdk8 repo to > find out when and how something was inserted into that history. > > The full history is fairly obviously a major concern while we still need > to backport security fixes. So, yes, it is critical that we can continue > to identify what went into jdk7 when for several years to come. However, > that's not the only case. My experience has been that full history is > occasionally vital to understanding how something arose when I want to > work out what to do about it now. 
It is all very well asking the > relevant old-timers about why things happened but the repo has a better > memory for important details (as my experience with the AArch64 code > made very clear to me). > I absolutely agree. The full history is indispensable for our work. Best Regards, Thomas From weijun.wang at oracle.com Mon Jul 30 11:13:10 2018 From: weijun.wang at oracle.com (Weijun Wang) Date: Mon, 30 Jul 2018 19:13:10 +0800 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> <3B321C69-2DF8-4844-A4B3-507B2F71450D@oracle.com> Message-ID: Joe said on Jul 28: > In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation. > > The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB. So this is related to Mercurial's design that a rename equals to a remove and a create. Maybe we can fix Mercurial to make this a real "move", and I doubt if there is a space-time tradeoff here. --Max > On Jul 30, 2018, at 5:55 PM, Aleksey Shipilev wrote: > > On 07/30/2018 11:47 AM, Weijun Wang wrote: >> Also, do you happen to know a solution to make the repo smaller after so many renames? Aleksey's >> page [2] shows the xz file for jdk8u is 232M but the one for jdk/jdk is 752M. > This seems to be the classic space-time tradeoff. 
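To put the figures Joe quoted in perspective, here is a small Python sketch (an illustration added for clarity, not part of the original message) that computes how much the on-disk Mercurial representation grew from one release to the next. The sizes are the ones quoted above; the function and variable names are made up for this example.

```python
# On-disk sizes of the Mercurial representation, in MB, as quoted from Joe's
# message (the JDK 9 and JDK 10 values are approximate).
SIZES_MB = [("JDK 8 GA", 412), ("JDK 9 GA", 808), ("JDK 10 GA", 1553)]


def growth_factors(sizes):
    """Return (from_release, to_release, ratio) for each pair of consecutive releases."""
    return [
        (prev_name, name, size / prev_size)
        for (prev_name, prev_size), (name, size) in zip(sizes, sizes[1:])
    ]


if __name__ == "__main__":
    for prev_name, name, ratio in growth_factors(SIZES_MB):
        print(f"{prev_name} -> {name}: x{ratio:.2f}")
```

Both steps come out a little under a factor of two, which is at least consistent with the explanation that a moved file is stored again in full.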
From shade at redhat.com Mon Jul 30 11:53:09 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 13:53:09 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: <4D7B6056-79F6-4697-AAF6-CBCC75AF1B67@reini.net> <6B15CFE1-6487-446E-A2AA-696DDF8990C6@kennke.org> <3B321C69-2DF8-4844-A4B3-507B2F71450D@oracle.com> Message-ID: <3cd25e13-2b99-031e-b2de-ea406daf6ae0@redhat.com> On 07/30/2018 01:13 PM, Weijun Wang wrote: > Joe said on Jul 28: > >> In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation. >> >> The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB. > > So this is related to Mercurial's design that a rename equals to a remove and a create. > > Maybe we can fix Mercurial to make this a real "move", and I doubt if there is a space-time tradeoff here. What I meant to say is that space-time tradeoff between on-the-wire format (bundles) and on-the-disk format (.hg folder) is there, and you can choose either, depending on the context. Publishing blobs in on-the-wire format has better compatibility, while tarballs in on-the-disk format are ultimately faster to "clone". Two mega-moves (Jigsaw in 9, and monorepo in 10) inflated the on-the-disk size quite badly, as Joe indicated above, but on-the-wire format size seems to remain okay. So, if we enabled CDN-backed bundles-assisted clone, it should probably cut down clone pains, at least for our Europe-side folks, at the expense of some client CPU churn associated with converting on-the-wire to on-the-disk during the clone. 
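The trade-off described here can be sketched as a back-of-the-envelope model (an illustration, not a measurement and not part of the original message): the time until a repository is usable is the download time plus the local unpack time. The sizes below are the ones mentioned in this thread (266M for the xz-compressed bundle, 752M for the xz-compressed .hg snapshot); the unpack times and all names are hypothetical placeholders.

```python
def time_to_usable_repo(size_mb, bandwidth_mb_per_s, unpack_s):
    """Seconds until the repository is usable: download plus local unpack."""
    return size_mb / bandwidth_mb_per_s + unpack_s


def break_even_bandwidth(bundle_mb, snapshot_mb, bundle_unpack_s, snapshot_unpack_s):
    """Bandwidth (MB/s) at which both options take the same total time.

    Below this bandwidth the smaller bundle wins; above it the snapshot wins.
    Assumes the bundle is smaller to download but slower to unpack.
    """
    return (snapshot_mb - bundle_mb) / (bundle_unpack_s - snapshot_unpack_s)


# Sizes quoted in the thread (MB); the unpack times are made-up placeholders.
BUNDLE_MB, SNAPSHOT_MB = 266, 752
BUNDLE_UNPACK_S, SNAPSHOT_UNPACK_S = 600.0, 60.0

if __name__ == "__main__":
    threshold = break_even_bandwidth(
        BUNDLE_MB, SNAPSHOT_MB, BUNDLE_UNPACK_S, SNAPSHOT_UNPACK_S
    )
    print(f"break-even bandwidth: {threshold:.2f} MB/s")
```

With real unpack timings plugged in, the same two functions would show on which side of the threshold a given connection falls.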
Some optimization for on-the-disk size is possible if you re-clone the repo with "--config=format.generaldelta=1 --config=format.aggressivemergedeltas=1", thus optimizing internal .hg metadata. That would take a lot of time. If you have some time to spare, then it makes sense to do so. My build scripts do that automatically before packaging the .hg snapshots. Also, it seems that doing the "clone --pull" twice with generaldelta enabled compacts metadata even more: jdk/jdk .hg size fell from 1.5 GB to 1.2 GB uncompressed, and from 750M to 590M xz9-compressed. I just fixed my build scripts and currently testing them. -Aleksey From erik.osterlund at oracle.com Mon Jul 30 12:41:02 2018 From: erik.osterlund at oracle.com (=?utf-8?Q?Erik_=C3=96sterlund?=) Date: Mon, 30 Jul 2018 14:41:02 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <3cd25e13-2b99-031e-b2de-ea406daf6ae0@redhat.com> References: <3cd25e13-2b99-031e-b2de-ea406daf6ae0@redhat.com> Message-ID: <1532954204.local-c2774ebe-1ee1-v1.3.0-fd741eb7@getmailspring.com> Hi, I don't see the fascination with reinventing the source code hosting wheel again for our project. Perhaps there is a good point to it, but I can't currently seem to see it. It seems like some are saying that with custom mercurial hacks we can achieve smaller repos competitive with git for cloning. Although to fully utilize that and actually get fast cloning times for everyone working on the project, we would need mirrors of the hack in multiple countries, with backup, synchronization and other hosting stuff (security patching and what not), and it still comes with space-time tradeoffs. Even then there are things like object pooling across repos allowing people to have their own forks without unnecessary disk overheads that perhaps we could solve by digging into mercurial and writing our own extensions and custom hosting solution. 
I'm sure if we tried hard and put lots of resources into reinventing solutions for these source code hosting problems, it would be almost as good as github one day. But is it worth our focus and effort to reinvent source code hosting again because OpenJDK is so special, instead of just putting it on github like everybody else and have small repos natively with git (without hacks), good tooling, fast access with mirrors everywhere, backups, cross-repo object sharing, programmable bots (that can be used to e.g. check automatically if it builds on Oracle external platforms like PPC/S390/AArch64/Zero, so we can notice problems before they are pushed), etc for free? I would personally rather ride on the source code hosting experience and expertise of GitHub than to chase after homegrown solutions to patch the problems. /Erik On Jul 30 2018, at 1:53 pm, Aleksey Shipilev wrote: > > On 07/30/2018 01:13 PM, Weijun Wang wrote: > > Joe said on Jul 28: > > > > > In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation. > > > The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB. > > So this is related to Mercurial's design that a rename equals to a remove and a create. > > Maybe we can fix Mercurial to make this a real "move", and I doubt if there is a space-time tradeoff here. > What I meant to say is that space-time tradeoff between on-the-wire format (bundles) and on-the-disk > format (.hg folder) is there, and you can choose either, depending on the context. Publishing blobs > in on-the-wire format has better compatibility, while tarballs in on-the-disk format are ultimately > faster to "clone".
> > Two mega-moves (Jigsaw in 9, and monorepo in 10) inflated the on-the-disk size quite badly, as Joe > indicated above, but on-the-wire format size seems to remain okay. So, if we enabled CDN-backed > bundles-assisted clone, it should probably cut down clone pains, at least for our Europe-side folks, > at the expense of some client CPU churn associated with converting on-the-wire to on-the-disk during > the clone. > > Some optimization for on-the-disk size is possible if you re-clone the repo with > "--config=format.generaldelta=1 --config=format.aggressivemergedeltas=1", thus optimizing internal > .hg metadata. That would take a lot of time. If you have some time to spare, then it makes sense to > do so. My build scripts do that automatically before packaging the .hg snapshots. > > Also, it seems that doing the "clone --pull" twice with generaldelta enabled compacts metadata even > more: jdk/jdk .hg size fell from 1.5 GB to 1.2 GB uncompressed, and from 750M to 590M > xz9-compressed. I just fixed my build scripts and currently testing them. > > -Aleksey From neugens at redhat.com Mon Jul 30 13:16:16 2018 From: neugens at redhat.com (Mario Torre) Date: Mon, 30 Jul 2018 15:16:16 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <1532954204.local-c2774ebe-1ee1-v1.3.0-fd741eb7@getmailspring.com> References: <3cd25e13-2b99-031e-b2de-ea406daf6ae0@redhat.com> <1532954204.local-c2774ebe-1ee1-v1.3.0-fd741eb7@getmailspring.com> Message-ID: On Mon, Jul 30, 2018 at 2:41 PM, Erik Österlund wrote: > Hi, > > etc for free? I would personally rather ride on the source code hosting experience and expertise of GitHub than to chase after homegrown solutions to patch the problems. > /Erik That surely isn't for free. Besides, what you described are just clones, how does GitHub protect us from multiple clones going out of sync?
Since it's a model that does favour branching I can only see more out of sync repos. The GitHub model may be good for the vast majority of little projects out there, but not for us (no disrespect here intended, they may have large communities etc. but clearly the majority of projects have smaller source count and focus areas than OpenJDK, this project is huge, that's what makes it special). I do have experience with one other project on GitHub that is not even large enough to approach the critical mass of OpenJDK, but is large and the development model is insane, it's very dispersive and there's no simple way of filtering or keep organised discussion and reviews, bugs etc and features etc... Once a project becomes somewhat big, GitHub is a mess. So, yes, we may decide to host of GitHub (or similar), but we should be *very* careful not to use the GitHub model, it won't scale for us. Our current model is not broken, and mercurial is only a tad slow, so we shouldn't change anything other than make the SCM a tad faster, and again, if the only solution is to move to git... well, whatever, but that's just about it. I usually compare OpenJDK and the Kernel because they are very similar (by design I think?) and although we don't have any more the mono-tree/multiple repos approach, this is still valid: https://blog.ffwll.ch/2017/08/github-why-cant-host-the-kernel.html In my view, moving to GitHub would be a mistake that will bring us more pain down the road and force us to adapt to a new workflow for no reason. Cheers, Mario > On Jul 30 2018, at 1:53 pm, Aleksey Shipilev wrote: >> >> On 07/30/2018 01:13 PM, Weijun Wang wrote: >> > Joe said on Jul 28: >> > >> > > In Mercurial, when a file is moved, its history is restarted, meaning a full copy of the file is stored. Therefore, lots of file moves will tend to make a Mercurial repo get disproportionally larger. 
In the JDK, many files were moved in JDK 9 for modularity and large numbers of files were moved again in JDK 10 for the repo consolidation. >> > > The Mercurial representation of JDK 8 GA takes about 412 MB, JDK 9 GA ~808 MB, and JDK 10 GA ~1553 MB. >> > So this is related to Mercurial's design that a rename equals to a remove and a create. >> > Maybe we can fix Mercurial to make this a real "move", and I doubt if there is a space-time tradeoff here. >> What I meant to say is that space-time tradeoff between on-the-wire format (bundles) and on-the-disk >> format (.hg folder) is there, and you can choose either, depending on the context. Publishing blobs >> in on-the-wire format has better compatibility, while tarballs in on-the-disk format are ultimately >> faster to "clone". >> >> Two mega-moves (Jigsaw in 9, and monorepo in 10) inflated the on-the-disk size quite badly, as Joe >> indicated above, but on-the-wire format size seems to remain okay. So, if we enabled CDN-backed >> bundles-assisted clone, it should probably cut down clone pains, at least for our Europe-side folks, >> at the expense of some client CPU churn associated with converting on-the-wire to on-the-disk during >> the clone. >> >> Some optimization for on-the-disk size is possible if you re-clone the repo with >> "--config=format.generaldelta=1 --config=format.aggressivemergedeltas=1", thus optimizing internal >> .hg metadata. That would take a lot of time. If you have some time to spare, then it makes sense to >> do so. My build scripts do that automatically before packaging the .hg snapshots. >> >> Also, it seems that doing the "clone --pull" twice with generaldelta enabled compacts metadata even >> more: jdk/jdk .hg size fell from 1.5 GB to 1.2 GB uncompressed, and from 750M to 590M >> xz9-compressed. I just fixed my build scripts and currently testing them. 
>> >> -Aleksey -- Mario Torre Associate Manager, Software Engineering Red Hat GmbH 9704 A60C B4BE A8B8 0F30 9205 5D7E 4952 3F65 7898 From shade at redhat.com Mon Jul 30 13:27:38 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 30 Jul 2018 15:27:38 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: <1532954204.local-c2774ebe-1ee1-v1.3.0-fd741eb7@getmailspring.com> References: <3cd25e13-2b99-031e-b2de-ea406daf6ae0@redhat.com> <1532954204.local-c2774ebe-1ee1-v1.3.0-fd741eb7@getmailspring.com> Message-ID: On 07/30/2018 02:41 PM, Erik Österlund wrote: > I don't see the fascination with reinventing the source code hosting wheel again for our > project. Perhaps there is a good point to it, but I can't currently seem to see it. This "fascination" is pragmatism: deal with whatever is currently available, and move to doing the actual work. I have seen way too many attempts to fix something known good with something potentially perfect, breaking lots of stuff in progress (I am guilty of quite a few of these moves!). There are two very distinct starting points: OpenJDK with Git from day one, and OpenJDK migration from Mercurial to Git -- I am not sure the benefit of the second one is not marginal. > It seems like some are saying that with custom mercurial hacks we can achieve smaller repos competitive with git for cloning. Well, *I* am saying that there are known ways to deal with scale in Mercurial repos, and they do make sense in the context of a large project like OpenJDK. Calling them "hacks" is an appeal to emotion, which we need to avoid -- not least because many Git features can be labeled the same, in anger.
> But is it worth our focus and effort to reinvent source code hosting again because OpenJDK is so > special, instead of just putting it on github like everybody else and have small repos natively with > git (without hacks), good tooling, fast access with mirrors everywhere, backups, cross-repo object > sharing, programmable bots (that can be used to e.g. check automatically if it builds on Oracle > external platforms like PPC/S390/AArch64/Zero, so we can notice problems before they are pushed), > etc for free? I would personally rather ride on the source code hosting experience and expertise of > GitHub than to chase after homegrown solutions to patch the problems. Okay, this conflates moving to Git and moving to GitHub, which are different stories. Homegrown solutions are there for a reason: workflow control, security, reliability, performance are in your hands -- for better or for worse. Relying on (opinionated) community infrastructure is a risk that needs to be quantified. Doing things "like everybody else" is weird policy for large project -- which I am reminded about every time GitHub goes down, or I see another "do not submit PRs here" repo warning. Let's not pretend that moving to Git/GitHub is such a no-brainer idea. This is why I am very happy Project Skara is proposed to quantify risks and benefits of SCM infra changes. -Aleksey From stuart.monteith at linaro.org Mon Jul 30 15:35:14 2018 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Mon, 30 Jul 2018 16:35:14 +0100 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: Hi, This is perhaps tangential, but has there been any discussion on replacing http://hg.openjdk.java.net/ with https://hg.openjdk.java.net/ ? I would expect any future setup to take some measures to avoid man-in-the-middle issues. 
Stuart On Fri, 27 Jul 2018 at 05:23, joe darcy wrote: > > Hello, > > The source code management (SCM) system of a software project is a > fundamental piece of its infrastructure and workflows. Starting in > February 2008, the source code of different JDK releases and supporting > projects has been hosted in Mercurial repositories under > http://hg.openjdk.java.net/. Code reviews of JDK changes are typically > conducted as discussions in mailing lists over small patches sent to one > or more lists or over webrevs hosted on cr.openjdk.java.net. Since 2008, > many open source projects have successfully adopted more efficient SCM > and review tooling, in some cases provided by third parties. > > In order to help OpenJDK contributors be more productive, both seasoned > committers and relative newcomers, the Skara project proposes to > investigate alternative SCM and code review options for the JDK source > code, including options based upon Git rather than Mercurial, and > including options hosted by third parties. > > The Skara project intends to build prototypes of hosting the JDK 12 > sources under different providers. > > The evaluation criteria to consider include but are not limited to: > > * Performance: time for clone operations from master repos, time of > local operations, etc. > > * Space efficiency > > * Usability in different geographies > > * Support for common development environments such as Linux, Mac, > and Windows > > * Able to easily host the entire history of the JDK and the > projected growth of its history over the next decade > > * Support for general JDK code review practices > > * Programmatic APIs to enable process assistance and automation of > review and processes > > If one or more prototypes indicate a different SCM arrangement offers > substantial improvements over the current situation, the Skara project > will shepherd a JEP to change the SCM for the JDK. 
> > I propose to lead the project with the initial reviewers including but > not limited to Tim Bell (tbell), Erik Duveblad (ehelin), Erik Joelsson > (erikj), Tiep Vo (tiep), Tony Squier (squierts), and Robin Westberg > (rwestberg). > > We suggest the build group sponsor this work. > > Changing the bug tracking system is out of scope for this project and is > *not* under consideration. > > Comments? > > Cheers, > > -Joe > From patrick at reini.net Mon Jul 30 21:00:11 2018 From: patrick at reini.net (Patrick Reinhart) Date: Mon, 30 Jul 2018 23:00:11 +0200 Subject: Call for Discussion: New Project: Skara -- investigating source code management options for the JDK sources In-Reply-To: References: Message-ID: > On 30.07.2018 at 12:26, Mario Torre wrote: >> +1 on that. With a simple email-based review process (plus webrevs hosted on >> cr.o.j.n) I can easily scan dozens of incoming changes to see if they may be >> something I need to dive more deeply into. That is all lost if reviews >> happen inside some other system - even if a notification email is generated >> when such reviews are initiated. >> >> Changes to the review processes/tools should be kept as separate as possible >> from the selection of an underlying SCM. > > Yes, my point exactly, I can live with git if that's the consensus, > but not with a GitHub style process. > That's exactly what I tried to point out in my statement earlier. For me there is Git as the SCM, and then there is workflow handling such as GitHub provides (which, it seems to me, does not fit the OpenJDK kind of workflow) -Patrick