Looking ahead: proposed Hg forest consolidation for JDK 10

Thomas Stüfe thomas.stuefe at gmail.com
Wed Oct 19 05:20:34 UTC 2016


Hi Dean,

On Tue, Oct 18, 2016 at 9:43 PM, <dean.long at oracle.com> wrote:

> Hi Thomas,
>
>
>
> On 10/18/16 7:39 AM, Thomas Stüfe wrote:
>
>> Hi all,
>>
>> On Mon, Oct 17, 2016 at 2:22 PM, Lindenmaier, Goetz <
>> goetz.lindenmaier at sap.com> wrote:
>>
>>> Hi,
>>>
>>> I'm on a 1.6 TB share available to our team visible on all our
>>> servers. The machine is a 48 processor 64GB linux x86_64 server.
>>> The servers only have limited local disc space.  Especially with the
>>> new setup I can not have clones on all the machines I need to compile
>>> and test on.
>>>
>>> Best regards,
>>>    Goetz.
>>>
>>>
>> I'm on Goetz's team at SAP. Just wanted to give some additional
>> information:
>>
>> I keep reading "use local repos". This is unfortunately not practical for
>> us. Limited local disk space on build machines is only the smallest of the
>> problems. A bigger problem is our platform breadth.
>>
>> We have a large zoo of machines running a number of cpu/os combinations.
>> When we develop for the OpenJDK, we want to build and test on multiple -
>> preferably all - machines and platforms the OpenJDK is supported on, to
>> make sure we do not introduce regressions. But we have no automatic build
>> system for these platforms, nor do we have access to your jprt.
>>
>> So, as an example, I built the fix for JDK-8166944 on Linux (x64, ppc,
>> s390), Solaris (sparc, x64), AIX, MacOS, and Windows x64. I usually leave
>> out platforms I think are safe - in this case 32bit and zero - but at my
>> own risk: if I introduce a regression on a platform I did not build, I
>> risk annoying other developers.
>>
>> Therefore local repos are not practical - syncing local repos across many
>> machines is a pain and very error prone. So every developer keeps his
>> repos on a filer (NFS), and if the NFS client on a certain machine is not
>> good, we wait and suffer.
>>
>> So performance matters for us a lot.
>>
>
> I'm curious if you are using file caching on the NFS clients.  That
> feature should be available on linux and solaris at least, and would
> hopefully improve performance.
>
>
Thanks for the hint! Not all of the machines we work on are under our
control, but we will see what we can do.
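
A quick way to see where client-side caching is already active could look
like the sketch below. It assumes Linux clients, where the `fsc` NFS mount
option enables FS-Cache (normally backed by the cachefilesd daemon). Whether
that is the feature Dean has in mind is an assumption, and the function name
`nfs_cache_status` is made up for illustration.

```shell
# Print each NFS mount point followed by "fsc" or "no-fsc".
# Reads /proc/mounts by default; another file in the same format can be
# passed as the first argument.
nfs_cache_status() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 ~ /^nfs4?$/ {
        # Field 4 holds the comma-separated mount options.
        cached = ($4 ~ /(^|,)fsc(,|$)/) ? "fsc" : "no-fsc"
        print $2, cached
    }' "$mounts_file"
}
```

Running `nfs_cache_status` on each build machine would show which filer
mounts lack the option.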

Thomas


> dl
>
>
> Kind Regards, Thomas
>>
>>
>> -----Original Message-----
>>>> From: Erik Helin [mailto:erik.helin at oracle.com]
>>>> Sent: Monday, 17 October 2016 13:42
>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>>> Cc: Anthony Vanelverdinghe <anthony.vanelverdinghe at gmail.com>; jdk9-
>>>> dev at openjdk.java.net
>>>> Subject: Re: Looking ahead: proposed Hg forest consolidation for JDK 10
>>>>
>>>> On 2016-10-17, Lindenmaier, Goetz wrote:
>>>>
>>>>> Hi Anthony and Joe,
>>>>>
>>>>>> such as hg rebase, hg histedit, hg graft, hg strip, hg strip --keep,
>>>>>> and hg commit --amend" [7].
>>>>
>>>>> I understand how these commands replace hg queues.
>>>>> I don't understand how they replace having several
>>>>> clones to do something like this:
>>>>>   -  Change 1 is compiling  (in clone 1)
>>>>>   -  Debugging change 2 (in clone 2)
>>>>>   - I get a review and want to immediately edit change 3 (in clone 3),
>>>>>     without invalidating the sources I stepped in the debugging session
>>>>>     and without canceling the builds.
>>>>>
>>>> I would recommend two similar but slightly different workflows for this:
>>>> - local clones (as previously discussed). One local clone per
>>>>    feature/bug/review/debugging-session.
>>>> - shares, using `hg share` and bookmarks. One bookmark in each hg share.
>>>>
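
The second workflow above can be sketched as follows. This is a minimal
sketch, not the commands anyone in this thread actually ran: the function
name `new_share` and the change names are hypothetical, and the share
extension is enabled on the command line in case it is not set in hgrc.

```shell
# Create one shared working copy per change, with one bookmark in each.
# Usage: new_share <source-clone> <change-name>
new_share() {
    src="$1"
    name="$2"
    # --bookmarks makes the new share use the source clone's bookmarks.
    hg --config extensions.share= share --bookmarks "$src" "$name" || return 1
    # Create and activate a bookmark named after the change in the share.
    hg -R "$name" bookmark "$name"
}
```

For the three parallel changes described earlier, this would be called as
`new_share consol-proto change-1`, then again for `change-2` and `change-3`.
Each share has its own working directory, so a build or debugging session
in one is not disturbed by edits in another.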
>>>>> If I use a single clone, I can have these three changes in three
>>>>> branches, or in several sequential changes (like a mercurial queue) I
>>>>> keep reordering with histedit. But as there is only one source tree I
>>>>> can only work on one of the changes at a time.
>>>>> Hg share will do this job I assume.
>>>>>
>>>>> I have been looking at the consol-proto:
>>>>>
>>>> The timings seem an order of magnitude off? What kind of machine are
>>>> you using? See inline for my measurements, all done on a 1 year old
>>>> Samsung SSD with an ext4 filesystem and Linux kernel 4.3.3. I'm using hg
>>>> version 3.8.1.
>>>>
>>>>> hg clone takes 30 minutes. Before, get_source.sh took 20 mins.
>>>>>
>>>> $ time hg clone http://hg.openjdk.java.net/jdk9/consol-proto
>>>> requesting all changes
>>>> adding changesets
>>>> adding manifests
>>>> adding file changes
>>>> added 41157 changesets with 358201 changes to 148305 files
>>>> updating to branch default
>>>> 53435 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>>>
>>>> real    17m42.057s
>>>> user    4m37.156s
>>>> sys     0m41.153s
>>>>
>>>> The time for the remote clone will depend mostly on your (and
>>>> Oracle's) network. The amount of data that needs to be transferred is
>>>> almost the same (it differs by a few MB, IIRC) compared to a forest. The
>>>> difference is that the forest downloads the metadata for multiple
>>>> repositories in parallel.
>>>>
>>>> I know you find it cumbersome, but my recommendation here would be to
>>>> not clone that often from the remote servers. Since mercurial is a DVCS,
>>>> you already have most of the bits locally on your machine if you've
>>>> already cloned once.
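
One way to follow this recommendation is to keep a single base clone up to
date and create further clones from it rather than from the server. The
sketch below assumes that setup. The function name `clone_from_local` and
its arguments are hypothetical.

```shell
# Refresh an existing local base clone from the remote, then clone locally.
# Usage: clone_from_local <local-base-clone> <remote-url> <new-clone-dir>
clone_from_local() {
    base="$1"
    remote="$2"
    dest="$3"
    # Only changesets missing from the base clone travel over the network.
    hg -R "$base" pull "$remote" || return 1
    # The clone itself reads from local disk (or the shared filer) only.
    hg clone "$base" "$dest"
}
```

Note that a clone created this way has the base clone, not the server, as
its default path.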
>>>>
>>>>> Hg share takes 5 minutes. Before, hg clone of hotspot took 3 mins.
>>>>>
>>>> $ hg clone http://hg.openjdk.java.net/jdk9/consol-proto
>>>> $ time hg share --bookmarks consol-proto share
>>>> updating working directory
>>>> 53435 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>>>
>>>> real    0m7.528s
>>>> user    0m28.442s
>>>> sys     0m9.791s
>>>>
>>>> I have no idea why it takes 5 (or even 3) minutes on your machine?
>>>>
>>>>> Hg diff takes 32 secs!!!, before it took 4 secs on hotspot repo.
>>>>>
>>>> $ hg clone http://hg.openjdk.java.net/jdk9/consol-proto
>>>> $ cd consol-proto/src/hotspot/
>>>> $ wget http://cr.openjdk.java.net/~goetz/wr16/8166560-basic_s390/hotspot.wr04/hotspot.changeset
>>>> $ patch -p2 < hotspot.changeset # skip changes to jdk.hotspot.agent
>>>> $ time hg diff
>>>> M src/hotspot/os/linux/vm/os_linux.cpp
>>>> M src/hotspot/share/tools/hsdis/hsdis.c
>>>> M src/hotspot/share/vm/code/codeCache.cpp
>>>> M src/hotspot/share/vm/interpreter/abstractInterpreter.hpp
>>>> M src/hotspot/share/vm/runtime/globals.hpp
>>>> M src/hotspot/share/vm/runtime/vm_version.cpp
>>>> M src/hotspot/share/vm/utilities/macros.hpp
>>>> ? src/hotspot/hotspot.changeset
>>>> ? src/hotspot/share/vm/interpreter/abstractInterpreter.hpp.orig
>>>> ? src/hotspot/share/vm/runtime/globals.hpp.orig
>>>>
>>>> real    0m0.787s
>>>> user    0m0.587s
>>>> sys     0m0.200s
>>>>
>>>> What patch/changes did you apply before running `hg diff`? I have a hard
>>>> time getting `hg diff` to take longer than 1 second...
>>>>
>>>>> The full repo requires 1.9G.
>>>>>
>>>> $ hg clone http://hg.openjdk.java.net/jdk9/consol-proto
>>>> $ du -ms consol-proto
>>>> 1754    consol-proto/
>>>>
>>>> I don't know why yours is 191 MB larger than mine. Different filesystem
>>>> and/or hg version?
>>>>
>>>>> A 'hg share' repo requires 0.6G
>>>>>
>>>> $ mkdir measurements && cd measurements
>>>> $ hg clone http://hg.openjdk.java.net/jdk9/consol-proto
>>>> $ du -ms .
>>>> 1754    .
>>>>
>>>> $ hg share consol-proto share
>>>> $ du -ms .
>>>> 2415    .
>>>>
>>>> so 2415 - 1754 = 661 MB for a share, so similar to my measurements.
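
The subtraction above can be scripted so it is easy to repeat on other
machines or filesystems. This is only a sketch. The helper names `dir_mb`
and `share_overhead_mb` are made up for illustration.

```shell
# Disk usage of a directory in whole megabytes, as reported by `du -ms`.
dir_mb() {
    du -ms "$1" | cut -f1
}

# Extra space taken by a share: usage after creating it minus usage before.
# Usage: share_overhead_mb <before-mb> <after-mb>
share_overhead_mb() {
    before_mb="$1"
    after_mb="$2"
    echo $((after_mb - before_mb))
}
```

With the figures from this measurement, `share_overhead_mb 1754 2415` prints
661. On a live setup the two arguments would come from `dir_mb .` run before
and after `hg share`.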
>>>>
>>>> Thanks,
>>>> Erik
>>>>
>>>>> A hotspot repo before required 0.2G.
>>>>>
>>>>> I will be able to live with this using modern, slower functionality ;)
>>>>> But it imposes a considerable overhead in hardware, tool runtime
>>>>> and administration on my side.
>>>>>
>>>>> Best regards,
>>>>>    Goetz.
>>>>>
>>>>>
>>>>> Can I check out several
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>>> From: jdk9-dev [mailto:jdk9-dev-bounces at openjdk.java.net] On Behalf
>>>>>> Of Anthony Vanelverdinghe
>>>>>> Sent: Friday, 14 October 2016 22:11
>>>>>> To: jdk9-dev at openjdk.java.net
>>>>>> Subject: Re: Looking ahead: proposed Hg forest consolidation for JDK 10
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> While I'm not an OpenJDK committer (yet, hope to become so one fine
>>>>>> day), I believe this is a great initiative. Since several people have
>>>>>> raised concerns related to the increased repository size, I just wanted
>>>>>> to point out that both Facebook and Mozilla work with Mercurial
>>>>>> repositories which dwarf a typical OpenJDK repository. For example, a
>>>>>> clone of mozilla-central [1] is 2.79 GB, whereas a clone of jdk9-dev is
>>>>>> less than 1 GB.
>>>>>>
>>>>>> There are also several Mercurial extensions which may prove to be useful
>>>>>> for people having to work with large repositories and/or adapt their
>>>>>> workflows.
>>>>>> During their work to scale Mercurial [2], Facebook contributed/made
>>>>>> several extensions, such as fsmonitor as mentioned by Joe [3], and the
>>>>>> ones in their BitBucket repository [4], such as remotefilelog [5].
>>>>>> When working with local clones, the share extension may be helpful [6].
>>>>>> Finally, note that mq is "often considered for deprecation", so this may
>>>>>> be an opportunity to adopt "modern tools, such as hg rebase, hg
>>>>>> histedit, hg graft, hg strip, hg strip --keep, and hg commit
>>>>>> --amend" [7].
>>>>>>
>>>>>> Kind regards,
>>>>>> Anthony
>>>>>>
>>>>>> [1] https://hg.mozilla.org/mozilla-central/
>>>>>> [2] https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/
>>>>>> [3] http://mail.openjdk.java.net/pipermail/jdk9-dev/2016-October/004990.html
>>>>>> [4] https://bitbucket.org/facebook/hg-experimental
>>>>>> [5] https://bitbucket.org/facebook/hg-experimental/src/d2c3a2c02eb6c7e5a7331ba0cf15e5bf7c8dc8dc/remotefilelog/?at=default
>>>>>> [6] https://www.mercurial-scm.org/wiki/ShareExtension
>>>>>> [7] https://www.mercurial-scm.org/wiki/MqExtension
>>>>>>
>>>>>> On 14/10/2016 19:03, Brian Goetz wrote:
>>>>>>
>>>>>>>> Conversely, I think it is reasonable for engineers making changes to
>>>>>>>> the JDK to be willing to offer some flexibility in adjusting
>>>>>>>> established workflows optimized for the split repositories to
>>>>>>>> accommodate the sort of infrastructure changes being proposed here
>>>>>>>> for a consolidated one.
>>>>>>>>
>>>>>>> Let me amplify this: OpenJDK developers are not the only stakeholders
>>>>>>> here. By aligning more with the way the rest of the world develops
>>>>>>> -- all code in one linearized, transactionally updated repo -- it
>>>>>>> increases the feasibility / reduces the cost of tools like 'bisect' to
>>>>>>> determine where a fault was introduced. This reduces SQE costs and
>>>>>>> increases product quality -- something we all have a stake in. David
>>>>>>> Lloyd has pointed up other tooling-related benefits, such as making it
>>>>>>> easier to maintain a git mirror.
>>>>>>>
>>>>>>> Most of the objections raised so far have been "(I think they will)
>>>>>>> make my life harder."  Fair enough; people should be their own
>>>>>>> advocates.  But let's not forget the significant benefits that accrue
>>>>>>> to *everyone* as a result, and keep those in mind when judging the
>>>>>>> pros and cons.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>