hg push getting worse?

Tue Aug 11 11:58:20 PDT 2009

2009/8/11 John Coomes <John.Coomes at sun.com>:
> Andrew John Hughes (gnu_andrew at member.fsf.org) wrote:
>> > ...
>> > From my experience with the OpenJDK servers, I think it may be jcheck.
>> > Unfortunately, jcheck itself is not Free Software, so it is
>> > impossible for me to say for definite.  Instead, we seem to be
>> > discovering what checks it performs only when failures occur.
>> >
>> > I do know it does the following:
>> >
>> > * Checks the use of whitespace in changesets, rejecting the use of
>> > tabs and trailing whitespace.
>> > * Checks the format of the commit message.  It must either follow the
>> > bugid/summary/reviewer format documented in the developer guide or
>> > simply be 'Merge'.
>> > * Checks that the bugid used is not used by any other changeset.
>> >
>> > I've run into all of these with different changesets.  I can
>> > definitely see how the latter could take a while on something like the
>> > jdk repository (which dwarfs the others in size).
>
> I think jcheck is relatively light-weight compared to the zfs snapshot
> and auto-push that are also done.  But it would be nice to know for
> sure.
>

True.  It's hard to tell from the client side, as hg appears to hang at
'searching for changes' and then a burst of information suddenly
appears, reporting either success or failure.  Unfortunately, I presume
this is part of the design of hg, but it would be helpful if things
were more verbose.  I presume hooks were never meant to take enough
time for this to become an issue.

I can see how zfs snapshotting would take longer over time.  Although
each snapshot is, I presume, relative to the previous one, so that the
size remains fairly sane, the sheer number of them is likely to cause a
slowdown.  Is there a significant benefit to creating these?  And are
they also created in Maurizio's FX project?

>> > jcheck appears to perform its check on the changeset once committed to
>> > the remote repository and then performs a rollback if it fails,
>> > although you don't see any of this feedback in realtime on the client.
>> > I presume this also means it has to get Mercurial to generate the
>> > changeset (and others for the bugid check) which would take time.
>
> That's the standard way that mercurial hooks work, and the reason we
> have the extra *-gate forests.  In more recent versions of mercurial,
> the pending changesets aren't visible until the hooks have completed
> successfully, so the *-gate forests become unnecessary.  We haven't
> been able to update because the server-side of the forest extension
> doesn't work w/recent mercurial releases.
>

Is there any further news on whether the forest extension will become
a standard part of Mercurial? When I went searching for a client-side
version to support newer versions, I had to resort to using a snapshot
of their repository for forest.

>> > It would be much better if we could perform these sanity checks
>> > locally, though we'd still need some way of checking this had been
>> > done on the server side (or who knows, we could trust the developers
>> > to have done it...).
>>
>> I should also note that duplicate bugids can appear completely
>> legitimately in certain merge cases.  This is the current issue with
>> updating OpenJDK6's HotSpot.  Fixes for bugs occur both in the
>> original (rebased) OpenJDK6 HotSpot and the copy of HotSpot being
>> merged from the hs14 repository, resulting in duplicate bugids which
>> jcheck rejects.  I presume this is to protect against the case that
>> someone mistypes the bugid as another legitimate bugid, but it is
>> certainly an expensive check for what seems a fairly unlikely
>> occurrence.
>
> The goal is to prevent people from checking in fixes piece-meal,
> scattering a bug fix over 2, 3 or more changesets.  Having a bug fix
> isolated to a single changeset makes backporting it to a different
> release much, much easier.  We lived with the practice of partial
> fixes for quite some time, and I'd rather not go back.
>

I agree.  I'm just not sure that the best way of enforcing this is a
technical one.  The OpenJDK6 example is fairly convoluted, but I
imagine our IcedTea forest would be prone to this too, were it to be
jchecked.  We often commit a fix there early, and then also receive
the same fix in a later pull from the JDK7 forest.

All changes have to be approved first, so could this issue not be
flagged at that point?  The history could also be scanned for
duplicates then, if someone really wanted to be sure.  The case is
even worse with the whitespace checks.  Given jcheck clearly can find
the issues, can it not also fix them?  Or could some of these checks
be applied on the client side?  Worst case, could the whole repository
not just be sanitised on a regular basis?  As is, you push a commit,
wait several minutes and then get a result from jcheck saying there is
an extra space at the end of one line.  So you have to roll back the
commit, remove the space, reapply the commit with the same comments
and then push again.  In all, a lot of time is spent for very little
gain.
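
To make the client-side idea concrete, here is a rough sketch in Python
of what such local checks might look like.  It is not jcheck, whose
source I can't see; it only mirrors the three behaviours listed above.
The function names are made up for illustration, and the pattern for a
bugid line (seven digits, a colon, then the summary) is an assumption
on my part about the commit message format.

```python
import re

# Assumption: a bugid line is seven digits, a colon, a space and a summary.
# jcheck's real rules are not public, so this pattern is illustrative only.
BUGID_LINE = re.compile(r"^(\d{7}): \S")


def whitespace_problems(text):
    """Return (line_number, reason) pairs for tabs and trailing whitespace."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            problems.append((number, "tab"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
    return problems


def strip_trailing_whitespace(text):
    """Return text with trailing whitespace removed from every line."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())


def duplicate_bugids(messages):
    """Return the bugids that appear in more than one commit message."""
    counts = {}
    for message in messages:
        # Collect ids per message so one message cannot count twice.
        ids = set()
        for line in message.splitlines():
            match = BUGID_LINE.match(line)
            if match:
                ids.add(match.group(1))
        for bugid in ids:
            counts[bugid] = counts.get(bugid, 0) + 1
    return sorted(bugid for bugid, count in counts.items() if count > 1)
```

Something along these lines could be run over the changed files and
the commit message before pushing, so that a stray trailing space is
caught in seconds rather than after a several-minute round trip.  The
server would still need its own check, but it would rarely fail.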

> -John
>
>

-- 
Andrew :-)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

Support Free Java!
Contribute to GNU Classpath and the OpenJDK
http://www.gnu.org/software/classpath
http://openjdk.java.net

PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint: F8EF F1EA 401E 2E60 15FA  7927 142C 2591 94EF D9D8