RFR: race with nested repos in hgforest.sh
Chris Hegarty
chris.hegarty at oracle.com
Wed Feb 6 10:04:56 UTC 2013
Thank for the review. comments inline...
On 06/02/2013 01:53, David Holmes wrote:
> 204 # Terminate with exit 0 only if all subprocesses were successful
> 205 ec=0
> 206 if [ -d ${tmp} ]; then
> 207 for rc in `ls ${tmp}/*.pid.rc` ; do
>
> I don't think you need to subshell an invocation of ls, simply do
>
> for rc in "${tmp}/*.pid.rc" ; do
Yes this is better, but the quotes are incorrect. It should simply be
for rc in ${tmp}/*.pid.rc ; do
> 208 exit_code=`cat ${rc} | tr -d ' \n\r'`
> 209 if [ "${exit_code}" != "0" ] ; then
> 210 echo "WARNING: ${rc} exited abnormally."
>
> Not sure printing ${rc} will be very informative there ??
It prints the file name which contains the subrepo name. Not pretty, but
this gives more information about which clone/pull failed.
> 211 ec=1
>
> Is there a "break" we can add here?
Yes, we could break out/exit early, but I deliberately want to iterator
over all the child process exits codes to better highlight which ones,
if more than one, failed.
> In get_source.sh
>
> # Get clones of all nested repositories
> ! sh ./common/bin/hgforest.sh clone "$@" || exit 1
>
> # Update all existing repositories to the latest sources
> sh ./common/bin/hgforest.sh pull -u
>
> don't we want "|| exit 1" for the pull case as well?
Since the pull is the last command there is no need for "|| exit 1",
since its exit code will be returned.
-Chris.
>
> Thanks,
> David
>
> On 6/02/2013 12:07 AM, Chris Hegarty wrote:
>> On 02/05/2013 06:21 AM, David Holmes wrote:
>>> Chris,
>>>
>>> When these failures occur does the failure get reflected in an error
>>> exit code?
>>
>> Not currently, good point.
>>
>>> I'm seeing hudson builds merilly buildinh OpenJDK instead of Oracle JDK
>>> because the closed repos were skipped due to this bug :(
>>
>> This is what I am seeing too, and the reason I proposed a patch to
>> resolve it.
>>
>> Updated webrev;
>> http://cr.openjdk.java.net/~chegar/hgforest_nestedRepos/webrev/
>>
>> 1) now checks, and returns, appropriate error code, so that higher
>> level scripts can determine if it completed successfully
>> 2) Added diagnostic error message to help identify potential future
>> problems if a nested repo is blocked for too long
>>
>> -Chris.
>>
>>>
>>> David
>>>
>>> On 2/02/2013 12:40 AM, Chris Hegarty wrote:
>>>> [ to build-dev and core-libs-dev, expect reviewer from either, but will
>>>> integrate through jdk8/tl ]
>>>>
>>>> This issue is mainly of interest to Oracle engineers, but it effects
>>>> the
>>>> public hgforest script.
>>>>
>>>> When hgforest.sh is run with an addition argument to specify a closed
>>>> server, there is a problem/race between the creation of the directories
>>>> to hold nested repositories and the clone itself. These directories
>>>> need
>>>> to be created before the clone command is executed, otherwise it will
>>>> fail, as below.
>>>>
>>>> The trivial fix is to back off these nested repos until their
>>>> containing
>>>> directory exists.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~chegar/hgforest_nestedRepos/webrev/
>>>>
>>>> sh ./get_source.sh http://xxx.yyy.oracle.com | & tee clone.log
>>>> # Repositories: corba jaxp jaxws langtools jdk hotspot jdk/src/closed
>>>> jdk/make/closed jdk/test/closed hotspot/make/closed hotspot/src/closed
>>>> hotspot/test/closed deploy install sponsors pubs
>>>>
>>>> Waiting 5 secs before spawning next background command.
>>>> Waiting 5 secs before spawning next background command.
>>>> Waiting 5 secs before spawning next background command.
>>>> jdk/src/closed: /java/devtools/sparc/mercurial/0.9.5/bin/python -u
>>>> /java/devtools/sparc/mercurial/latest/bin/hg clone
>>>> http://xxx.yyy.oracle.com/jdk8/tl/jdk/src/closed jdk/src/closed
>>>> jdk/src/closed: abort: No such file or directory: jdk/src/closed
>>>> jdk/make/closed: /java/devtools/sparc/mercurial/0.9.5/bin/python -u
>>>> /java/devtools/sparc/mercurial/latest/bin/hg clone
>>>> http://xxx.yyy.oracle.com/jdk8/tl/jdk/make/closed jdk/make/closed
>>>> jdk/make/closed: abort: No such file or directory: jdk/make/closed
>>>> Waiting 5 secs before spawning next background command.
>>>> ....
>>>>
>>>>
>>>> -Chris.
More information about the build-dev
mailing list