RFR: race with nested repos in hgforest.sh

David Holmes david.holmes at oracle.com
Wed Feb 6 10:24:52 UTC 2013


Thanks Chris. I missed a couple of obvious things there. :(

One thing related to the repo name in the $(rc) variable. Somewhere the 
repo name gets mangled:

+ echo jdk/src/closed
+ sed -e s at ./@@ -e s@/@_ at g
+ repopidfile=jdsrc_closed

the / to _ doesn't work for the first /  (not part of your fix though).

I must say though that this bug still has me perplexed. I ran with set 
-x and observed the following. Given:

# Repositories:  corba jaxp jaxws langtools jdk hotspot  jdk/src/closed 
jdk/make/closed jdk/test/closed hotspot/make/closed
hotspot/src/closed hotspot/test/closed deploy install sponsors pubs

The first clone I see is corba and the second is jdk/make/closed. The 
very last clone  I see is jdk. BUT there was no error creating the 
sub-repos before the enclosing repo! ??? Makes no sense.

David
-----

On 6/02/2013 8:04 PM, Chris Hegarty wrote:
> Thank for the review. comments inline...
>
> On 06/02/2013 01:53, David Holmes wrote:
>> 204 # Terminate with exit 0 only if all subprocesses were successful
>> 205 ec=0
>> 206 if [ -d ${tmp} ]; then
>> 207 for rc in `ls ${tmp}/*.pid.rc` ; do
>>
>> I don't think you need to subshell an invocation of ls, simply do
>>
>> for rc in "${tmp}/*.pid.rc" ; do
>
> Yes this is better, but the quotes are incorrect. It should simply be
> for rc in ${tmp}/*.pid.rc ; do
>
>> 208 exit_code=`cat ${rc} | tr -d ' \n\r'`
>> 209 if [ "${exit_code}" != "0" ] ; then
>> 210 echo "WARNING: ${rc} exited abnormally."
>>
>> Not sure printing ${rc} will be very informative there ??
>
> It prints the file name which contains the subrepo name. Not pretty, but
> this gives more information about which clone/pull failed.
>
>> 211 ec=1
>>
>> Is there a "break" we can add here?
>
> Yes, we could break out/exit early, but I deliberately want to iterator
> over all the child process exits codes to better highlight which ones,
> if more than one, failed.
>
>> In get_source.sh
>>
>> # Get clones of all nested repositories
>> ! sh ./common/bin/hgforest.sh clone "$@" || exit 1
>>
>> # Update all existing repositories to the latest sources
>> sh ./common/bin/hgforest.sh pull -u
>>
>> don't we want "|| exit 1" for the pull case as well?
>
> Since the pull is the last command there is no need for "|| exit 1",
> since its exit code will be returned.
>
> -Chris.
>
>>
>> Thanks,
>> David
>>
>> On 6/02/2013 12:07 AM, Chris Hegarty wrote:
>>> On 02/05/2013 06:21 AM, David Holmes wrote:
>>>> Chris,
>>>>
>>>> When these failures occur does the failure get reflected in an error
>>>> exit code?
>>>
>>> Not currently, good point.
>>>
>>>> I'm seeing hudson builds merilly buildinh OpenJDK instead of Oracle JDK
>>>> because the closed repos were skipped due to this bug :(
>>>
>>> This is what I am seeing too, and the reason I proposed a patch to
>>> resolve it.
>>>
>>> Updated webrev;
>>> http://cr.openjdk.java.net/~chegar/hgforest_nestedRepos/webrev/
>>>
>>> 1) now checks, and returns, appropriate error code, so that higher
>>> level scripts can determine if it completed successfully
>>> 2) Added diagnostic error message to help identify potential future
>>> problems if a nested repo is blocked for too long
>>>
>>> -Chris.
>>>
>>>>
>>>> David
>>>>
>>>> On 2/02/2013 12:40 AM, Chris Hegarty wrote:
>>>>> [ to build-dev and core-libs-dev, expect reviewer from either, but
>>>>> will
>>>>> integrate through jdk8/tl ]
>>>>>
>>>>> This issue is mainly of interest to Oracle engineers, but it effects
>>>>> the
>>>>> public hgforest script.
>>>>>
>>>>> When hgforest.sh is run with an addition argument to specify a closed
>>>>> server, there is a problem/race between the creation of the
>>>>> directories
>>>>> to hold nested repositories and the clone itself. These directories
>>>>> need
>>>>> to be created before the clone command is executed, otherwise it will
>>>>> fail, as below.
>>>>>
>>>>> The trivial fix is to back off these nested repos until their
>>>>> containing
>>>>> directory exists.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~chegar/hgforest_nestedRepos/webrev/
>>>>>
>>>>> sh ./get_source.sh http://xxx.yyy.oracle.com | & tee clone.log
>>>>> # Repositories: corba jaxp jaxws langtools jdk hotspot jdk/src/closed
>>>>> jdk/make/closed jdk/test/closed hotspot/make/closed hotspot/src/closed
>>>>> hotspot/test/closed deploy install sponsors pubs
>>>>>
>>>>> Waiting 5 secs before spawning next background command.
>>>>> Waiting 5 secs before spawning next background command.
>>>>> Waiting 5 secs before spawning next background command.
>>>>> jdk/src/closed: /java/devtools/sparc/mercurial/0.9.5/bin/python -u
>>>>> /java/devtools/sparc/mercurial/latest/bin/hg clone
>>>>> http://xxx.yyy.oracle.com/jdk8/tl/jdk/src/closed jdk/src/closed
>>>>> jdk/src/closed: abort: No such file or directory: jdk/src/closed
>>>>> jdk/make/closed: /java/devtools/sparc/mercurial/0.9.5/bin/python -u
>>>>> /java/devtools/sparc/mercurial/latest/bin/hg clone
>>>>> http://xxx.yyy.oracle.com/jdk8/tl/jdk/make/closed jdk/make/closed
>>>>> jdk/make/closed: abort: No such file or directory: jdk/make/closed
>>>>> Waiting 5 secs before spawning next background command.
>>>>> ....
>>>>>
>>>>>
>>>>> -Chris.



More information about the core-libs-dev mailing list