Some update on Cygwin hangs

David Holmes david.holmes at oracle.com
Thu Oct 18 03:46:59 PDT 2012


So when this hangs can you see a mkdir process and the associated shell?
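
If it helps, here is a minimal sketch of one way to check. It assumes that `ps -ef` prints one process per line with the command path in some field, which as far as I know holds for both the Cygwin and the Linux ps. The helper name match_procs is made up for illustration.

```shell
# Hypothetical helper: read a process listing on stdin and print every
# line in which some field's last path component equals the given name.
match_procs() {
    awk -v name="$1" '{
        for (i = 1; i <= NF; i++) {
            n = split($i, parts, "/")
            if (parts[n] == name) { print; next }
        }
    }'
}

# Possible use while the build is hung (run from a second terminal):
#   ps -ef | match_procs mkdir
#   ps -ef | match_procs sh
```

The parent PID column of any mkdir line should then point at the shell that make started for the recipe, if there is one.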

David

On 18/10/2012 7:39 PM, Magnus Ihse Bursie wrote:
> Erik and I have been chasing the cygwin instability for a while. This is
> a report of our (or at least my :)) current understanding.
> * Overall, cygwin builds seem pretty stable nowadays, even on Windows 7.
> * The one glaring exception is a hang that's quite repeatable, under the
> right circumstances.
> * This hang happens while building images. Typically you see output
> about "ctsym" right before the hang. We've named this the "ctsym bug",
> even though ctsym does not have anything to do with it. (It's just
> the last successful thing we managed to do before the hang.)
> * Adding a simple "echo" output between running create-jars and images
> made the issue much harder to repeat.
> * When running with JOBS=1, the hang is much harder to reproduce.
>
> I made a small hack that ran just the images target with -j1, and that
> ran orders of magnitude more times before hanging than when running with
> default parallelism (4 on my machine). However, in the end, it still
> hung, but only after 2-3 days.
>
> I have now managed to reproduce the hang with JOBS=1 LOG=trace. Of
> course, adding debugging might change things enough that this is not the
> same hang, but it seems likely to be.
>
> This is how far we got:
> * make -f Images.gmk
> * all static code (VAR=$(shell ...), mostly a bunch of finds) has been
> executed.
> * we have *just* started executing our first rule, which is at line 77
> in Images.gmk:
> $(JRE_IMAGE_DIR)/bin/%: $(JDK_OUTPUTDIR)/bin/%
> 	$(ECHO) $(LOG_INFO) Copying $(patsubst $(OUTPUT_ROOT)/%,%,$@)
> 	$(install-file)
>
> The last few lines of output are:
> Images.gmk:78: Building
> /cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin/attach.diz
> (from
> /cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/jdk/bin/attach.diz)
> (/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/jdk/bin/attach.diz
> newer)
> /usr/bin/echo Copying images/j2re-image/bin/attach.diz
>
> and then we hang. The macro install-file is defined as follows:
> ifeq ($(OPENJDK_TARGET_OS),solaris)
> # On Solaris, if the target is a symlink and exists, cp won't overwrite.
> define install-file
> 	# ...
> endef
> else ifeq ($(OPENJDK_TARGET_OS),macosx)
> define install-file
> 	# ...
> endef
> else
> define install-file
> 	$(MKDIR) -p $(@D)
> 	$(CP) -fP '$<' '$@'
> endef
> endif
>
> So we seem to get stuck at the mkdir. Let's check the output dir!
> $ ls
> /cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin
>
> ls: cannot access
> /cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin:
> No such file or directory
>
> Aha! Not created. We haven't even started the recursive mkdir:
> $ ls
> /cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images
>
> lib local_policy_jar.tmp src src.zip symbols US_export_policy_jar.tmp
>
> Look, no j2re-image directory.
>
> So what do we make of this? I don't know. I'm not sure how to proceed on
> this, except to add some more debug output. It might be that multi-level
> directory creation (mkdir -p has to create both j2re-image and
> j2re-image/bin) is unstable in make on Windows. On the other hand, since
> we're running with LOG=trace, make should always execute the external
> shell and not try to shortcut it for known operations.
>
> I didn't say I had a solution, just an update. :-)
>
> /Magnus
>
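
One way to get the extra debug output mentioned above would be to run the two steps of install-file by hand, outside of make, and log before and after each one. The following is only a sketch under that assumption: the function name install_file_traced and the paths in the usage comment are hypothetical, and there is no evidence that the hang reproduces outside of make.

```shell
# Hypothetical stand-in for the generic install-file macro:
# mkdir -p on the target directory, then cp -fP, with trace output.
install_file_traced() {
    src=$1
    dst=$2
    dstdir=$(dirname "$dst")
    echo "mkdir start: $dstdir"
    mkdir -p "$dstdir" || return 1
    echo "mkdir done: $dstdir"
    cp -fP "$src" "$dst" || return 1
    echo "cp done: $dst"
}

# Possible stress loop (paths are placeholders, not the real build tree):
#   while install_file_traced jdk/bin/attach.diz images/j2re-image/bin/attach.diz
#   do rm -rf images/j2re-image; done
```

If the last line printed before a hang were "mkdir start", that would point at the multi-level mkdir itself rather than at how make launches it.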



More information about the build-infra-dev mailing list