Some update on Cygwin hangs
Magnus Ihse Bursie
magnus.ihse.bursie at oracle.com
Thu Oct 18 02:39:43 PDT 2012
Erik and I have been chasing the cygwin instability for a while. This is
a report of our (or at least my :)) current understanding.
* Overall, cygwin builds seem pretty stable nowadays, even on Windows 7.
* The one glaring exception is a hang that's quite repeatable, under the
right circumstances.
* This hang happens while building images. Typically you see an output
about "ctsym" right before the hang. We've named this the "ctsym bug",
even though ctsym does not have anything to do with it. (It's just about
the last successful thing we managed to do before the hang).
* Adding a simple "echo" output between running create-jars and images
made the issue much harder to repeat.
* When running with JOBS=1, the hang is much harder to reproduce.
I made a small hack that ran just the images target with -j1, and that
ran orders of magnitude more times before hanging than when running with
default parallelism (4 on my machine). However, in the end, it still
hung, but only after about 2-3 days.
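
A minimal sketch of how such a repeat-until-hang run could be automated, rather than the actual hack used here. It assumes the coreutils `timeout` command is available in the cygwin installation; the function name `run_until_hang` and the time limit are hypothetical, and a process that is truly wedged may not react to the signal `timeout` sends.

```shell
# Run a command repeatedly and stop at the first run that exceeds a time limit.
# Usage: run_until_hang LIMIT_SECONDS MAX_RUNS COMMAND [ARGS...]
# (hypothetical helper, not part of the build system)
run_until_hang() {
    limit=$1
    max=$2
    shift 2
    i=1
    while [ "$i" -le "$max" ]; do
        # GNU timeout exits with status 124 when the command timed out.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "hang suspected on run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no hang in $max runs"
    return 0
}
```

It could be invoked as, for example, `run_until_hang 3600 1000 make -f Images.gmk JOBS=1`, where the one-hour limit and the run count are arbitrary choices for illustration.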
I have now managed to reproduce the hang with JOBS=1 LOG=trace. Of
course, adding debugging might change things enough that this is not the
real case, but it seems likely to be.
This is how far we got:
* make -f Images.gmk
* all static code (VAR=$(shell ...), mostly a bunch of finds) has been
executed.
* we have *just* started executing our first rule, which is at line 77
in Images.gmk:
$(JRE_IMAGE_DIR)/bin/%: $(JDK_OUTPUTDIR)/bin/%
	$(ECHO) $(LOG_INFO) Copying $(patsubst $(OUTPUT_ROOT)/%,%,$@)
	$(install-file)
The last few lines of output are:
Images.gmk:78: Building
/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin/attach.diz
(from
/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/jdk/bin/attach.diz)
(/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/jdk/bin/attach.diz
newer)
/usr/bin/echo Copying images/j2re-image/bin/attach.diz
and then we hang. The macro install-file is defined as follows:
ifeq ($(OPENJDK_TARGET_OS),solaris)
    # On Solaris, if the target is a symlink and exists, cp won't overwrite.
    define install-file
	# ...
    endef
else ifeq ($(OPENJDK_TARGET_OS),macosx)
    define install-file
	# ...
    endef
else
    define install-file
	$(MKDIR) -p $(@D)
	$(CP) -fP '$<' '$@'
    endef
endif
So we seem to get stuck at the mkdir. Let's check the output dir!
$ ls
/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin
ls: cannot access
/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images/j2re-image/bin:
No such file or directory
Aha! Not created. We haven't even started the recursive mkdir:
$ ls
/cygdrive/c/localdata/hg/build-infra-jdk8-b/build/windows-x86_64-normal-server-release/images
lib local_policy_jar.tmp src src.zip symbols US_export_policy_jar.tmp
Look, no j2re-image directory.
So what do we make of this? I don't know. I'm not sure how to proceed on
this, except to add some more debug output. It might be that multi-level
directory creation (mkdir -p needed to create both j2re-image and
j2re-image/bin) is unstable in make on Windows. On the other hand, since
we're running with LOG=trace, make should always execute the external
shell and not try to shortcut it for known operations.
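
One way to add that extra debug output, sketched here as a standalone shell function rather than as a change to the real install-file macro: it performs the same two steps (mkdir -p of the target directory, then cp -fP) and prints a marker to stderr before and after each one, so the last marker seen shows which step never returned. The function name `install_file_traced` and the marker texts are hypothetical.

```shell
# Shell equivalent of the default install-file steps, with trace markers.
# Usage: install_file_traced SOURCE TARGET
# (hypothetical helper for debugging, not the build's actual macro)
install_file_traced() {
    src=$1
    dst=$2
    dir=$(dirname "$dst")
    echo "install-file: before mkdir -p $dir" >&2
    mkdir -p "$dir" || return 1
    echo "install-file: after mkdir, before cp" >&2
    cp -fP "$src" "$dst" || return 1
    echo "install-file: done $dst" >&2
}
```

If the last marker in the log is "before mkdir -p", the hang is in or before the mkdir process itself; if no marker appears at all, the shell was never started.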
I didn't say I had a solution, just an update. :-)
/Magnus