Recording source information in a build
Kelly O'Hair
Kelly.Ohair at Sun.COM
Thu Apr 3 18:36:19 UTC 2008
Problem Statement:
Given a build of the OpenJDK, how can you find out what source was used to
build this binary install?
Seed of a Solution:
With Mercurial, a single repository changeset number identifies the state of
the complete source repository. If this changeset (or set of changesets)
could be somehow recorded with the built bits, then given any build you
could quickly and easily reconstruct the exact source files that were used
at build time.
Problems:
We have a forest, not a single repository.
We often create source bundles (the sources minus the SCM data, e.g. ".hg"),
so this needs to work when building from source bundles too.
Possible Solution:
The first issue is identifying each repository of the forest relative to the root of the forest.
Each repository would get a managed file ".identification" containing
information that identifies the repository.
For example, the topmost OpenJDK one would have a ".identification" file containing:
root=.
directory=.
description=Root of the JDK Source Tree
and the corba one would have:
root=..
directory=corba
description=Corba Sources
etc. (the directory could be a deeper nested directory, like jdk/src/closed)
This .identification file would be a permanent file in the repository, at the root
of the repository. It's saying that to get to the root of the forest, you
'cd ${root}'. And if this repository is not located at ${root}/${directory}
something is wrong, or the repository is not currently part of a forest.
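
The location check described above could be sketched roughly as follows. This is only an illustration of the idea, not part of the proposal: the function name check_identification is hypothetical, and it assumes the file holds plain root= and directory= lines as in the examples.

```shell
# Hypothetical helper: succeed only if the repository at $1 really sits at
# ${root}/${directory} as claimed by its .identification file.
check_identification() (
    unset CDPATH                       # keep "cd" from searching other places
    repo=$1
    idfile="$repo/.identification"
    [ -f "$idfile" ] || { echo "no .identification in $repo" >&2; exit 2; }

    # Pull the two key=value fields we need out of the file.
    root=$(sed -n 's/^root=//p' "$idfile")
    directory=$(sed -n 's/^directory=//p' "$idfile")

    # Resolve both locations physically so they compare as plain strings.
    here=$(cd "$repo" && pwd -P) || exit 2
    expected=$(cd "$repo" && cd "$root" && cd "./$directory" 2>/dev/null && pwd -P) || exit 1

    [ "$here" = "$expected" ]
)
```

A non-zero status then means either "something is wrong" or "not currently part of a forest"; this sketch does not try to tell the two apart.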
The second issue is the changeset id.
A second file called ".changeset" would not be a managed file. It would be created
before the source bundles are created, and would simply not exist if it can't be created,
either because you don't have repositories (building from raw source trees) or because
you don't have access to 'hg'. Each file would contain just a changeset=id line, created with:
hg tip --template 'changeset={node}\n'
So somewhere this needs to happen, before source bundles are created and before
the use of this data:
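TREES:=$(shell hg ftrees)
# As a Makefile recipe: "\" joins the lines into one shell command, "$$i" is the shell's $i.
	if [ "$(TREES)" != "" ] ; then \
	  for i in $(TREES) ; do \
	    ( cd $$i && hg tip --template 'changeset={node}\n' > .changeset ) ; \
	  done ; \
	fi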
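
For a single tree, the "create it when possible, otherwise do without" behavior might look like the sketch below. The function name write_changeset and the HG override variable are assumptions made for illustration. Removing a stale file when a repository is present but 'hg' is not usable is also an assumption, not something the proposal specifies.

```shell
# Hypothetical helper: write or refresh .changeset for the tree at $1.
# HG names the Mercurial executable (assumption: defaults to "hg").
write_changeset() (
    unset CDPATH
    repo=$1
    hg_cmd=${HG:-hg}
    cd "$repo" || exit 1

    # Source bundle or raw tree: no .hg, so leave any shipped .changeset alone.
    [ -d .hg ] || exit 0

    # Repository present but hg unusable: a stale file would mislead, drop it.
    if ! command -v "$hg_cmd" >/dev/null 2>&1 ; then
        rm -f .changeset
        exit 0
    fi

    # Write to a temporary name so a failed run leaves no partial file.
    if "$hg_cmd" tip --template 'changeset={node}\n' > .changeset.tmp ; then
        mv .changeset.tmp .changeset
    else
        rm -f .changeset.tmp .changeset
        exit 1
    fi
)
```

Note that 'hg tip' names the newest changeset in the repository, which is not necessarily the revision the working directory is updated to; whether that distinction matters here is left open.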
Third, all this data needs to be merged together into a file that could be
used later to recreate the source tree by running:
hg clone -r ${changeset} http://hg.openjdk.java.net/jdk7/${directory} ${directory}
once for each repository recorded in the file.
The Makefiles would check for the .changeset files and tolerate their absence,
since they might not be there in all cases. But when they are there, do something like:
# Recipe lines are tab-indented; "\" joins lines, "$$i" is the shell's $i.
jdk_source_information.txt:
	$(RM) $@
	echo "# JDK Source Information" > $@
	if [ "$(TREES)" != "" ] ; then \
	  for i in $(TREES) ; do \
	    if [ -f $$i/.identification ] ; then \
	      cat $$i/.identification >> $@ ; \
	      if [ -f $$i/.changeset ] ; then \
	        cat $$i/.changeset >> $@ ; \
	      fi ; \
	    fi ; \
	  done ; \
	fi
Resulting in a file:
# JDK Source Information
root=.
directory=.
description=Root of the JDK Source Tree
changeset=BIGHEXNUMBER
root=..
directory=corba
description=Corba Sources
changeset=BIGHEXNUMBER
...
This file would be left in the JDK install tree.
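
Going the other way, a file in this format could be turned back into one clone command per repository. The sketch below only prints the commands and does not run them. The function name print_clone_commands and the base-URL argument are assumptions for illustration, and it treats each root= line as the start of a new record.

```shell
# Hypothetical helper: print one "hg clone" line per record in the
# source information file $1, using $2 as the base URL.
print_clone_commands() {
    awk -F= -v base="$2" '
        # Print the pending record if it carries both fields, then reset.
        function emit() {
            if (dir != "" && cs != "")
                printf "hg clone -r %s %s/%s %s\n", cs, base, dir, dir
            dir = ""; cs = ""
        }
        $1 == "root"      { emit() }      # a root= line starts a new record
        $1 == "directory" { dir = $2 }
        $1 == "changeset" { cs = $2 }
        END               { emit() }
    ' "$1"
}

# Example call, printing the commands for review:
#   print_clone_commands jdk_source_information.txt http://hg.openjdk.java.net/jdk7
```

Records without a changeset line are skipped. The root entry (directory=.) is passed through literally, so it would probably need special handling before the printed commands are actually run.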
---
Just a first guess at a basic idea as to how this could work...
Please don't assume the above is also an implementation; it's just the basic idea
of having members of the forest identify themselves, of recording the
changesets, and finally of leaving source information in the resulting
binary build.
Comments?
-kto
P.S. Full RFE can be seen at: http://bugs.sun.com/view_bug.do?bug_id=6631003