History data for "JDK-8058150: Compile for Specific Platform Version"

Jan Lahoda jan.lahoda at oracle.com
Wed Mar 4 16:02:52 UTC 2015


Magnus, Jon,

As per your suggestions, I've split the ct.sym.txt into several files,
approximately one per (current) module. I've also changed the format to
baseline+change files. The biggest file is the java.desktop baseline
(OpenJDK 7 content), which is <5MB, followed by the java.base baseline at
~4.5MB. The total size of all these files inside the .hg folder
is still ~1.7MB.

The files can be seen here:
http://hg.openjdk.java.net/jdk9/sandbox/file/7f017edb8377/make/data/symbols

An example of the change file:
http://hg.openjdk.java.net/jdk9/sandbox/file/7f017edb8377/make/data/symbols/jdk.dev-8.sym.txt

Does this look reasonable?

Thanks for your help!

Jan
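
For readers following the thread, the conversion from the single
versions-annotated file to baseline+change files can be sketched as
below. This is only an illustration, not the tool used in the sandbox:
it assumes the "versions 78" suffix shown in the example quoted further
down (one character per platform version) and the "+"/"-" change
notation from Magnus's suggestion. The function name split_versions and
its parameters are hypothetical.

```python
def split_versions(lines, baseline="7", later="8"):
    """Split 'versions NN'-annotated lines into a baseline and a change list.

    Assumption: every line ends with ' versions <chars>', where each
    character names one platform version (as in the quoted example).
    """
    base, changes = [], []
    for line in lines:
        entry, sep, versions = line.rpartition(" versions ")
        if not sep:
            raise ValueError(f"no versions suffix: {line!r}")
        in_base = baseline in versions
        in_later = later in versions
        if in_base:
            # Everything present in the baseline version goes to the baseline file.
            base.append(entry)
        if in_base and not in_later:
            # Present in the baseline only: recorded as a removal.
            changes.append("-" + entry)
        elif in_later and not in_base:
            # Present in the later version only: recorded as an addition.
            changes.append("+" + entry)
    return base, changes


annotated = [
    "method name <clinit> descriptor ()V flags 8 versions 78",
    "method name <init> descriptor ()V flags 1 versions 8",
    "method name getHostId descriptor ()Ljava/lang/String; flags 1 versions 7",
]
base, changes = split_versions(annotated)
```

Note that an entry marked only for version 7 ends up in the baseline and
also as a "-" line in the change list, so the change file alone says what
differs between the two versions.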

On 27.2.2015 17:54, Jonathan Gibbons wrote:
> On 02/27/2015 08:35 AM, Magnus Ihse Bursie wrote:
>> On 2015-02-27 16:58, Jan Lahoda wrote:
>>> On 27.2.2015 14:48, Magnus Ihse Bursie wrote:
>>>> Hi Jan,
>>>>
>>>> On 2015-02-27 09:31, Jan Lahoda wrote:
>>>>> Hi,
>>>>>
>>>>> I have a question on JDK-8058150: "Compile for Specific Platform
>>>>> Version". To support compilation for older versions of the platform,
>>>>> javac will need some description of the APIs as they existed in the
>>>>> target platforms.
>>>>>
>>>>> For this, the current proposal is to use lib/ct.sym file (similar, but
>>>>> different, to the JDK 8 lib/ct.sym), containing classfiles of the
>>>>> older APIs. This file would be constructed at build time from a
>>>>> textual representation of the APIs stored in an OpenJDK repository
>>>>> (currently called ct.sym.txt).
>>>>>
>>>>> The current ct.sym.txt is a single file that contains APIs for all
>>>>> supported versions, reusing entries for multiple versions when needed.
>>>>> An alternative would be to use ct7.sym.txt for JDK 7 APIs, ct8.sym.txt
>>>>> for JDK 8 APIs, etc. Using a single file leads to a smaller total size
>>>>> (as it reuses entries where it can), but needs to be considerably
>>>>> changed when a new version is added or an obsolete version is removed.
>>>>>
>>>>> The size of the file is considerable: for the "ct.sym.txt" that
>>>>> represents APIs from OpenJDK 7 and 8, the size of the checked-out file
>>>>> in the working copy is (currently[2]) ~23MB, and inside the .hg
>>>>> directory, the file has ~1.7MB (Mercurial is apparently able to
>>>>> compress the ct.sym.txt file very well - but as all history is kept
>>>>> inside .hg directory, the size of the file inside the .hg directory
>>>>> increases when the ct.sym.txt is updated).
>>>>
>>>> In my opinion, the only size that matters here is in the .hg directory.
>>>> Whether the workspace takes 23 MB more or less is a non-issue when a
>>>> full forest clone is in the gigabyte range. But I don't think 1.7 MB
>>>> extra for the top-level repo is much of a problem either. Since the
>>>> top-level repo currently is so tiny, it will grow noticeably in
>>>> percentage. But compare this with hotspot/.hg at 67 MB or jdk/.hg at
>>>> 351 MB.
>>>
>>> Thanks. I also think the size inside the .hg folder is more important.
>>>
>>>>
>>>> If you have a proper text format so future edits can be made as trivial
>>>> diffs, then the Mercurial storage will not grow noticeably in the future
>>>> either.
>>>
>>> The file currently contains version numbers on most lines, so adding
>>> or removing support for a platform version means a significant
>>> update to the file. I am thinking of some ways to limit that, though.
>>
>> I peeked at your current solution. Is it possible to store the file as
>> a baseline version (JDK 7, or however far back you want to go) and a
>> set of deltas? Instead of like:
>> header extends java/lang/Object flags 31 classAnnotations
>> @Ljdk/Profile+Annotation;(value=I4) versions 78
>> method name <clinit> descriptor ()V flags 8 versions 78
>> method name <init> descriptor ()V flags 1 versions 8
>> method name getHostId descriptor ()Ljava/lang/String; flags 1 versions 7
>>
>> Store it like:
>> baseline-jdk7.sym.txt
>> header extends java/lang/Object flags 31 classAnnotations
>> @Ljdk/Profile+Annotation;(value=I4)
>> method name <clinit> descriptor ()V flags 8
>> method name getHostId descriptor ()Ljava/lang/String; flags 1
>>
>> jdk8.sym.txt
>> +method name <init> descriptor ()V flags 1
>> -method name getHostId descriptor ()Ljava/lang/String; flags 1
>>
>> ?
>>
>> In fact, if it is not too expensive to generate this kind of file from
>> the build, you could perhaps do it the other way round, and create a
>> "baseline" for the current JDK just built, and just store the diffs to
>> previous versions. (Although that might be a maintenance burden to
>> make sure that these reverse diffs are up to date).
>>
>>
>>>
>>>>
>>>>> Another alternative would be to partition the file into several
>>>>> smaller files. That would be easier to grasp, but if the files were
>>>>> too small, the compression would be worse (leading to bigger
>>>>> repositories).
>>>> What is the actual difference? Having too large files can be burdensome
>>>> for other tools as well, e.g. if you open one (mistakenly or not) in a
>>>> text editor. I would tend to prefer several smaller files over one
>>>> huge one.
>>>
>>> That depends on how the file is split up. Originally, I was thinking
>>> of having a file per package, but that produces (too) many small
>>> files: 797 files, the biggest one being ~650kB (in the working
>>> copy), consuming ~5.9MB in total inside .hg.
>> Can you match it to modules? I realize older JDKs have no module
>> concept, but that sounds like it could be a reasonable number of
>> reasonably sized chunks.
>>
>> /Magnus
>>
>>>
>>> There was a proposal to have a single ct.sym file per jar file
>>> on the bootclasspath (in the target platform), but this produces
>>> a ~20MB (in the working copy) file for ct.sym, while the next biggest
>>> file is for nashorn.jar: ~1MB (in the working copy). This consumes
>>> ~1.7MB inside the .hg directory.
>>>
>>> I tried to split the big ct.sym.txt artificially at approximately
>>> 1MB, this leads to 23 files, and still consumes ~1.7MB inside the .hg
>>> directory.
>>>
>>>>> Currently, the proposal is to place the ct.sym.txt file into the
>>>>> top-level repository. A prototype of this feature is currently in the
>>>>> jdk9/sandbox forest, on branch JDK-8058150-branch. The current
>>>>> ct.sym.txt file is
>>>>> <top-level-repository>/make/data/symbols/ct.sym.txt.
>>>> Sounds like a good place to put such a file. I assume it will be
>>>> processed during building?
>>>
>>> Yes, the file is processed during the build and the ct.sym file is
>>> produced into the "lib" directory. In my current prototype, I've
>>> tweaked the make/Images.gmk to do that:
>>> http://hg.openjdk.java.net/jdk9/sandbox/file/087692cc2663/make/Images.gmk
>>>
>>> (in the "ct.sym" section).
>>>
>>> Thanks for the comments,
>>>     Jan
>>>
>>>>
>>>> /Magnus
>>
>
>
> Magnus,
>
> The simplified generalization of your suggestion would be to simply use
> patch/diff files.
>
> We could have a baseline of (say)
>
> jdk7.sym.txt
>
> then have a patch file
>
> jdk8.sym.patch
>
> which can be applied (in the build) to jdk7.sym.txt to get jdk8.sym.txt
>
> Since the .txt and .patch files reflect existing releases, they would
> almost never change, so the .hg files would be constant.
>
> Maybe once in a while, we would prune the history, remove older version
> info, and add a new baseline.
>
> -- Jon
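
The build-time step described above (apply a change file to a baseline
to obtain the next version's API description) might look roughly like
the following. This is a minimal sketch under stated assumptions: it
uses the simplified "+"/"-" line notation from earlier in the thread
rather than real unified diffs applied with a patch tool, and the
function name apply_changes is hypothetical.

```python
def apply_changes(baseline, changes):
    """Return the entries of the later version, given baseline entries
    and a list of '+entry' / '-entry' change lines (assumed notation)."""
    result = list(baseline)  # keep the caller's baseline untouched
    for change in changes:
        op, entry = change[:1], change[1:]
        if op == "+":
            if entry in result:
                raise ValueError(f"entry already present: {entry!r}")
            result.append(entry)
        elif op == "-":
            if entry not in result:
                raise ValueError(f"entry not in baseline: {entry!r}")
            result.remove(entry)
        else:
            raise ValueError(f"unknown change line: {change!r}")
    return result


jdk7 = [
    "method name <clinit> descriptor ()V flags 8",
    "method name getHostId descriptor ()Ljava/lang/String; flags 1",
]
jdk8_changes = [
    "+method name <init> descriptor ()V flags 1",
    "-method name getHostId descriptor ()Ljava/lang/String; flags 1",
]
jdk8 = apply_changes(jdk7, jdk8_changes)
```

Rejecting a "-" line whose entry is missing from the baseline is a design
choice in this sketch: since the baseline and change files describe
released versions and should almost never change, a mismatch most likely
means one of the files was edited incorrectly.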


More information about the compiler-dev mailing list