Largest files in the JDK repo
Jorn Vernee
jorn.vernee at oracle.com
Thu Oct 24 18:47:53 UTC 2024
WRT the two biggest files:
6.8M ./test/jdk/java/foreign/libTestUpcallStack.c
3.5M ./test/jdk/java/foreign/libTestDowncallStack.c
These are mechanically generated C libraries featuring a lot of different
function shapes, for testing of FFM downcalls. The Java code that is
used to generate these C files could theoretically run as part of the
test as well, but the problem is that we would then need to compile the
generated sources into a native library.
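
To make "mechanically generated function shapes" concrete, here is a
minimal sketch of what such a generator could look like. It is not the
generator that produced the files above: the class name ShapeGenerator,
the f0, f1, ... naming scheme, the set of parameter types, and the
"return the first argument" body are all assumptions chosen for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: emits one C function per combination of parameter
// types. This is NOT the generator used in the JDK repo.
final class ShapeGenerator {
    // Illustrative set of C parameter types (an assumption).
    static final List<String> TYPES = List.of("int", "float", "double", "void*");

    // Returns every parameter list of exactly `arity` types drawn from TYPES.
    static List<List<String>> shapes(int arity) {
        List<List<String>> result = new ArrayList<>();
        result.add(List.of());
        for (int i = 0; i < arity; i++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String type : TYPES) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(type);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    // Renders a C function that returns its first argument unchanged.
    // Expects at least one parameter.
    static String render(int index, List<String> params) {
        StringBuilder sb = new StringBuilder();
        sb.append(params.get(0)).append(" f").append(index).append('(');
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(params.get(i)).append(" p").append(i);
        }
        return sb.append(") { return p0; }\n").toString();
    }

    // Generates C source text covering all shapes of the given arity (>= 1).
    static String generate(int arity) {
        StringBuilder sb = new StringBuilder();
        int index = 0;
        for (List<String> shape : shapes(arity)) {
            sb.append(render(index++, shape));
        }
        return sb.toString();
    }
}
```

In this sketch the number of functions is the number of types raised to
the arity, so the output grows quickly as more types and longer
parameter lists are covered.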
Currently the JDK build system will find and build all native libraries
needed for tests before any of the tests run, but maybe it's possible to
create a way for a test to request that a native library be built on
demand. Then we wouldn't need to pre-generate these files and include
them in the repo, and could instead generate + compile them when the
test runs. (This might also help cut down on the build time of the test
image, since you'd only need to compile test libraries for the tests
that actually run.)
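
As a rough sketch of the "build on demand" idea, a test-side helper
could look something like the following. This is a hypothetical helper,
not an existing jtreg or JDK build feature. The class name
OnDemandNativeBuild is made up, and the sketch assumes a gcc/clang-style
compiler driver called "cc" on the PATH, so it would not work as-is on
Windows.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration only; no such facility exists yet.
final class OnDemandNativeBuild {
    // Builds the command line that turns one C source file into a shared
    // library. Assumes a gcc/clang-style driver named "cc" (an assumption).
    static List<String> compileCommand(Path source, Path library) {
        return List.of("cc", "-shared", "-fPIC",
                       "-o", library.toString(), source.toString());
    }

    // Runs the compiler and fails if it exits with a non-zero status.
    static void compile(Path source, Path library)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(compileCommand(source, library))
                .inheritIO()
                .start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("compiler exited with status " + exit);
        }
    }
}
```

A test would call compile(...) on the freshly generated source before
loading the resulting library. A real solution would presumably reuse
the toolchain and flags that the build system already configures rather
than hard-coding them like this.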
Jorn
On 24-10-2024 13:04, Magnus Ihse Bursie wrote:
>
> I got intrigued by how https://bugs.openjdk.org/browse/JDK-8339507
> could integrate a 7 MB file without anybody noticing, so I
> started wondering how many other huge text files there are in our repo.
> (We are much more restrictive with binary files, even if they are
> small...)
>
> So I compiled a top 100 list, which basically ended up being all files
> larger than 400 kB. In total, these 100 files account for ca. 82 MB of
> data. I'm not saying that any of these files are wrong per se, but
> maybe some of the files on this list could provide a bit of food for
> thought. Further down is the complete top list, but it is a bit hard
> to get a grip on, so I sorted and grouped the result, since the large
> files are not randomly sprinkled throughout the code base. The grouped
> list below does not contain test files. The huge test files are more
> numerous, but there are also (imho) more compelling reasons in general
> to allow for bigger files in testing. With that said, even some of the
> test files seem a bit excessive. (And one cannot help but wonder what
> kind of file
> src/java.base/share/data/unicodedata/NormalizationTest.txt really is.)
>
> Character sets and localization:
> * make/data/charsetmapping
> * make/data/cldr
> * src/java.base/share/data/lsrdata/
> * src/java.base/share/data/unicodedata
> * src/java.base/share/classes/java/lang/Character.java
> * src/java.base/share/classes/sun/nio/cs/GB18030.java
> * src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM33722.java
> * src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM964.java.template
> 3rd party source:
> * src/jdk.incubator.vector/*/native/libjsvml/*.S
> * src/java.base/share/native/libzip/zlib/crc32.h
> * src/java.desktop/share/native/common/java2d/opengl/J2D_GL/glext.h
> Symbols from previous JDKs:
> * src/jdk.compiler/share/data/symbols
> Huge HotSpot files:
> * src/hotspot/cpu/*/*.ad
> * src/hotspot/cpu/x86/assembler_x86.cpp
> * src/hotspot/share/prims/jvmti.xml
> Other:
> * src/java.desktop/share/classes/javax/swing/plaf/nimbus/skin.laf
> * src/java.base/share/classes/java/lang/invoke/MethodHandles.java
> * src/java.sql.rowset/share/classes/com/sun/rowset/CachedRowSetImpl.java
> And a binary file:
> * src/demo/share/java2d/J2DBench/resources/cmm_images/img_icc_large.jpg
>
> And here is the complete top list:
>
> 6.8M ./test/jdk/java/foreign/libTestUpcallStack.c
> 3.5M ./test/jdk/java/foreign/libTestDowncallStack.c
> 2.7M ./test/jdk/com/sun/net/httpserver/docs/test1/largefile.txt
> 2.6M ./src/java.base/share/data/unicodedata/NormalizationTest.txt
> 2.3M ./test/jdk/sun/nio/cs/EUC_TW_OLD.java
> 2.1M ./src/jdk.compiler/share/data/symbols/java.desktop-8.sym.txt
> 2.0M ./src/java.desktop/share/classes/javax/swing/plaf/nimbus/skin.laf
> 2.0M ./test/jdk/java/text/Normalizer/NormalizationTest-3.2.0.Corrigendum4.txt
> 2.0M ./test/jdk/java/text/Normalizer/NormalizationTest-3.2.0.txt
> 1.9M ./src/java.base/share/data/unicodedata/UnicodeData.txt
> 1.6M ./test/hotspot/jtreg/gc/TestBigObj.java
> 1.5M ./test/jdk/java/foreign/libTestUpcall.c
> 1.4M ./src/jdk.compiler/share/data/symbols/java.base-8.sym.txt
> 1.2M ./test/jdk/java/lang/String/concat/ImplicitStringConcatShapes.java
> 1.1M ./src/java.base/share/data/unicodedata/DerivedCoreProperties.txt
> 952K ./test/hotspot/jtreg/compiler/c2/stemmer/words
> 941K ./src/jdk.compiler/share/data/symbols/java.base-M.sym.txt
> 928K ./make/data/charsetmapping/EUC_TW.map
> 927K ./test/hotspot/jtreg/vmTestbase/vm/mlvm/mixed/stress/java/findDeadlock/INDIFY_Test.java
> 912K ./make/data/cldr/common/supplemental/likelySubtags.xml
> 898K ./make/data/charsetmapping/MS936.map
> 865K ./src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM964.java.template
> 857K ./test/jdk/java/foreign/libTestDowncall.c
> 843K ./test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java
> 830K ./src/java.desktop/share/native/common/java2d/opengl/J2D_GL/glext.h
> 794K ./make/data/cldr/common/main/ru.xml
> 774K ./test/jdk/sun/nio/cs/mapping/GB18030_2000.b2c
> 774K ./test/jdk/sun/nio/cs/mapping/GB18030.b2c
> 767K ./test/jdk/jdk/internal/math/ToDecimal/java.base/jdk/internal/math/DoubleToDecimalChecker.java
> 752K ./make/data/cldr/common/main/uk.xml
> 742K ./make/data/charsetmapping/Johab.map
> 741K ./test/jdk/sun/nio/cs/mapping/Johab.b2c
> 739K ./src/java.base/share/classes/sun/nio/cs/GB18030.java
> 733K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_tan_linux_x86.S
> 731K ./make/data/charsetmapping/MS950.map
> 727K ./test/jdk/sun/nio/cs/mapping/MS950.b2c
> 709K ./src/java.base/share/data/lsrdata/language-subtag-registry.txt
> 698K ./make/data/charsetmapping/MS949.map
> 695K ./test/jdk/sun/nio/cs/mapping/MS949.b2c
> 655K ./src/hotspot/cpu/x86/assembler_x86.cpp
> 647K ./test/jdk/java/lang/instrument/BigClass.java
> 634K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_sin_linux_x86.S
> 628K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_cos_linux_x86.S
> 616K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_tan_windows_x86.S
> 601K ./src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM33722.java
> 597K ./src/hotspot/share/prims/jvmti.xml
> 597K ./test/jdk/sun/security/ec/SigGen-1.txt
> 593K ./make/data/cldr/common/main/lt.xml
> 582K ./make/data/cldr/common/main/cs.xml
> 579K ./src/java.base/share/native/libzip/zlib/crc32.h
> 577K ./make/data/cldr/common/main/sk.xml
> 577K ./src/jdk.compiler/share/data/symbols/java.desktop-9.sym.txt
> 572K ./test/jdk/javax/swing/text/html/parser/Parser/8078268/slowparse.html
> 567K ./test/jdk/sun/nio/cs/OLD/IBM933_OLD.java
> 539K ./test/jdk/sun/nio/cs/mapping/untested/gb18030_1.b2c
> 536K ./test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java
> 534K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_sin_windows_x86.S
> 532K ./make/data/cldr/common/main/ff_Adlm.xml
> 531K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_cos_windows_x86.S
> 526K ./test/jdk/sun/nio/cs/mapping/EUC_TW.b2c
> 524K ./src/jdk.compiler/share/data/symbols/java.desktop-B.sym.txt
> 523K ./make/data/cldr/common/main/pl.xml
> 520K ./test/hotspot/jtreg/vmTestbase/vm/mlvm/indy/stress/java/loopsAndThreads/INDIFY_Test.java
> 518K ./make/data/cldr/common/main/sl.xml
> 510K ./test/jdk/sun/nio/cs/OLD/IBM950_OLD.java
> 509K ./make/data/cldr/common/main/mr.xml
> 507K ./make/data/cldr/common/main/kn.xml
> 505K ./test/jdk/sun/nio/cs/OLD/IBM948_OLD.java
> 504K ./make/data/cldr/common/main/sr.xml
> 503K ./test/jdk/sun/nio/cs/OLD/IBM937_OLD.java
> 502K ./test/jdk/sun/net/www/protocol/jar/foo1.jar
> 501K ./make/data/cldr/common/main/ta.xml
> 496K ./test/jdk/sun/nio/cs/OLD/Johab_OLD.java
> 490K ./test/jdk/sun/nio/cs/OLD/MS949_OLD.java
> 489K ./test/hotspot/jtreg/vmTestbase/vm/jit/LongTransitions/LTTest.java
> 485K ./test/hotspot/jtreg/vmTestbase/jit/FloatingPoint/FPCompare/TestFPBinop/TestFPBinop.gold
> 485K ./src/hotspot/cpu/aarch64/aarch64.ad
> 478K ./src/hotspot/cpu/ppc/ppc.ad
> 467K ./make/data/cldr/common/main/gd.xml
> 466K ./src/java.base/share/classes/java/lang/Character.java
> 453K ./make/data/cldr/common/main/ar.xml
> 452K ./test/jdk/sun/nio/cs/OLD/IBM949_OLD.java
> 446K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_pow_linux_x86.S
> 445K ./make/data/cldr/common/main/cy.xml
> 443K ./make/data/cldr/common/main/ml.xml
> 442K ./make/data/cldr/common/main/br.xml
> 442K ./test/jdk/sun/nio/cs/OLD/MS950_OLD.java
> 442K ./make/data/cldr/common/main/hr.xml
> 441K ./src/hotspot/cpu/x86/x86_32.ad
> 438K ./src/java.base/share/classes/java/lang/invoke/MethodHandles.java
> 436K ./test/jaxp/javax/xml/jaxp/unittest/transform/msgAttach.xml
> 433K ./make/data/cldr/common/main/el.xml
> 432K ./src/java.sql.rowset/share/classes/com/sun/rowset/CachedRowSetImpl.java
> 429K ./make/data/cldr/common/main/lv.xml
> 428K ./make/data/cldr/common/main/fi.xml
> 427K ./test/jdk/sun/nio/cs/OLD/GBK_OLD.java
> 421K ./src/demo/share/java2d/J2DBench/resources/cmm_images/img_icc_large.jpg
> 419K ./make/data/cldr/common/main/en.xml
> 418K ./src/hotspot/cpu/x86/x86.ad
> 416K ./make/data/cldr/common/main/sr_Latn.xml
>
> The list was compiled by running:
>
> find . -path ./.git -prune -o -type f -printf '%s %p\n' | sort -nr |
> numfmt --field=1 --to=iec | head -n 100
>
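
For anyone who wants to reproduce the list on a platform without GNU
find and numfmt, the same idea can be sketched in Java. This is only an
approximation of the pipeline quoted above, not something from the
original mail: the class name LargestFiles is made up, sizes are
returned as raw byte counts rather than formatted like numfmt does, and
the walk still descends into .git and filters it out afterwards instead
of pruning it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: lists the largest regular files under a directory.
final class LargestFiles {
    // Returns up to `limit` (size in bytes, path) pairs, largest first,
    // skipping everything under root/.git.
    static List<Map.Entry<Long, Path>> largest(Path root, int limit)
            throws IOException {
        Path git = root.resolve(".git");
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(p -> !p.startsWith(git))
                    // Like "find -type f": symbolic links are not followed.
                    .filter(p -> Files.isRegularFile(p, LinkOption.NOFOLLOW_LINKS))
                    .map(p -> Map.entry(sizeOf(p), p))
                    .sorted(Map.Entry.<Long, Path>comparingByKey(Comparator.reverseOrder()))
                    .limit(limit)
                    .collect(Collectors.toList());
        }
    }

    // Files.size throws a checked exception, so wrap it for use in a stream.
    private static long sizeOf(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling LargestFiles.largest(Path.of("."), 100) from the repo root would
give the raw data for a list like the one above.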