Largest files in the JDK repo

Magnus Ihse Bursie magnus.ihse.bursie at oracle.com
Thu Oct 24 11:04:13 UTC 2024


I got intrigued by how https://bugs.openjdk.org/browse/JDK-8339507 could 
integrate a 7 MB file without anybody noticing, so I started 
wondering how many other huge text files there are in our repo. (We are 
much more restrictive with binary files, even if they are small...)

So I compiled a top 100 list, which basically ended up being all files 
larger than 400 kB. In total, these 100 files account for ca 82 MB of 
data. I'm not saying that any of these files are wrong per se, but maybe 
some of the files on this list could provide a bit of food for thought. 
Further down is the complete top list, but it is a bit hard to get a 
grip on. I therefore sorted and grouped the result, since the large 
files are not randomly sprinkled throughout the code base. The grouped 
list below does not contain test files. The huge test files are more 
numerous, but there are also (imho) more compelling reasons in general 
to allow for bigger files in testing. With that said, even some of the 
test files seem a bit excessive. (And one cannot help but wonder what 
kind of file 
src/java.base/share/data/unicodedata/NormalizationTest.txt really is.)

Character sets and localization:
* make/data/charsetmapping
* make/data/cldr
* src/java.base/share/data/lsrdata/
* src/java.base/share/data/unicodedata
* src/java.base/share/classes/java/lang/Character.java
* src/java.base/share/classes/sun/nio/cs/GB18030.java
* src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM33722.java
* src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM964.java.template
3rd party source:
* src/jdk.incubator.vector/*/native/libjsvml/*.S
* src/java.base/share/native/libzip/zlib/crc32.h
* src/java.desktop/share/native/common/java2d/opengl/J2D_GL/glext.h
Symbols from previous JDKs:
* src/jdk.compiler/share/data/symbols
Huge HotSpot files:
* src/hotspot/cpu/*/*.ad
* src/hotspot/cpu/x86/assembler_x86.cpp
* src/hotspot/share/prims/jvmti.xml
Other:
* src/java.desktop/share/classes/javax/swing/plaf/nimbus/skin.laf
* src/java.base/share/classes/java/lang/invoke/MethodHandles.java
* src/java.sql.rowset/share/classes/com/sun/rowset/CachedRowSetImpl.java
And a binary file:
* src/demo/share/java2d/J2DBench/resources/cmm_images/img_icc_large.jpg

And here is the complete top list:

6.8M ./test/jdk/java/foreign/libTestUpcallStack.c
3.5M ./test/jdk/java/foreign/libTestDowncallStack.c
2.7M ./test/jdk/com/sun/net/httpserver/docs/test1/largefile.txt
2.6M ./src/java.base/share/data/unicodedata/NormalizationTest.txt
2.3M ./test/jdk/sun/nio/cs/EUC_TW_OLD.java
2.1M ./src/jdk.compiler/share/data/symbols/java.desktop-8.sym.txt
2.0M ./src/java.desktop/share/classes/javax/swing/plaf/nimbus/skin.laf
2.0M ./test/jdk/java/text/Normalizer/NormalizationTest-3.2.0.Corrigendum4.txt
2.0M ./test/jdk/java/text/Normalizer/NormalizationTest-3.2.0.txt
1.9M ./src/java.base/share/data/unicodedata/UnicodeData.txt
1.6M ./test/hotspot/jtreg/gc/TestBigObj.java
1.5M ./test/jdk/java/foreign/libTestUpcall.c
1.4M ./src/jdk.compiler/share/data/symbols/java.base-8.sym.txt
1.2M ./test/jdk/java/lang/String/concat/ImplicitStringConcatShapes.java
1.1M ./src/java.base/share/data/unicodedata/DerivedCoreProperties.txt
952K ./test/hotspot/jtreg/compiler/c2/stemmer/words
941K ./src/jdk.compiler/share/data/symbols/java.base-M.sym.txt
928K ./make/data/charsetmapping/EUC_TW.map
927K ./test/hotspot/jtreg/vmTestbase/vm/mlvm/mixed/stress/java/findDeadlock/INDIFY_Test.java
912K ./make/data/cldr/common/supplemental/likelySubtags.xml
898K ./make/data/charsetmapping/MS936.map
865K ./src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM964.java.template
857K ./test/jdk/java/foreign/libTestDowncall.c
843K ./test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java
830K ./src/java.desktop/share/native/common/java2d/opengl/J2D_GL/glext.h
794K ./make/data/cldr/common/main/ru.xml
774K ./test/jdk/sun/nio/cs/mapping/GB18030_2000.b2c
774K ./test/jdk/sun/nio/cs/mapping/GB18030.b2c
767K ./test/jdk/jdk/internal/math/ToDecimal/java.base/jdk/internal/math/DoubleToDecimalChecker.java
752K ./make/data/cldr/common/main/uk.xml
742K ./make/data/charsetmapping/Johab.map
741K ./test/jdk/sun/nio/cs/mapping/Johab.b2c
739K ./src/java.base/share/classes/sun/nio/cs/GB18030.java
733K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_tan_linux_x86.S
731K ./make/data/charsetmapping/MS950.map
727K ./test/jdk/sun/nio/cs/mapping/MS950.b2c
709K ./src/java.base/share/data/lsrdata/language-subtag-registry.txt
698K ./make/data/charsetmapping/MS949.map
695K ./test/jdk/sun/nio/cs/mapping/MS949.b2c
655K ./src/hotspot/cpu/x86/assembler_x86.cpp
647K ./test/jdk/java/lang/instrument/BigClass.java
634K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_sin_linux_x86.S
628K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_cos_linux_x86.S
616K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_tan_windows_x86.S
601K ./src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM33722.java
597K ./src/hotspot/share/prims/jvmti.xml
597K ./test/jdk/sun/security/ec/SigGen-1.txt
593K ./make/data/cldr/common/main/lt.xml
582K ./make/data/cldr/common/main/cs.xml
579K ./src/java.base/share/native/libzip/zlib/crc32.h
577K ./make/data/cldr/common/main/sk.xml
577K ./src/jdk.compiler/share/data/symbols/java.desktop-9.sym.txt
572K ./test/jdk/javax/swing/text/html/parser/Parser/8078268/slowparse.html
567K ./test/jdk/sun/nio/cs/OLD/IBM933_OLD.java
539K ./test/jdk/sun/nio/cs/mapping/untested/gb18030_1.b2c
536K ./test/micro/org/openjdk/bench/vm/gc/RawAllocationRate.java
534K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_sin_windows_x86.S
532K ./make/data/cldr/common/main/ff_Adlm.xml
531K ./src/jdk.incubator.vector/windows/native/libjsvml/jsvml_d_cos_windows_x86.S
526K ./test/jdk/sun/nio/cs/mapping/EUC_TW.b2c
524K ./src/jdk.compiler/share/data/symbols/java.desktop-B.sym.txt
523K ./make/data/cldr/common/main/pl.xml
520K ./test/hotspot/jtreg/vmTestbase/vm/mlvm/indy/stress/java/loopsAndThreads/INDIFY_Test.java
518K ./make/data/cldr/common/main/sl.xml
510K ./test/jdk/sun/nio/cs/OLD/IBM950_OLD.java
509K ./make/data/cldr/common/main/mr.xml
507K ./make/data/cldr/common/main/kn.xml
505K ./test/jdk/sun/nio/cs/OLD/IBM948_OLD.java
504K ./make/data/cldr/common/main/sr.xml
503K ./test/jdk/sun/nio/cs/OLD/IBM937_OLD.java
502K ./test/jdk/sun/net/www/protocol/jar/foo1.jar
501K ./make/data/cldr/common/main/ta.xml
496K ./test/jdk/sun/nio/cs/OLD/Johab_OLD.java
490K ./test/jdk/sun/nio/cs/OLD/MS949_OLD.java
489K ./test/hotspot/jtreg/vmTestbase/vm/jit/LongTransitions/LTTest.java
485K ./test/hotspot/jtreg/vmTestbase/jit/FloatingPoint/FPCompare/TestFPBinop/TestFPBinop.gold
485K ./src/hotspot/cpu/aarch64/aarch64.ad
478K ./src/hotspot/cpu/ppc/ppc.ad
467K ./make/data/cldr/common/main/gd.xml
466K ./src/java.base/share/classes/java/lang/Character.java
453K ./make/data/cldr/common/main/ar.xml
452K ./test/jdk/sun/nio/cs/OLD/IBM949_OLD.java
446K ./src/jdk.incubator.vector/linux/native/libjsvml/jsvml_d_pow_linux_x86.S
445K ./make/data/cldr/common/main/cy.xml
443K ./make/data/cldr/common/main/ml.xml
442K ./make/data/cldr/common/main/br.xml
442K ./test/jdk/sun/nio/cs/OLD/MS950_OLD.java
442K ./make/data/cldr/common/main/hr.xml
441K ./src/hotspot/cpu/x86/x86_32.ad
438K ./src/java.base/share/classes/java/lang/invoke/MethodHandles.java
436K ./test/jaxp/javax/xml/jaxp/unittest/transform/msgAttach.xml
433K ./make/data/cldr/common/main/el.xml
432K ./src/java.sql.rowset/share/classes/com/sun/rowset/CachedRowSetImpl.java
429K ./make/data/cldr/common/main/lv.xml
428K ./make/data/cldr/common/main/fi.xml
427K ./test/jdk/sun/nio/cs/OLD/GBK_OLD.java
421K ./src/demo/share/java2d/J2DBench/resources/cmm_images/img_icc_large.jpg
419K ./make/data/cldr/common/main/en.xml
418K ./src/hotspot/cpu/x86/x86.ad
416K ./make/data/cldr/common/main/sr_Latn.xml

The list was compiled by running:

find . -path ./.git -prune -o -type f -printf '%s %p\n' | sort -nr | 
numfmt --field=1 --to=iec | head -n 100
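
For anyone who wants to recompute the total size of the top 100, or get 
a ranking without the test files, here is a minimal sketch that splits 
the pipeline above into small shell functions. It assumes GNU find 
(for -printf), sort, awk and grep. The function names files_by_size, 
sum_bytes and without_tests are made up for illustration and are not 
part of any JDK tooling.

```shell
# Print "<bytes> <path>" for every regular file below the current
# directory, largest first, skipping the .git directory.
# Same find/sort steps as the command above; needs GNU find for -printf.
files_by_size() {
    find . -path ./.git -prune -o -type f -printf '%s %p\n' | sort -nr
}

# Read "<bytes> <path>" lines and print the sum of the byte counts.
sum_bytes() {
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Read "<bytes> <path>" lines and drop everything located under ./test.
without_tests() {
    grep -v '^[0-9]* \./test/'
}
```

Run from the root of a checkout, they could be combined like this:

    files_by_size | head -n 100 | sum_bytes | numfmt --to=iec
    files_by_size | without_tests | head -n 100 | numfmt --field=1 --to=iec

The filtering is done before head, so the second command gives the 100 
largest non-test files rather than the non-test subset of the top 100.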