RFR: 8367585: Move Utf8Entry length validation earlier
Adam Sotona
asotona at openjdk.org
Wed Oct 1 15:03:16 UTC 2025
On Mon, 15 Sep 2025 05:43:43 GMT, Chen Liang <liach at openjdk.org> wrote:
> John Rose suggests in https://github.com/openjdk/jdk/pull/26802#issuecomment-3201402304 that ClassFile API should validate Utf8Entry length eagerly upon construction. Currently we validate upon writing to bytes, which avoids validation overhead. However, given that most class file utf8 data are shorter than 1/3 of the max length, which is always an encodable length, the performance impact should be low.
>
> Preventing the creation of unrepresentable UTF8 entries can prevent passing such invalid instances around, making such problems easier to debug than a failure at building.
>
> Tier 1-3 seems clear. The performance impact to jdk.classfile.Write or any of the regularly run transformation benchmarks seems neutral, less than 5% perturbations.
>
> I will update docs to reflect this change, given how widespread this is across JDK - it seems the only exempt classes are Signature, ClassSignature, and MethodSignature.
I still see no benefits, just drawbacks of this PR.
src/java.base/share/classes/java/lang/classfile/ClassSignature.java line 42:
> 40: * ClassSignature} can represent generic signatures that cannot be represented in
> 41: * classfile. There is no classfile representation checks for string or nominal
> 42: * descriptor arguments passed to static factory methods in this class.
I think this information is misleading. I've read it multiple times and am still not sure what the key takeaway is for the user.
src/java.base/share/classes/java/lang/classfile/ClassSignature.java line 113:
> 111: /**
> 112: * Parses a raw class signature string into a {@linkplain Signature}.
> 113: * The string may be unrepresentable by a {@link Utf8Entry}.
This also seems misleading to me. What does it mean for the user?
src/java.base/share/classes/java/lang/classfile/attribute/SignatureAttribute.java line 111:
> 109: * @param classSignature the class signature
> 110: * @throws IllegalArgumentException if the raw signature string is not
> 111: * representable by a {@link Utf8Entry}
I think adding this paragraph to potentially every place where a String is converted to a Utf8Entry pollutes the javadoc. A general paragraph about throwing IAE from the Class-File API should cover those situations.
We are not shadowing the specs here; we just need to say that an IAE may happen if the class file cannot be constructed for any reason.
src/java.base/share/classes/java/lang/constant/package-info.java line 105:
> 103: * may result in errors. Consumers of nominal descriptors, such as bytecode
> 104: * reading and writing APIs, should define the behaviors when such descriptors
> 105: * are passed.
This seems to me unrelated to the j.l.constant package and very confusing.
src/java.base/share/classes/jdk/internal/classfile/impl/AbstractPoolEntry.java line 160:
> 158: if (!ModifiedUtf.isValidLengthInConstantPool(s)) {
> 159: throw new IllegalArgumentException("utf8 length out of range of u2: " + ModifiedUtf.utfLen(s));
> 160: }
There might be multiple Utf8EntryImpl instances created and later reduced into a single entry to write, so the check performed here might be redundant.
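For context on what the check above costs, here is a minimal sketch of a u2 length test with the "shorter than 1/3 of the max length" fast path mentioned in the PR description. The class and method names (Utf8LengthSketch, modifiedUtfLength, fitsInU2, requireEncodable) are hypothetical and for illustration only; this is not the code of the internal ModifiedUtf class, whose implementation is not shown in this thread.

```java
// Hypothetical sketch; not the JDK's internal ModifiedUtf implementation.
final class Utf8LengthSketch {
    // Largest value that the u2 length field of a CONSTANT_Utf8_info can hold.
    static final int MAX_U2 = 65535;
    // Each UTF-16 char needs at most 3 bytes in modified UTF-8,
    // so any string with at most this many chars always fits.
    static final int ALWAYS_FITS = MAX_U2 / 3;

    // Counts the bytes the string occupies in modified UTF-8.
    static long modifiedUtfLength(String s) {
        long len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                len += 1;
            } else if (c <= 0x07FF) {
                len += 2; // includes U+0000, which is encoded as two bytes
            } else {
                len += 3; // includes each half of a surrogate pair
            }
        }
        return len;
    }

    // Fast path first: short strings need no per-char scan.
    static boolean fitsInU2(String s) {
        if (s.length() <= ALWAYS_FITS) {
            return true;
        }
        if (s.length() > MAX_U2) {
            return false; // every char takes at least one byte
        }
        return modifiedUtfLength(s) <= MAX_U2;
    }

    // Eager validation, as it might be done at entry construction.
    static String requireEncodable(String s) {
        if (!fitsInU2(s)) {
            throw new IllegalArgumentException(
                    "utf8 length out of range of u2: " + modifiedUtfLength(s));
        }
        return s;
    }
}
```

In this sketch only strings longer than 21845 chars pay for a full scan, so a repeated check on typical short names is a single length comparison per instance created.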
-------------
PR Review: https://git.openjdk.org/jdk/pull/27281#pullrequestreview-3289656770
PR Review Comment: https://git.openjdk.org/jdk/pull/27281#discussion_r2394915306
PR Review Comment: https://git.openjdk.org/jdk/pull/27281#discussion_r2394916642
PR Review Comment: https://git.openjdk.org/jdk/pull/27281#discussion_r2394920101
PR Review Comment: https://git.openjdk.org/jdk/pull/27281#discussion_r2394922383
PR Review Comment: https://git.openjdk.org/jdk/pull/27281#discussion_r2394924423