Class files in ByteBuffer

Mon Mar 10 18:46:46 UTC 2025

I think the choice between ByteBuffer and byte[] is a tradeoff: the JIT compiler has a lot of trouble with ByteBuffer due to polymorphism, so this might actually turn out to be a regression. (I think the ClassFile API previously used ByteBuffer for stack map generation; it has since been eliminated to improve performance.) Also, the ClassFile API depends on some sweet properties of byte[], such as using String intrinsics on byte arrays to quickly process ASCII-compatible UTF8 entries.
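To illustrate why a plain byte[] is convenient for UTF8 entries, here is a minimal sketch of an ASCII fast path. It is not the ClassFile API's actual code: the class name Utf8Sketch and the method decode are hypothetical, and the slow path simply borrows DataInputStream.readUTF (which decodes modified UTF-8) rather than anything the JDK does internally.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public final class Utf8Sketch {
    private Utf8Sketch() {}

    /** Decodes a CONSTANT_Utf8 payload stored in bytes[offset, offset + length). */
    public static String decode(byte[] bytes, int offset, int length) {
        boolean ascii = true;
        for (int i = offset; i < offset + length; i++) {
            if (bytes[i] < 0) {          // high bit set: not a single-byte character
                ascii = false;
                break;
            }
        }
        if (ascii) {
            // Every byte maps directly to one char, so a Latin-1 decode is enough.
            return new String(bytes, offset, length, StandardCharsets.ISO_8859_1);
        }
        // Slow path: prepend the u2 length that readUTF expects, then let it
        // handle the modified UTF-8 multi-byte sequences.
        byte[] framed = new byte[length + 2];
        framed[0] = (byte) (length >>> 8);
        framed[1] = (byte) length;
        System.arraycopy(bytes, offset, framed, 2, length);
        try {
            return new DataInputStream(new ByteArrayInputStream(framed)).readUTF();
        } catch (IOException e) {
            throw new IllegalArgumentException("malformed Utf8 entry", e);
        }
    }
}
```

With a ByteBuffer as backing, the scan and the String constructor above would either need a heap array behind the buffer or a per-entry copy, which is the part that does not port over directly.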

Luckily, access to the array is nicely encapsulated in ClassReader for the most part, and the Utf8 entry is the only place where it escapes. You should be able to make a prototype of reading from ByteBuffer easily; your "using byte buffer as backing" approach might be accepted if you can prove there is no regression in the case of reading from plain byte arrays.

Regards, Chen
________________________________
From: classfile-api-dev <classfile-api-dev-retn at openjdk.org> on behalf of David Lloyd <david.lloyd at redhat.com>
Sent: Monday, March 10, 2025 12:38 PM
To: classfile-api-dev at openjdk.org <classfile-api-dev at openjdk.org>
Subject: Class files in ByteBuffer

When defining a class in the JDK, one may either use a byte array or a byte buffer to hold the contents of the class. The latter is useful when (for example) a JAR file containing uncompressed classes is mapped into memory. Thus, some class loaders depend on this form of the API for class definition.

If I were to supplement such a class loader with a class transformation step based on the class file API, I would have to copy the bytes of each class onto the heap as a byte[] before I could begin parsing it. This is potentially expensive, and definitely awkward.
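The copy step described above might look roughly like the following sketch. The class name ClassBytes and the method toHeapArray are hypothetical names chosen for illustration; only the java.nio.ByteBuffer calls are real API.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
public final class ClassBytes {
    private ClassBytes() {}

    /**
     * Copies the remaining bytes of buf into a fresh heap array.
     * Works on a duplicate so the caller's position and limit are untouched.
     */
    public static byte[] toHeapArray(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();        // shares content, independent position
        byte[] copy = new byte[view.remaining()];
        view.get(copy);                           // bulk copy, paid once per class
        return copy;
    }
}
```

The resulting array could then be passed to ClassFile.of().parse(byte[]), while the original buffer stays usable for defining the class if no transformation turns out to be needed.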

After transformation, it doesn't really matter if you have a byte[] or ByteBuffer because either way, the class can be defined directly.

It would be nice if the class file parser could accept either a byte[] or a ByteBuffer. I did a quick bit of exploratory work and it looks like porting the code to read from a ByteBuffer instead of a byte[] (using ByteBuffer.wrap() for the array case) would be largely straightforward *except* for the code which parses UTF-8 constants into strings. Also there could be some small performance differences (maybe positive, maybe negative) depending on how the buffer is accessed.
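A minimal sketch of the "buffer as backing" idea, assuming absolute (index-based) reads: the class BufferReader and its methods are hypothetical and do not correspond to the real ClassReader interface. The array case goes through ByteBuffer.wrap(), as suggested above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical reader, for illustration only.
public final class BufferReader {
    private final ByteBuffer buf;

    public BufferReader(ByteBuffer source) {
        // slice() makes offset 0 correspond to the source's current position;
        // class files are big-endian, so the order is set explicitly.
        this.buf = source.slice().order(ByteOrder.BIG_ENDIAN);
    }

    /** The byte[] case reuses the same code path via ByteBuffer.wrap(). */
    public static BufferReader ofArray(byte[] bytes) {
        return new BufferReader(ByteBuffer.wrap(bytes));
    }

    public int length()            { return buf.limit(); }
    public int readU1(int offset)  { return buf.get(offset) & 0xFF; }
    public int readU2(int offset)  { return buf.getShort(offset) & 0xFFFF; }
    public int readInt(int offset) { return buf.getInt(offset); }
}
```

For example, readInt(0) would return the magic number and readU2(6) the major version of a class file. Whether reads like these perform as well as direct array indexing is exactly the open question, so any such prototype would need benchmarking against the plain byte[] path.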

Is this something that might be considered?

--
- DML • he/him