[External] : Re: Class files in ByteBuffer

Fri Mar 21 15:17:56 UTC 2025

Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd just
have a heap memory segment; this is probably not what the user is looking
for (otherwise they'd probably just build to a byte[] to begin with). By
using the user-supplied allocator, I can (for example) write the class
bytes directly to a memory-mapped file.

That said, the buildTo* variants exist because Brian suggested that the API
might be unacceptably asymmetrical otherwise. Either way, the part that
matters the most to me is the parsing side.

The idea with the API is that, while the current impl would need to copy to
a byte array, it is at least theoretically possible that in the future,
that may change. As a user of the byte[]-based API, I am already copying
from my direct buffer, so it's not worse than the status quo. By putting
the copy into the JDK, if the JDK does get enhanced someday to use
MemorySegment internally, or maybe Unsafe, or whatever, then I'll get the
benefit of this change. If it doesn't, then I'm no worse off than I am
today (slightly better off actually, because it saves me a step going in).
Supporting this use case would be beneficial for the same reason that it is
beneficial to be able to define classes out of direct buffers (which has
been supported since JDK 1.5).

The thing I'm mainly in doubt about is that the ability to parse from or
generate to byte buffers is potentially redundant with respect to
MemorySegment. It would just be a bit weird if I could define a class using
an array or a byte buffer, but parsing classes used arrays or memory
segments. Is it weird enough to justify the third API variant? I don't know.

On Fri, Mar 21, 2025 at 9:27 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> I looked at the PR and I had a similar reaction as Adam.
>
> It seems like the buildToXYZ methods have one less copy, but... if the API
> returns a byte[], you can always wrap the byte[] as either a MemorySegment
> (MemorySegment::ofArrays) or a ByteBuffer (ByteBuffer::wrap). These methods
> do _not_ copy.
>
> The way I see it, is that this PR doesn't remove the need to copy on the
> way in (that would be a much more complex change), and it doesn't improve
> things significantly _on the way out_ (because, that side is already
> covered by existing API/views support).
>
> So, while I'm not strongly opposed to the changes in the PR, at the same
> time I don't see a lot of added value in the new methods. At the same time
> I'm a bit skeptical to commit to the new API in without doing the full
> exercise to see how these methods might be implemented.
>
> A good motto in API design is "when in doubt, leave it out" -- is this one
> of those cases?
>
> Maurizio
>
>
>
> On 21/03/2025 13:34, David Lloyd wrote:
>
> The idea is that in the future, it may be possible to do these things
> without the extra copy. In the PR, I found that we can already build to
> memory segments and byte buffers without more copies than what we're doing
> for arrays. On the parsing side, we can already sometimes work without
> copying in some cases that the user won't have access to (e.g. accessing
> the backing array of a memory segment, even if it's read-only). It's not
> hard to imagine that we could possibly have a way to parse without the
> extra copy in the native memory case in the future, without impacting
> current performance on arrays. But without the API support, it can never be
> possible.
>
> On Fri, Mar 21, 2025 at 8:26 AM Adam Sotona <adam.sotona at oracle.com>
> wrote:
>
>> I’m more thinking that the API already provides all the important entries
>> and conversion from and to `MemorySegment` can be done by simple call of
>> `MemorySegment::toArray` and `MemorySegment::ofArray`.
>>
>>
>>
>> *From: *David Lloyd <david.lloyd at redhat.com>
>> *Date: *Friday, 21 March 2025 at 13:37
>> *To: *Adam Sotona <adam.sotona at oracle.com>
>> *Cc: *classfile-api-dev at openjdk.org <classfile-api-dev at openjdk.org>
>> *Subject: *[External] : Re: Class files in ByteBuffer
>>
>> Please have a look at the PR. If you feel the API surface has grown too
>> much, maybe removing the `ByteBuffer` variants is a logical step, since
>> users can always wrap a `ByteBuffer` with a `MemorySegment`? If you could
>> comment on the PR if you feel that to be the case, I would appreciate it.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> On Thu, Mar 20, 2025 at 4:09 PM Adam Sotona <adam.sotona at oracle.com>
>> wrote:
>>
>> I’m sorry to join the discussion a bit late.
>>
>>
>>
>> Here are the points to consider:
>>
>>    - Class-File API is implementation is after many rounds of
>>    performance optimizations purely based on byte arrays.
>>    - Internal use of ByteBuffer has been removed from the
>>    implementation, as it caused significant JDK bootstrap performance
>>    regression.
>>    - Enormous amount of work has been spent on the API surface reduction
>>    and removal of all unnecessary “conveniences”.
>>
>>
>>
>> Adam
>>
>>
>>
>>
>>
>>
>>
>> *From: *classfile-api-dev <classfile-api-dev-retn at openjdk.org> on behalf
>> of David Lloyd <david.lloyd at redhat.com>
>> *Date: *Thursday, 20 March 2025 at 21:11
>> *To: *classfile-api-dev at openjdk.org <classfile-api-dev at openjdk.org>
>> *Subject: *Re: Class files in ByteBuffer
>>
>> I've opened a bug [1] and pull request [2] incorporating this discussion
>> (more or less). I've implemented support for both `MemorySegment` and
>> `ByteBuffer`, but this could be revisited if it doesn't look OK. The
>> implementation is not terribly invasive for now, only grabbing a few
>> low-hanging optimizations.
>>
>>
>>
>> [1] https://bugs.openjdk.org/browse/JDK-8352536
>>
>> [2] https://github.com/openjdk/jdk/pull/24139
>> <https://urldefense.com/v3/__https:/github.com/openjdk/jdk/pull/24139__;!!ACWV5N9M2RV99hQ!MNdgtt56CgKWQQak5PkwC2t1L1q0tTZrT2fqzVPHYo3jer5CUfysIIx9uMuCoLb27YGoQWuUDgu3s_h8ZhlN-w$>
>>
>>
>>
>> On Mon, Mar 10, 2025 at 12:38 PM David Lloyd <david.lloyd at redhat.com>
>> wrote:
>>
>> When defining a class in the JDK, one may either use a byte array or a
>> byte buffer to hold the contents of the class. The latter is useful when
>> (for example) a JAR file containing uncompressed classes is mapped into
>> memory. Thus, some class loaders depend on this form of the API for class
>> definition.
>>
>>
>>
>> If I were to supplement such a class loader with a class transformation
>> step based on the class file API, I would have to copy the bytes of each
>> class on to the heap as a byte[] before I could begin parsing it. This is
>> potentially expensive, and definitely awkward.
>>
>>
>>
>> After transformation, it doesn't really matter if you have a byte[] or
>> ByteBuffer because either way, the class can be defined directly.
>>
>>
>>
>> It would be nice if the class file parser could accept either a byte[] or
>> a ByteBuffer. I did a quick bit of exploratory work and it looks like
>> porting the code to read from a ByteBuffer instead of a byte[]  (using
>> ByteBuffer.wrap() for the array case) would be largely straightforward
>> *except* for the code which parses UTF-8 constants into strings. Also there
>> could be some small performance differences (maybe positive, maybe
>> negative) depending on how the buffer is accessed.
>>
>>
>>
>> Is this something that might be considered?
>>
>>
>>
>> --
>>
>> - DML • he/him
>>
>>
>>
>>
>> --
>>
>> - DML • he/him
>>
>>
>>
>>
>> --
>>
>> - DML • he/him
>>
>
>
> --
> - DML • he/him
>
>

-- 
- DML • he/him
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/classfile-api-dev/attachments/20250321/f2bd0b7e/attachment-0001.htm>