[External] : Re: Class files in ByteBuffer
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Mar 21 15:25:19 UTC 2025
On 21/03/2025 15:17, David Lloyd wrote:
> Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd
> just have a heap memory segment; this is probably not what the user is
> looking for (otherwise they'd probably just build to a byte[] to begin
> with). By using the user-supplied allocator, I can (for example) write
> the class bytes directly to a memory-mapped file.
Ah! I now see that first allocator parameter -- which makes sense.
I'd suggest to turn that into SegmentAllocator -- note that Arena is
also a SegmentAllocator, so you can pass either your own allocating
lambda, or directly use an arena which is nice (I think).
ByteBuffer don't have the allocation abstraction -- so I wonder if we
really should let them go, and maybe only add the segment API with the
allocator parameter. Or maybe have a simpler API for ByteBuffer w/o
allocator parameter, which always allocates a heap buffer. Then if users
want something more complex (e.g. a mapped byte buffer) they can use the
MS API instead, and then wrap the MS into a BB after the fact.
Maurizio
>
> That said, the buildTo* variants exist because Brian suggested that
> the API might be unacceptably asymmetrical otherwise. Either way, the
> part that matters the most to me is the parsing side.
>
> The idea with the API is that, while the current impl would need to
> copy to a byte array, it is at least theoretically possible that in
> the future, that may change. As a user of the byte[]-based API, I am
> already copying from my direct buffer, so it's not worse than the
> status quo. By putting the copy into the JDK, if the JDK does get
> enhanced someday to use MemorySegment internally, or maybe Unsafe, or
> whatever, then I'll get the benefit of this change. If it doesn't,
> then I'm no worse off than I am today (slightly better off actually,
> because it saves me a step going in). Supporting this use case would
> be beneficial for the same reason that it is beneficial to be able to
> define classes out of direct buffers (which has been supported since
> JDK 1.5).
>
> The thing I'm mainly in doubt about is that the ability to parse from
> or generate to byte buffers is potentially redundant with respect to
> MemorySegment. It would just be a bit weird if I could define a class
> using an array or a byte buffer, but parsing classes used arrays or
> memory segments. Is it weird enough to justify the third API variant?
> I don't know.
>
> On Fri, Mar 21, 2025 at 9:27 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> I looked at the PR and I had a similar reaction as Adam.
>
> It seems like the buildToXYZ methods have one less copy, but... if
> the API returns a byte[], you can always wrap the byte[] as either
> a MemorySegment (MemorySegment::ofArrays) or a ByteBuffer
> (ByteBuffer::wrap). These methods do _not_ copy.
>
> The way I see it, is that this PR doesn't remove the need to copy
> on the way in (that would be a much more complex change), and it
> doesn't improve things significantly _on the way out_ (because,
> that side is already covered by existing API/views support).
>
> So, while I'm not strongly opposed to the changes in the PR, at
> the same time I don't see a lot of added value in the new methods.
> At the same time I'm a bit skeptical to commit to the new API in
> without doing the full exercise to see how these methods might be
> implemented.
>
> A good motto in API design is "when in doubt, leave it out" -- is
> this one of those cases?
>
> Maurizio
>
>
>
> On 21/03/2025 13:34, David Lloyd wrote:
>> The idea is that in the future, it may be possible to do these
>> things without the extra copy. In the PR, I found that we can
>> already build to memory segments and byte buffers without more
>> copies than what we're doing for arrays. On the parsing side, we
>> can already sometimes work without copying in some cases that the
>> user won't have access to (e.g. accessing the backing array of a
>> memory segment, even if it's read-only). It's not hard to imagine
>> that we could possibly have a way to parse without the extra copy
>> in the native memory case in the future, without impacting
>> current performance on arrays. But without the API support, it
>> can never be possible.
>>
>> On Fri, Mar 21, 2025 at 8:26 AM Adam Sotona
>> <adam.sotona at oracle.com> wrote:
>>
>> I’m more thinking that the API already provides all the
>> important entries and conversion from and to `MemorySegment`
>> can be done by simple call of `MemorySegment::toArray` and
>> `MemorySegment::ofArray`.
>>
>> *From: *David Lloyd <david.lloyd at redhat.com>
>> *Date: *Friday, 21 March 2025 at 13:37
>> *To: *Adam Sotona <adam.sotona at oracle.com>
>> *Cc: *classfile-api-dev at openjdk.org
>> <classfile-api-dev at openjdk.org>
>> *Subject: *[External] : Re: Class files in ByteBuffer
>>
>> Please have a look at the PR. If you feel the API surface has
>> grown too much, maybe removing the `ByteBuffer` variants is a
>> logical step, since users can always wrap a `ByteBuffer` with
>> a `MemorySegment`? If you could comment on the PR if you feel
>> that to be the case, I would appreciate it.
>>
>> Thanks.
>>
>> On Thu, Mar 20, 2025 at 4:09 PM Adam Sotona
>> <adam.sotona at oracle.com> wrote:
>>
>> I’m sorry to join the discussion a bit late.
>>
>> Here are the points to consider:
>>
>> * Class-File API is implementation is after many rounds
>> of performance optimizations purely based on byte arrays.
>> * Internal use of ByteBuffer has been removed from the
>> implementation, as it caused significant JDK
>> bootstrap performance regression.
>> * Enormous amount of work has been spent on the API
>> surface reduction and removal of all unnecessary
>> “conveniences”.
>>
>> Adam
>>
>> *From: *classfile-api-dev
>> <classfile-api-dev-retn at openjdk.org> on behalf of David
>> Lloyd <david.lloyd at redhat.com>
>> *Date: *Thursday, 20 March 2025 at 21:11
>> *To: *classfile-api-dev at openjdk.org
>> <classfile-api-dev at openjdk.org>
>> *Subject: *Re: Class files in ByteBuffer
>>
>> I've opened a bug [1] and pull request [2] incorporating
>> this discussion (more or less). I've implemented support
>> for both `MemorySegment` and `ByteBuffer`, but this could
>> be revisited if it doesn't look OK. The implementation is
>> not terribly invasive for now, only grabbing a few
>> low-hanging optimizations.
>>
>> [1] https://bugs.openjdk.org/browse/JDK-8352536
>> <https://bugs.openjdk.org/browse/JDK-8352536>
>>
>> [2] https://github.com/openjdk/jdk/pull/24139
>> <https://urldefense.com/v3/__https:/github.com/openjdk/jdk/pull/24139__;!!ACWV5N9M2RV99hQ!MNdgtt56CgKWQQak5PkwC2t1L1q0tTZrT2fqzVPHYo3jer5CUfysIIx9uMuCoLb27YGoQWuUDgu3s_h8ZhlN-w$>
>>
>> On Mon, Mar 10, 2025 at 12:38 PM David Lloyd
>> <david.lloyd at redhat.com> wrote:
>>
>> When defining a class in the JDK, one may either use
>> a byte array or a byte buffer to hold the contents of
>> the class. The latter is useful when (for example) a
>> JAR file containing uncompressed classes is mapped
>> into memory. Thus, some class loaders depend on this
>> form of the API for class definition.
>>
>> If I were to supplement such a class loader with a
>> class transformation step based on the class file
>> API, I would have to copy the bytes of each class on
>> to the heap as a byte[] before I could begin parsing
>> it. This is potentially expensive, and definitely
>> awkward.
>>
>> After transformation, it doesn't really matter if you
>> have a byte[] or ByteBuffer because either way, the
>> class can be defined directly.
>>
>> It would be nice if the class file parser could
>> accept either a byte[] or a ByteBuffer. I did a quick
>> bit of exploratory work and it looks like porting the
>> code to read from a ByteBuffer instead of a byte[]
>> (using ByteBuffer.wrap() for the array case) would be
>> largely straightforward *except* for the code which
>> parses UTF-8 constants into strings. Also there could
>> be some small performance differences (maybe
>> positive, maybe negative) depending on how the buffer
>> is accessed.
>>
>> Is this something that might be considered?
>>
>> --
>>
>> - DML • he/him
>>
>>
>> --
>>
>> - DML • he/him
>>
>>
>> --
>>
>> - DML • he/him
>>
>>
>>
>> --
>> - DML • he/him
>
>
>
> --
> - DML • he/him
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/classfile-api-dev/attachments/20250321/ffc5d035/attachment-0001.htm>
More information about the classfile-api-dev
mailing list