<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Fri, Mar 21, 2025 at 10:25 AM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com">maurizio.cimadamore@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p><br>
</p>
<div>On 21/03/2025 15:17, David Lloyd wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div style="font-family:arial,helvetica,sans-serif">Yes, you
*could* wrap a byte[] with a MemorySegment, but then you'd
just have a heap memory segment; this is probably not what
the user is looking for (otherwise they'd probably just
build to a byte[] to begin with). By using the
user-supplied allocator, I can (for example) write the
class bytes directly to a memory-mapped file.</div>
</div>
</div>
</div>
</blockquote>
<p>Ah! I now see that first allocator parameter -- which makes
sense.</p>
<p>I'd suggest to turn that into SegmentAllocator -- note that Arena
is also a SegmentAllocator, so you can pass either your own
allocating lambda, or directly use an arena which is nice (I
think).</p></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Ah, that's an interesting idea. The idea with using the plain function is that the user would explicitly be in charge of deciding all of the characteristics of the buffer (including, for example, alignment). I'm not quite sure how that shakes out (ergonomically speaking) with using a `SegmentAllocator`, because either I'd have to send in the alignment (which means, I guess, accepting it as a parameter in the `buildToMemorySegment` methods), or else rely on the user to override that method in their allocator. With a plain function, the user could always pass in `mySegmentAllocator::allocate` or `size -> mySegmentAllocator.allocate(size, 8)` or whatever. Also, with a plain function, I could pass one in which easily yields a subsegment of a parent segment e.g. `size -> parentSegment.asSlice(offset, size)`. Would it still be easy to do this while accepting a `SegmentAllocator`?</div></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
<p>ByteBuffer don't have the allocation abstraction -- so I wonder
if we really should let them go, and maybe only add the segment
API with the allocator parameter. Or maybe have a simpler API for
ByteBuffer w/o allocator parameter, which always allocates a heap
buffer. Then if users want something more complex (e.g. a mapped
byte buffer) they can use the MS API instead, and then wrap the MS
into a BB after the fact.<br></p></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Well, ByteBuffer has the same allocation abstraction that (for example) arrays do when you're calling `Collection.toArray(generator)` - that is, `IntFunction<T>`. Using this abstraction, one can basically get all the same benefits described above - the ability to select direct or heap, the ability to return a subslice of a parent buffer, etc. But I agree, it seems that semantically the buffer stuff is pretty much exactly redundant with respect to the MemorySegment variations (since we have `MemorySegment.asByteBuffer()` and `MemorySegment.ofBuffer()`), so I could see dropping it and just living with the minor asymmetry with `ClassLoader.defineClass(...)`.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
</p>
<p>Maurizio<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div style="font-family:arial,helvetica,sans-serif">That said,
the buildTo* variants exist because Brian suggested that
the API might be unacceptably asymmetrical otherwise.
Either way, the part that matters the most to me is the
parsing side.</div>
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div style="font-family:arial,helvetica,sans-serif">The idea
with the API is that, while the current impl would need to
copy to a byte array, it is at least theoretically
possible that in the future, that may change. As a user of
the byte[]-based API, I am already copying from my direct
buffer, so it's not worse than the status quo. By putting
the copy into the JDK, if the JDK does get enhanced
someday to use MemorySegment internally, or maybe Unsafe,
or whatever, then I'll get the benefit of this change. If
it doesn't, then I'm no worse off than I am today
(slightly better off actually, because it saves me a step
going in). Supporting this use case would be beneficial
for the same reason that it is beneficial to be able to
define classes out of direct buffers (which has been
supported since JDK 1.5).</div>
</div>
<div><br>
</div>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">
<div style="font-family:arial,helvetica,sans-serif">The thing
I'm mainly in doubt about is that the ability to parse
from or generate to byte buffers is potentially
redundant with respect to MemorySegment. It would just
be a bit weird if I could define a class using an array
or a byte buffer, but parsing classes used arrays or
memory segments. Is it weird enough to justify the third
API variant? I don't know.</div>
</div>
<div dir="ltr" class="gmail_attr"><br>
</div>
<div dir="ltr" class="gmail_attr">On Fri, Mar 21, 2025 at
9:27 AM Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" target="_blank">maurizio.cimadamore@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>I looked at the PR and I had a similar reaction as
Adam.</p>
<p>It seems like the buildToXYZ methods have one less
copy, but... if the API returns a byte[], you can
always wrap the byte[] as either a MemorySegment
(MemorySegment::ofArrays) or a ByteBuffer
(ByteBuffer::wrap). These methods do _not_ copy.</p>
<p>The way I see it, is that this PR doesn't remove the
need to copy on the way in (that would be a much more
complex change), and it doesn't improve things
significantly _on the way out_ (because, that side is
already covered by existing API/views support).</p>
<p>So, while I'm not strongly opposed to the changes in
the PR, at the same time I don't see a lot of added
value in the new methods. At the same time I'm a bit
skeptical to commit to the new API in without doing
the full exercise to see how these methods might be
implemented.</p>
<p>A good motto in API design is "when in doubt, leave
it out" -- is this one of those cases?</p>
<p>Maurizio<br>
</p>
<p><br>
</p>
<p><br>
</p>
<div>On 21/03/2025 13:34, David Lloyd wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div style="font-family:arial,helvetica,sans-serif">The
idea is that in the future, it may be possible to
do these things without the extra copy. In the PR,
I found that we can already build to memory
segments and byte buffers without more copies than
what we're doing for arrays. On the parsing side,
we can already sometimes work without copying in
some cases that the user won't have access to
(e.g. accessing the backing array of a memory
segment, even if it's read-only). It's not hard to
imagine that we could possibly have a way to parse
without the extra copy in the native memory case
in the future, without impacting current
performance on arrays. But without the API
support, it can never be possible.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Mar 21,
2025 at 8:26 AM Adam Sotona <<a href="mailto:adam.sotona@oracle.com" target="_blank">adam.sotona@oracle.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div lang="EN-US">
<div>
<p class="MsoNormal"><span style="font-size:11pt">I’m more thinking
that the API already provides all the
important entries and conversion from
and to `MemorySegment` can be done by
simple call of `MemorySegment::toArray`
and `MemorySegment::ofArray`.</span></p>
<p class="MsoNormal"><span style="font-size:11pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt"> </span></p>
<div id="m_-3270313530559502774m_2122520250081023453m_-7536612924125414746m_-4103065197396529612mail-editor-reference-message-container">
<div>
<div>
<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(181,196,223);padding:3pt 0in 0in">
<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="font-size:12pt;color:black">From: </span></b><span style="font-size:12pt;color:black">David Lloyd <<a href="mailto:david.lloyd@redhat.com" target="_blank">david.lloyd@redhat.com</a>><br>
<b>Date: </b>Friday, 21 March
2025 at 13:37<br>
<b>To: </b>Adam Sotona <<a href="mailto:adam.sotona@oracle.com" target="_blank">adam.sotona@oracle.com</a>><br>
<b>Cc: </b><a href="mailto:classfile-api-dev@openjdk.org" target="_blank">classfile-api-dev@openjdk.org</a>
<<a href="mailto:classfile-api-dev@openjdk.org" target="_blank">classfile-api-dev@openjdk.org</a>><br>
<b>Subject: </b>[External] :
Re: Class files in ByteBuffer</span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">Please have a look
at the PR. If you feel the API
surface has grown too much,
maybe removing the
`ByteBuffer` variants is a
logical step, since users can
always wrap a `ByteBuffer`
with a `MemorySegment`? If you
could comment on the PR if you
feel that to be the case, I
would appreciate it.</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">Thanks.</span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt">On Thu,
Mar 20, 2025 at 4:09</span><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt">PM Adam
Sotona <<a href="mailto:adam.sotona@oracle.com" target="_blank">adam.sotona@oracle.com</a>> wrote:</span></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt">I’m
sorry to join the
discussion a bit late.</span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt"> </span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt">Here
are the points to
consider:</span><span style="font-size:12pt"></span></p>
<ul type="disc">
<li> <span style="font-size:11pt">Class-File
API is implementation
is after many rounds
of performance
optimizations purely
based on byte arrays.</span></li>
<li> <span style="font-size:11pt">Internal
use of ByteBuffer has
been removed from the
implementation, as it
caused significant JDK
bootstrap performance
regression.</span></li>
<li> <span style="font-size:11pt">Enormous
amount of work has
been spent on the API
surface reduction and
removal of all
unnecessary
“conveniences”.</span></li>
</ul>
<p class="MsoNormal"><span style="font-size:11pt"> </span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt">Adam</span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt"> </span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt"> </span><span style="font-size:12pt"></span></p>
<p class="MsoNormal"><span style="font-size:11pt"> </span><span style="font-size:12pt"></span></p>
<div id="m_-3270313530559502774m_2122520250081023453m_-7536612924125414746m_-4103065197396529612m_1102407431525991501mail-editor-reference-message-container">
<div>
<div>
<div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(181,196,223);padding:3pt 0in 0in">
<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="font-size:12pt;color:black">From:
</span></b><span style="font-size:12pt;color:black">classfile-api-dev <<a href="mailto:classfile-api-dev-retn@openjdk.org" target="_blank">classfile-api-dev-retn@openjdk.org</a>>
on behalf of
David Lloyd <<a href="mailto:david.lloyd@redhat.com" target="_blank">david.lloyd@redhat.com</a>><br>
<b>Date: </b>Thursday,
20 March 2025 at
21:11<br>
<b>To: </b><a href="mailto:classfile-api-dev@openjdk.org" target="_blank">classfile-api-dev@openjdk.org</a>
<<a href="mailto:classfile-api-dev@openjdk.org" target="_blank">classfile-api-dev@openjdk.org</a>><br>
<b>Subject: </b>Re:
Class files in
ByteBuffer</span><span style="font-size:12pt"></span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">I've opened a bug
[1] and pull
request [2]
incorporating
this
discussion
(more or
less). I've
implemented
support for
both
`MemorySegment`
and
`ByteBuffer`,
but this could
be revisited
if it doesn't
look OK. The
implementation
is not
terribly
invasive for
now, only
grabbing a few
low-hanging
optimizations.</span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">[1] </span><span style="font-size:12pt"><a href="https://bugs.openjdk.org/browse/JDK-8352536" target="_blank"><span style="font-family:Arial,sans-serif">https://bugs.openjdk.org/browse/JDK-8352536</span></a></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">[2] </span><span style="font-size:12pt"><a href="https://urldefense.com/v3/__https:/github.com/openjdk/jdk/pull/24139__;!!ACWV5N9M2RV99hQ!MNdgtt56CgKWQQak5PkwC2t1L1q0tTZrT2fqzVPHYo3jer5CUfysIIx9uMuCoLb27YGoQWuUDgu3s_h8ZhlN-w$" target="_blank"><span style="font-family:Arial,sans-serif">https://github.com/openjdk/jdk/pull/24139</span></a></span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt">On Mon, Mar 10, 2025 at
12:38</span><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt">PM David Lloyd <<a href="mailto:david.lloyd@redhat.com" target="_blank">david.lloyd@redhat.com</a>>
wrote:</span></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">When defining a
class in the
JDK, one may
either use a
byte array or
a byte buffer
to hold the
contents of
the class. The
latter is
useful when
(for example)
a JAR file
containing
uncompressed
classes is
mapped into
memory. Thus,
some class
loaders depend
on this form
of the API for
class
definition.</span><span style="font-size:12pt"></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">If I were to
supplement
such a class
loader with a
class
transformation
step based on
the class file
API, I would
have to copy
the bytes of
each class on
to the heap as
a byte[]
before I could
begin parsing
it. This is
potentially
expensive, and
definitely
awkward.</span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">After
transformation,
it doesn't
really matter
if you have a
byte[] or
ByteBuffer
because either
way, the class
can be defined
directly.</span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">It would be nice if
the class file
parser could
accept either
a byte[] or a
ByteBuffer. I
did a quick
bit of
exploratory
work and it
looks like
porting the
code to read
from a
ByteBuffer
instead of a
byte[] (using
ByteBuffer.wrap() for the array case) would be largely straightforward
*except* for
the code which
parses UTF-8
constants into
strings. Also
there could be
some small
performance
differences
(maybe
positive,
maybe
negative)
depending on
how the buffer
is accessed.</span><span style="font-size:12pt"></span></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt"></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif">Is this something
that might be
considered?</span><span style="font-size:12pt"></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt;font-family:Arial,sans-serif"> </span><span style="font-size:12pt"></span></p>
</div>
<p class="MsoNormal"><span><span style="font-size:12pt">-- </span></span><span style="font-size:12pt"></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt">- DML • he/him</span></p>
</div>
</div>
</div>
</blockquote>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt"><br clear="all">
</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
</div>
<p class="MsoNormal"><span><span style="font-size:12pt">-- </span></span><span style="font-size:12pt"></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt">- DML • he/him</span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt"><br clear="all">
</span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12pt"> </span></p>
</div>
<p class="MsoNormal"><span><span style="font-size:12pt">-- </span></span><span style="font-size:12pt"></span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:12pt">- DML •
he/him</span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<div><br clear="all">
</div>
<div><br>
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">- DML • he/him<br>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<div><br clear="all">
</div>
<div><br>
</div>
<span class="gmail_signature_prefix">-- </span><br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">- DML • he/him<br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote></div><div><br clear="all"></div><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr">- DML • he/him<br></div></div></div>