From david.lloyd at redhat.com Mon Mar 10 17:38:12 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Mon, 10 Mar 2025 17:38:12 +0000
Subject: Class files in ByteBuffer
Message-ID:

When defining a class in the JDK, one may use either a byte array or a byte buffer to hold the contents of the class. The latter is useful when (for example) a JAR file containing uncompressed classes is mapped into memory. Thus, some class loaders depend on this form of the API for class definition.

If I were to supplement such a class loader with a class transformation step based on the class file API, I would have to copy the bytes of each class onto the heap as a byte[] before I could begin parsing it. This is potentially expensive, and definitely awkward.

After transformation, it doesn't really matter if you have a byte[] or ByteBuffer because either way, the class can be defined directly.

It would be nice if the class file parser could accept either a byte[] or a ByteBuffer. I did a quick bit of exploratory work and it looks like porting the code to read from a ByteBuffer instead of a byte[] (using ByteBuffer.wrap() for the array case) would be largely straightforward *except* for the code which parses UTF-8 constants into strings. Also there could be some small performance differences (maybe positive, maybe negative) depending on how the buffer is accessed.

Is this something that might be considered?

--
- DML • he/him

From brian.goetz at oracle.com Mon Mar 10 17:52:10 2025
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Mar 2025 13:52:10 -0400
Subject: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>

It sounds like you are asking two questions. At the API level, you are asking whether adding a ClassFile.parse(ByteBuffer) method would be in scope.
But at the implementation level, you are asking whether we would be OK to make ByteBuffer *the primitive* on which processing the byte[] format is based, which is a more intrusive change.

My first reaction is that the first seems fine in theory, but if the only reasonable implementation strategy is the latter, then I am pretty skeptical.

A ByteBuffer-accepting factory that simply copied to a byte[] would be fine (this is what we do with the existing Path-accepting factory, it's a similar form of convenience), but it sounds like this would not make you any happier.

From david.lloyd at redhat.com Mon Mar 10 18:13:39 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Mon, 10 Mar 2025 18:13:39 +0000
Subject: Class files in ByteBuffer
In-Reply-To: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
Message-ID:

Thanks for the response; comments inline.

On Mon, Mar 10, 2025 at 12:52 PM Brian Goetz wrote:

> It sounds like you are asking two questions. At the API level, you are asking whether adding a ClassFile.parse(ByteBuffer) method would be in scope. But at the implementation level, you are asking whether we would be OK to make ByteBuffer *the primitive* on which processing the byte[] format is based, which is a more intrusive change.
>
> My first reaction is that the first seems fine in theory, but if the only reasonable implementation strategy is the latter, then I am pretty skeptical.
>
> A ByteBuffer-accepting factory that simply copied to a byte[] would be fine (this is what we do with the existing Path-accepting factory, it's a similar form of convenience), but it sounds like this would not make you any happier.

Well, it honestly wouldn't make me unhappy, because it's not worse than today's status quo. If the API exists, then optimization is always going to be a future possibility. So I for one would be fine with this as a starting point, especially if it would greatly increase the chances of such an API being included in time for Java 25. Trying to find an optimal implementation strategy might be a diverting future spare-time project for someone (maybe even myself if I ever find enough of those elusive "round tuits" I keep hearing about).

--
- DML • he/him
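The "shallow" starting point discussed above (copy the buffer to a byte[] and hand it to the existing byte[]-based parser) can be approximated in user code. The following is only a sketch under assumptions: `BufferCopy` and `toByteArray` are hypothetical names that appear nowhere in the thread or in the JDK, and the final hand-off to the parser is left as a comment.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the JDK or the Class-File API: copies the
// remaining bytes of a buffer onto the heap for a byte[]-based parser.
final class BufferCopy {
    private BufferCopy() {}

    static byte[] toByteArray(ByteBuffer buffer) {
        // duplicate() shares the content but has its own position and limit,
        // so the bulk get below does not disturb the caller's buffer.
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // 0xCAFEBABE is the class file magic number, used here as sample data.
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.putInt(0xCAFEBABE);
        direct.flip();
        byte[] copy = toByteArray(direct);
        System.out.println(copy.length + " bytes copied; position is still " + direct.position());
        // On a JDK that ships java.lang.classfile, the copy could then be
        // passed to the byte[]-based entry point, e.g. ClassFile.of().parse(copy).
    }
}
```

Working on a duplicate keeps the caller's position and limit intact, which matters if the same buffer is later passed to `defineClass`.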

From brian.goetz at oracle.com Mon Mar 10 18:18:19 2025
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Mar 2025 14:18:19 -0400
Subject: Class files in ByteBuffer
In-Reply-To:
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
Message-ID:

So, the other half of this is the overloads for ClassFile::buildToByteBuffer, which I assume have a similarly trivial initial implementation; we wouldn't want to do one without the other, as it will seem a gratuitous asymmetry. If both are shallow implementations, I'm not averse to this -- though you'll probably want an @implNote that explains how the implementation works, to avoid unhappy performance surprises.

From chen.l.liang at oracle.com Mon Mar 10 18:46:46 2025
From: chen.l.liang at oracle.com (Chen Liang)
Date: Mon, 10 Mar 2025 18:46:46 +0000
Subject: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

I think the use of ByteBuffer vs byte[] is a tradeoff - the JIT compiler has a lot of trouble with ByteBuffer due to polymorphism, and this might actually turn out to be a regression. (The ClassFile API previously used ByteBuffer for stack map generation, I think; it has since been eliminated for performance improvements.)

Also, the ClassFile API depends on some sweet properties of byte[], such as using some String intrinsics on byte arrays to quickly process ASCII-compatible UTF8 entries. Luckily the access to the array is nicely encapsulated in ClassReader for the most part, and the Utf8 entry is the only place where it escapes. You should be able to make a prototype of reading from ByteBuffer easily; your "using byte buffer as backing" approach might be accepted if you can prove there is no regression in the case of reading from plain byte arrays.

Regards,
Chen

From david.lloyd at redhat.com Wed Mar 12 13:27:31 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Wed, 12 Mar 2025 13:27:31 +0000
Subject: Class files in ByteBuffer
In-Reply-To:
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
Message-ID:

Making the output fully symmetrical might be a little bit more challenging (interesting?) than it seemed to be at first glance. You'd have to think about questions like "should the buffer be direct?". We could possibly allow an `IntFunction<ByteBuffer>` to be passed in, to support flexible allocation strategies and to allow (for example) writing to memory-mapped areas and things like that.

Since we're currently doing a couple of `arraycopy` to write to the output, it should be trivial to create a variation which bulk-writes to a user-supplied `ByteBuffer`. This would be more broadly useful than just a naive `ByteBuffer.wrap()` on the byte array output.
That effect could however still be achieved if the user passes in e.g. `ByteBuffer::allocate` as the buffer acquisition function (we could possibly supply an overload which uses this strategy).

--
- DML • he/him

From maurizio.cimadamore at oracle.com Wed Mar 12 15:53:45 2025
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 12 Mar 2025 15:53:45 +0000
Subject: Class files in ByteBuffer
In-Reply-To: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com>
Message-ID: <16a12929-7ba7-417c-a8f4-aa0f04e5e11e@oracle.com>

On 10/03/2025 17:52, Brian Goetz wrote:
> My first reaction is that the first seems fine in theory

I wonder if an API accepting a MemorySegment would be more general -- you can construct a MS from a BB and you can of course go from MS to byte[] (which is what the impl needs). So I wonder if that would be more future-proof. (We can, of course, also provide both).

Maurizio

From brian.goetz at oracle.com Wed Mar 12 16:25:31 2025
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Mar 2025 12:25:31 -0400
Subject: Class files in ByteBuffer
In-Reply-To: <16a12929-7ba7-417c-a8f4-aa0f04e5e11e@oracle.com>
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com> <16a12929-7ba7-417c-a8f4-aa0f04e5e11e@oracle.com>
Message-ID:

That does seem like a more future-proof choice. (I suspect too it would be less intrusive to adapt the internals to MS than BB.)

From maurizio.cimadamore at oracle.com Wed Mar 12 16:51:41 2025
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 12 Mar 2025 16:51:41 +0000
Subject: Class files in ByteBuffer
In-Reply-To:
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com> <16a12929-7ba7-417c-a8f4-aa0f04e5e11e@oracle.com>
Message-ID:

On 12/03/2025 16:25, Brian Goetz wrote:
> That does seem like a more future-proof choice. (I suspect too it would be less intrusive to adapt the internals to MS than BB.)

They are probably similar in spirit -- but at least you would know that the MS path is more aggressively/actively optimized.

I do share some of Chen's concerns -- random access on MS (and BB) is not comparable to random access on a byte[]. So changing the internals of the classfile API to use MS/BB is something that needs to be done carefully (and with benchmarks at hand).

One possible area where adopting a "more raw" buffer would be beneficial is when writing/reading custom attributes -- since BB/MS already provide the primitives we need to load/store primitive values from/in the buffer. But -- again, something that requires care and consideration, it's not a slam dunk.

Maurizio
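The BB -> MS -> byte[] path mentioned above can be written with the java.lang.foreign API. This is a sketch assuming JDK 22 or later (where that API is final); `SegmentBridge` and its method names are made up for illustration and are not proposed API.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;

// Hypothetical illustration, not proposed API: bridges a ByteBuffer to a
// MemorySegment and a MemorySegment to a heap byte[].
final class SegmentBridge {
    private SegmentBridge() {}

    // The segment covers the bytes between the buffer's position and limit.
    static MemorySegment fromBuffer(ByteBuffer buffer) {
        return MemorySegment.ofBuffer(buffer);
    }

    // Copies the segment's content into a fresh heap array.
    static byte[] toHeapArray(MemorySegment segment) {
        return segment.toArray(ValueLayout.JAVA_BYTE);
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.putInt(0xCAFEBABE);
        direct.flip();
        MemorySegment segment = fromBuffer(direct);
        byte[] bytes = toHeapArray(segment);
        System.out.println("segment size " + segment.byteSize() + ", array length " + bytes.length);
    }
}
```

A byte[] can also be viewed as a segment via `MemorySegment.ofArray`, which is why a single MemorySegment-accepting entry point could in principle cover both input kinds.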

From david.lloyd at redhat.com Wed Mar 12 19:10:44 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Wed, 12 Mar 2025 19:10:44 +0000
Subject: Class files in ByteBuffer
In-Reply-To:
References: <61213c8a-ca86-4d37-8d5a-1aff31834481@oracle.com> <16a12929-7ba7-417c-a8f4-aa0f04e5e11e@oracle.com>
Message-ID:

On Wed, Mar 12, 2025 at 11:51 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> On 12/03/2025 16:25, Brian Goetz wrote:
>
>> That does seem like a more future-proof choice. (I suspect too it would be less intrusive to adapt the internals to MS than BB.)
>
> They are probably similar in spirit -- but at least you would know that the MS path is more aggressively/actively optimized.
>
> I do share some of Chen's concerns -- random access on MS (and BB) is not comparable to random access on a byte[]. So changing the internals of the classfile API to use MS/BB is something that needs to be done carefully (and with benchmarks at hand).
>
> One possible area where adopting a "more raw" buffer would be beneficial is when writing/reading custom attributes -- since BB/MS already provide the primitives we need to load/store primitive values from/in the buffer. But -- again, something that requires care and consideration, it's not a slam dunk.

Internally (on the parsing side at least), I expect that we would not be able to get away with a single, general access strategy using the `MemorySegment` API (but we could test that now, even without the suggested API changes - I would love to be wrong). It seems more likely that we'd want to keep the current array-based strategy (which uses `Unsafe` liberally) and add a new direct memory address-based access strategy (also using `Unsafe` in an equivalent manner), and select the strategy based on the kind of `MemorySegment` or `ByteBuffer`.
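To make "select the strategy based on the kind of buffer" concrete, here is a rough sketch of such a dispatch. It is not the strategy described above: it uses no `Unsafe`, and the off-heap branch simply falls back to a heap copy rather than address-based access. `AccessStrategy`, `ArrayView`, and `select` are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical dispatch on the kind of buffer. Array-backed buffers are read
// in place; everything else (direct, mapped, read-only) is copied first.
final class AccessStrategy {
    private AccessStrategy() {}

    // Hypothetical view of the class bytes: array, start offset, length, and
    // whether a copy had to be made to obtain the array.
    record ArrayView(byte[] array, int offset, int length, boolean copied) {}

    static ArrayView select(ByteBuffer buffer) {
        if (buffer.hasArray()) {
            // Heap buffer with an accessible backing array: no copy needed.
            int start = buffer.arrayOffset() + buffer.position();
            return new ArrayView(buffer.array(), start, buffer.remaining(), false);
        }
        // No accessible array: copy via a duplicate so the caller's position
        // and limit are left untouched.
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new ArrayView(bytes, 0, bytes.length, true);
    }

    public static void main(String[] args) {
        ArrayView heap = select(ByteBuffer.wrap(new byte[16]));
        ArrayView direct = select(ByteBuffer.allocateDirect(16));
        System.out.println("heap copied: " + heap.copied() + ", direct copied: " + direct.copied());
    }
}
```

Note that `hasArray()` is false for read-only heap buffers, so those also take the copying branch in this sketch.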

Having three parse and build APIs (one for each of `byte[]`, `ByteBuffer`, and `MemorySegment`) makes sense to me because there's a use case for each of them, and they can be implemented in terms of one another to a great extent, which gives a lot of flexibility. Particularly, it seems to me that as long as `ClassLoader.defineClass(String,ByteBuffer,ProtectionDomain)` exists, then ByteBuffer should be floated up to the API, even if it ends up being e.g. `MemorySegment.ofBuffer()` on the inside. (That said, I wouldn't hate it if a new `defineClass` which uses `MemorySegment` could be defined someday.)

--
- DML • he/him

From john at int4.org Sat Mar 15 09:18:48 2025
From: john at int4.org (John Hendrikx)
Date: Sat, 15 Mar 2025 10:18:48 +0100
Subject: Define classes with circular dependency?
Message-ID: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>

Hi list,

I'm trying to use the ClassFile API to automatically implement control classes (as found in JavaFX). These classes define inner class CssMetaData implementations that refer back to the outer class, and the outer class refers to these implementations via static fields. When I define one of the inner types using Lookup::defineClass I get a NoClassDefFoundError for the outer type. When I define the outer type first, I get a NoClassDefFoundError for one of the inner types. The situation is essentially this:

    public class Sample {
        private final Property b = new Property(A);

        private static final CssMetaData A = new CssMetaData() {
            @Override
            public Property getProperty(Object obj) {
                return ((Sample) obj).b;
            }
        };
    }

    abstract class CssMetaData {
        abstract Property getProperty(Object obj);
    }

    class Property {
        public Property(CssMetaData a) {
        }
    }

I'm trying to generate the Sample class. The classes CssMetaData and Property are pre-existing.
As you can see, Sample refers to A in a property it creates, while A refers to that property by direct field access after a cast.

Note that the above is perfectly legal as a Java class, and I think the bytecode I generate is correct. It seems I would need to be able to define both classes at the same time, but Lookup doesn't seem to have anything for this purpose.

I'd appreciate any insights!

--John

From michael.van.acken at gmail.com Sat Mar 15 09:53:20 2025
From: michael.van.acken at gmail.com (Michael van Acken)
Date: Sat, 15 Mar 2025 10:53:20 +0100
Subject: Define classes with circular dependency?
In-Reply-To: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
References: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
Message-ID:

As far as I know, at time of definition only the class being extended must be available. From your example, this seems to be j.l.Object both times, so this should not be the problem.

But your case triggers a vague recollection, where I had the same behaviour of Lookup.defineClass() in some gnarly unit test of my compiler -- with a NoCDFE that I was not able to explain.

Out of curiosity: what happens when you catch the exception and do a findClass using the same lookup and the dotted class name of the class you just tried to define? In my case, I got back the class instance in the catch clause, suggesting the defineClass completed after all.

-- mva

From forax at univ-mlv.fr Sat Mar 15 10:17:58 2025
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 15 Mar 2025 11:17:58 +0100 (CET)
Subject: Define classes with circular dependency?
In-Reply-To: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
References: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
Message-ID: <1277447524.167478769.1742033878278.JavaMail.zimbra@univ-eiffel.fr>

Hello,

If you need to dynamically define more than one class, the usual trick is to use an invokedynamic or a constant dynamic, so that the other classes are resolved not by the ClassLoader but by the code of the bootstrap method of the invokedynamic/constant dynamic.

In your case, you can use a constant dynamic to initialize the CssMetaData, by emitting an LDC of a constant dynamic in the static initializer to initialize the static field ('A' in the example).

regards,
Rémi

From john at int4.org Sat Mar 15 10:31:24 2025
From: john at int4.org (John Hendrikx)
Date: Sat, 15 Mar 2025 11:31:24 +0100
Subject: Define classes with circular dependency?
In-Reply-To:
References: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
Message-ID:

Sorry,

It seems I'm just not well enough versed in byte code generation and how the class verifier works.

The verifier was fine with a `getProperty(outer)` definition with the code within it referring to the outer type. But the abstract class doesn't define that method, it defines `getProperty(Styleable)`, so this method wouldn't actually implement the abstract method. If the only change I made was the signature, then the verifier would reject it, as the method body refers to outer. So I had this:

    cb.withMethodBody("getStyleableProperty",
        MethodTypeDesc.of(ClassDesc.of("javafx.css.StyleableProperty"), outer),
        ClassFile.ACC_PUBLIC,
        mb -> {
            mb.aload(1)
              .getfield(outer, cssData.fieldName, ClassDesc.of("javafx.css.CssMetaData"))
              .checkcast(ClassDesc.of("javafx.css.StyleableProperty"))
              .areturn();
        });

The above works, but doesn't implement the abstract method as it has a different signature. I found this very confusing, as I'm hard referencing an outer ClassDesc here that doesn't exist yet, but it's fine with it; yet with the signature modified to correctly implement the abstract method it suddenly rejects it:

    cb.withMethodBody("getStyleableProperty",
        MethodTypeDesc.of(ClassDesc.of("javafx.css.StyleableProperty"), ClassDesc.of("javafx.css.Styleable")),
        ClassFile.ACC_PUBLIC,
        mb -> {
            mb.aload(1)
              .getfield(outer, cssData.fieldName, ClassDesc.of("javafx.css.CssMetaData"))
              .checkcast(ClassDesc.of("javafx.css.StyleableProperty"))
              .areturn();
        });

The above gets rejected with a NoClassDefFoundError for the outer type.
The solution was to insert a checkcast(outer):

    cb.withMethodBody("getStyleableProperty",
        MethodTypeDesc.of(ClassDesc.of("javafx.css.StyleableProperty"), ClassDesc.of("javafx.css.Styleable")),
        ClassFile.ACC_PUBLIC,
        mb -> {
            mb.aload(1)
              .checkcast(outer)
              .getfield(outer, cssData.fieldName, ClassDesc.of("javafx.css.CssMetaData"))
              .checkcast(ClassDesc.of("javafx.css.StyleableProperty"))
              .areturn();
        });

I'm very happy this works. At first I thought it was just something that only javac in combination with class loaders was allowed to do (a circular class reference) and it couldn't be done with Lookup::defineClass -- it turns out the actual reason seems to be that the reference isn't actually circular during loading, but simply attempted at runtime.

--John

From forax at univ-mlv.fr Sat Mar 15 11:20:21 2025
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 15 Mar 2025 12:20:21 +0100 (CET)
Subject: Define classes with circular dependency?
In-Reply-To:
References: <7d275be0-d3a4-40de-9e1d-8d5c82f6ced4@int4.org>
Message-ID: <1223820992.167804476.1742037621797.JavaMail.zimbra@univ-eiffel.fr>

> From: "John Hendrikx"
> To: "Michael van Acken"
> Cc: "classfile-api-dev"
> Sent: Saturday, March 15, 2025 11:31:24 AM
> Subject: Re: Define classes with circular dependency?

> I'm very happy this works.
At first I thought it was just something that only > javac in combination with class loaders was allowed to do (a circular class > reference) and it couldn't be done with Lookup::defineClass -- it turns out the > actual reason seems to be that the reference isn't actually circular during > loading, but simply attempted at runtime. yes, from the verifier POV checkcast only checks that the top of the stack is an object, the actual checkcast is done at runtime. see https://docs.oracle.com/javase/specs/jvms/se23/html/jvms-4.html#jvms-4.10.1.9.checkcast > --John R?mi > On 15/03/2025 10:53, Michael van Acken wrote: >> As far as I know, at time of definition only the class being extended must be >> available. >> From your example, this seems to be j.l.Object both times, so this should not be >> the problem. >> But your case triggers a vague recollection, where I had the same behaviour of >> Lookup.defineClass() in some gnarly unit test of my compiler -- with a NoCDFE >> that >> I was not able to explain. >> Out of curiosity: what happens when you catch the exception and do a findClass >> using >> the same lookup and the dotted class name of the class you just tried to define? >> In >> my case, I got back the class instance in the catch clause, suggesting the >> defineClass >> completed after all. >> -- mva >> Am Sa., 15. M?rz 2025 um 10:19 Uhr schrieb John Hendrikx < [ >> mailto:john at int4.org | john at int4.org ] >: >>> Hi list, >>> I'm trying to use the ClassFile API to automatically implement control classes >>> (as found in JavaFX). These classes define inner class CssMetaData >>> implementations that refer back to the outer class, and the outer class refers >>> to these implementations via static fields. When I define one of the inner >>> types using Lookup::defineClass I get a NoClassDefFoundError for the outer >>> type. When I define the outer type first, I get a NoClassDefFoundError for one >>> of the inner types. 
The situation is essentially this: >>> public class Sample { >>> private final Property b = new Property( A ); >>> private static final CssMetaData A = new CssMetaData() { >>> @Override >>> public Property getProperty(Object obj) { >>> return ((Sample)obj). b ; >>> } >>> }; >>> } >>> abstract class CssMetaData { >>> abstract Property getProperty(Object obj); >>> } >>> class Property { >>> public Property(CssMetaData a ) { >>> } >>> } >>> I'm trying to generate the Sample class. The classes CssMetaData and Propery are >>> pre-existing. As you can see, Sample refers to A in a property it creates, >>> while A refers to that property by direct field access after a cast. >>> Note that the above is perfectly legal as a Java class, and I think the bytecode >>> I generate is correct. It seems I would need to be able to define both classes >>> at the same time, but Lookup doesn't seem to have anything for this purpose. >>> I'd appreciate any insights! >>> --John -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.lloyd at redhat.com Thu Mar 20 20:09:57 2025 From: david.lloyd at redhat.com (David Lloyd) Date: Thu, 20 Mar 2025 15:09:57 -0500 Subject: Class files in ByteBuffer In-Reply-To: References: Message-ID: I've opened a bug [1] and pull request [2] incorporating this discussion (more or less). I've implemented support for both `MemorySegment` and `ByteBuffer`, but this could be revisited if it doesn't look OK. The implementation is not terribly invasive for now, only grabbing a few low-hanging optimizations. [1] https://bugs.openjdk.org/browse/JDK-8352536 [2] https://github.com/openjdk/jdk/pull/24139 On Mon, Mar 10, 2025 at 12:38?PM David Lloyd wrote: > When defining a class in the JDK, one may either use a byte array or a > byte buffer to hold the contents of the class. The latter is useful when > (for example) a JAR file containing uncompressed classes is mapped into > memory. 
Thus, some class loaders depend on this form of the API for class > definition. > > If I were to supplement such a class loader with a class transformation > step based on the class file API, I would have to copy the bytes of each > class on to the heap as a byte[] before I could begin parsing it. This is > potentially expensive, and definitely awkward. > > After transformation, it doesn't really matter if you have a byte[] or > ByteBuffer because either way, the class can be defined directly. > > It would be nice if the class file parser could accept either a byte[] or > a ByteBuffer. I did a quick bit of exploratory work and it looks like > porting the code to read from a ByteBuffer instead of a byte[] (using > ByteBuffer.wrap() for the array case) would be largely straightforward > *except* for the code which parses UTF-8 constants into strings. Also there > could be some small performance differences (maybe positive, maybe > negative) depending on how the buffer is accessed. > > Is this something that might be considered? > > -- > - DML ? he/him > -- - DML ? he/him -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam.sotona at oracle.com Thu Mar 20 21:09:18 2025 From: adam.sotona at oracle.com (Adam Sotona) Date: Thu, 20 Mar 2025 21:09:18 +0000 Subject: Class files in ByteBuffer In-Reply-To: References: Message-ID: I?m sorry to join the discussion a bit late. Here are the points to consider: * Class-File API is implementation is after many rounds of performance optimizations purely based on byte arrays. * Internal use of ByteBuffer has been removed from the implementation, as it caused significant JDK bootstrap performance regression. * Enormous amount of work has been spent on the API surface reduction and removal of all unnecessary ?conveniences?. 
Adam
From david.lloyd at redhat.com Fri Mar 21 12:36:11 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Fri, 21 Mar 2025 07:36:11 -0500
Subject: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

Please have a look at the PR. If you feel the API surface has grown too much, maybe removing the `ByteBuffer` variants is a logical step, since users can always wrap a `ByteBuffer` with a `MemorySegment`? If you could comment on the PR if you feel that to be the case, I would appreciate it.

Thanks.

--
- DML • he/him

From adam.sotona at oracle.com Fri Mar 21 13:26:20 2025
From: adam.sotona at oracle.com (Adam Sotona)
Date: Fri, 21 Mar 2025 13:26:20 +0000
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

I'm thinking more that the API already provides all the important entry points, and that conversion from and to `MemorySegment` can be done by a simple call to `MemorySegment::toArray` and `MemorySegment::ofArray`.

From david.lloyd at redhat.com Fri Mar 21 13:34:29 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Fri, 21 Mar 2025 08:34:29 -0500
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

The idea is that in the future, it may be possible to do these things without the extra copy. In the PR, I found that we can already build to memory segments and byte buffers without more copies than what we're doing for arrays. On the parsing side, we can already sometimes work without copying in some cases that the user won't have access to (e.g. accessing the backing array of a memory segment, even if it's read-only). It's not hard to imagine that we could possibly have a way to parse without the extra copy in the native memory case in the future, without impacting current performance on arrays. But without the API support, it can never be possible.

--
- DML • he/him

From maurizio.cimadamore at oracle.com Fri Mar 21 14:27:32 2025
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 21 Mar 2025 14:27:32 +0000
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

I looked at the PR and I had a similar reaction to Adam's.

It seems like the buildToXYZ methods have one less copy, but... if the API returns a byte[], you can always wrap the byte[] as either a MemorySegment (MemorySegment::ofArray) or a ByteBuffer (ByteBuffer::wrap). These methods do _not_ copy.
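The no-copy claim above can be checked directly. The following is only a minimal sketch, assuming JDK 22 or later (where java.lang.foreign is final); the class and method names are made up for illustration and are not part of the PR. It shows that a MemorySegment or ByteBuffer created from a byte[] is a view that observes later writes to the array, whereas MemorySegment::toArray hands back a fresh copy.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;

// Illustrative only: demonstrates view-vs-copy behaviour of the wrapping methods.
final class WrapWithoutCopy {

    // MemorySegment.ofArray creates a heap segment backed by the same array.
    static boolean segmentSharesArray(byte[] classBytes) {
        MemorySegment segment = MemorySegment.ofArray(classBytes);
        classBytes[0] = 0x7F; // write through the array...
        return segment.get(ValueLayout.JAVA_BYTE, 0) == 0x7F; // ...and read it via the segment
    }

    // ByteBuffer.wrap likewise creates a buffer backed by the same array.
    static boolean bufferSharesArray(byte[] classBytes) {
        ByteBuffer buffer = ByteBuffer.wrap(classBytes);
        classBytes[0] = 0x7F;
        return buffer.get(0) == 0x7F;
    }

    // Going the other way, toArray allocates a new array and copies into it.
    static boolean toArrayCopies(byte[] classBytes) {
        byte[] copy = MemorySegment.ofArray(classBytes).toArray(ValueLayout.JAVA_BYTE);
        return copy != classBytes;
    }

    public static void main(String[] args) {
        System.out.println(segmentSharesArray(new byte[] {1, 2, 3}));
        System.out.println(bufferSharesArray(new byte[] {1, 2, 3}));
        System.out.println(toArrayCopies(new byte[] {1, 2, 3}));
    }
}
```

So wrapping is free on the way out of a byte[]-returning API, while getting a byte[] out of a native segment on the way in still costs a copy.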
The way I see it, this PR doesn't remove the need to copy on the way in (that would be a much more complex change), and it doesn't improve things significantly _on the way out_ (because that side is already covered by existing API/views support).

So, while I'm not strongly opposed to the changes in the PR, at the same time I don't see a lot of added value in the new methods. I'm also a bit skeptical about committing to the new API without doing the full exercise to see how these methods might be implemented.

A good motto in API design is "when in doubt, leave it out" -- is this one of those cases?

Maurizio

From david.lloyd at redhat.com Fri Mar 21 15:17:56 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Fri, 21 Mar 2025 10:17:56 -0500
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID:

Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd just have a heap memory segment; this is probably not what the user is looking for (otherwise they'd probably just build to a byte[] to begin with). By using the user-supplied allocator, I can (for example) write the class bytes directly to a memory-mapped file.

That said, the buildTo* variants exist because Brian suggested that the API might be unacceptably asymmetrical otherwise. Either way, the part that matters the most to me is the parsing side.
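To make the memory-mapped case mentioned above concrete, here is a rough sketch. Note the assumptions: `writeTo` is a hypothetical stand-in for a build method that takes a user-supplied allocator; it is not the API from the PR, and the class name, the temp file and the `IntFunction<ByteBuffer>` allocator shape are invented for illustration. The point is only that the caller decides where the bytes land, here a file mapping rather than the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.IntFunction;

final class MappedOutputSketch {

    // Hypothetical stand-in for a "build to buffer" method: asks the caller's
    // allocator for a buffer of exactly the needed size, then writes into it.
    static ByteBuffer writeTo(IntFunction<ByteBuffer> allocator, byte[] classBytes) {
        ByteBuffer target = allocator.apply(classBytes.length);
        target.put(classBytes);
        target.flip();
        return target;
    }

    // Writes the bytes through a file mapping, then reads the file back.
    static byte[] roundTrip(byte[] classBytes) {
        try {
            Path file = Files.createTempFile("Sample", ".class");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer written = writeTo(size -> {
                    try {
                        // Mapping READ_WRITE past the end of the file grows the file.
                        return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }, classBytes);
                ((MappedByteBuffer) written).force(); // flush the mapping to the file
            }
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] magic = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(java.util.Arrays.equals(magic, roundTrip(magic)));
    }
}
```

With a byte[]-returning API the same result needs an intermediate heap array followed by a separate write into the mapping.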
The idea with the API is that, while the current impl would need to copy to a byte array, it is at least theoretically possible that in the future, that may change. As a user of the byte[]-based API, I am already copying from my direct buffer, so it's not worse than the status quo. By putting the copy into the JDK, if the JDK does get enhanced someday to use MemorySegment internally, or maybe Unsafe, or whatever, then I'll get the benefit of this change. If it doesn't, then I'm no worse off than I am today (slightly better off actually, because it saves me a step going in).

Supporting this use case would be beneficial for the same reason that it is beneficial to be able to define classes out of direct buffers (which has been supported since JDK 1.5).

The thing I'm mainly in doubt about is that the ability to parse from or generate to byte buffers is potentially redundant with respect to MemorySegment. It would just be a bit weird if I could define a class using an array or a byte buffer, but parsing classes used arrays or memory segments. Is it weird enough to justify the third API variant? I don't know.

--
- DML • he/him

From david.lloyd at redhat.com Fri Mar 21 16:09:45 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Fri, 21 Mar 2025 11:09:45 -0500
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To: <70127bf6-d919-40f8-a232-804e488653f3@oracle.com>
References: <14d200ae-768c-4014-a194-1e8188be5cae@oracle.com> <70127bf6-d919-40f8-a232-804e488653f3@oracle.com>
Message-ID:

On Fri, Mar 21, 2025 at 10:58 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> On 21/03/2025 15:44, David Lloyd wrote:
>
>> On Fri, Mar 21, 2025 at 10:25 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>>
>>> On 21/03/2025 15:17, David Lloyd wrote:
>>>
>>>> Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd just have a heap memory segment; this is probably not what the user is looking for (otherwise they'd probably just build to a byte[] to begin with). By using the user-supplied allocator, I can (for example) write the class bytes directly to a memory-mapped file.
>>>
>>> Ah! I now see that first allocator parameter -- which makes sense.
>>
>> I'd suggest to turn that into SegmentAllocator -- note that Arena is also
>> a SegmentAllocator, so you can pass either your own allocating lambda, or
>> directly use an arena which is nice (I think).
>>
>
> Ah, that's an interesting idea. The idea with using the plain function is
> that the user would explicitly be in charge of deciding all of the
> characteristics of the buffer (including, for example, alignment). I'm not
> quite sure how that shakes out (ergonomically speaking) with using a
> `SegmentAllocator`, because either I'd have to send in the alignment (which
> means, I guess, accepting it as a parameter in the `buildToMemorySegment`
> methods), or else rely on the user to override that method in their
> allocator. With a plain function, the user could always pass in
> `mySegmentAllocator::allocate` or `size ->
> mySegmentAllocator.allocate(size, 8)` or whatever. Also, with a plain
> function, I could pass one in which easily yields a subsegment of a parent
> segment e.g. `size -> parentSegment.asSlice(offset, size)`. Would it still
> be easy to do this while accepting a `SegmentAllocator`?
>
> Note: SegmentAllocator is a functional interface. So you can always pass a
> lambda to implement it -- e.g.
>
> buildToMemorySegment((size, _) -> getMeASegment(size));
>
> And, SegmentAllocator already supports sliced allocation - see
> SegmentAllocator::slicingAllocator (which creates a segment allocator from
> an existing segment and keeps slicing from it, at consecutive offsets until
> it runs out), or SegmentAllocator::prefixAllocator (which creates a segment
> allocator from an existing segment and keeps slicing from it from the start
> of the segment -- possibly overwriting each time)
>

Ah, perfect. I can make that change then.

>> ByteBuffers don't have the allocation abstraction -- so I wonder if we
>> really should let them go, and maybe only add the segment API with the
>> allocator parameter. Or maybe have a simpler API for ByteBuffer w/o
>> allocator parameter, which always allocates a heap buffer. Then if users
>> want something more complex (e.g. a mapped byte buffer) they can use the MS
>> API instead, and then wrap the MS into a BB after the fact.
>>
>
> Well, ByteBuffer has the same allocation abstraction that (for example)
> arrays do when you're calling `Collection.toArray(generator)` - that is,
> `IntFunction<ByteBuffer>`. Using this abstraction, one can basically get all
> the same benefits described above - the ability to select direct or heap, the
> ability to return a subslice of a parent buffer, etc. But I agree, it seems
> that semantically the buffer stuff is pretty much exactly redundant with
> respect to the MemorySegment variations (since we have
> `MemorySegment.asByteBuffer()` and `MemorySegment.ofBuffer()`), so I could
> see dropping it and just living with the minor asymmetry with
> `ClassLoader.defineClass(...)`.
>
> Yeah -- we could go both ways... I guess my sense is that since
> SegmentAllocator is an official thing (TM ;-) ), it has more right to be in
> the API than a "random" BB-generating function. Also, I guess what I'm
> saying is that, if you squint, the memory segment-accepting methods are
> really the primitive ones as all the others (including byte[]) can be
> derived from there.
>

OK, I think I'll go ahead and drop the `ByteBuffer` variants then.

> But, at the moment we know we can't go all the way down there...
>

Yeah, that's a pity.

--
- DML • he/him
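The redundancy argument in this message rests on `MemorySegment.asByteBuffer()` and `MemorySegment.ofBuffer()` producing views over the same memory rather than copies, and on `MemorySegment.ofArray` producing a heap segment. A minimal sketch of that behavior follows, assuming JDK 22 or later (where java.lang.foreign is final). The class and method names are illustrative only and are not taken from the PR.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;

// Illustrative class name; not part of the Class-File API or the PR.
final class BufferViewSketch {

    // A native segment viewed as a ByteBuffer: a write through the buffer
    // is visible through the segment, so no copy was made.
    static boolean segmentToBufferSharesMemory() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(4);
            ByteBuffer view = segment.asByteBuffer();
            view.put(0, (byte) 0xCA);
            return segment.get(ValueLayout.JAVA_BYTE, 0) == (byte) 0xCA;
        }
    }

    // A direct ByteBuffer viewed as a segment: a write through the segment
    // is visible through the buffer.
    static boolean bufferToSegmentSharesMemory() {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        MemorySegment view = MemorySegment.ofBuffer(direct);
        view.set(ValueLayout.JAVA_BYTE, 0, (byte) 0xFE);
        return direct.get(0) == (byte) 0xFE;
    }

    // Wrapping a byte[] gives a heap segment over the same array,
    // which is the case described as "just a heap memory segment".
    static boolean arrayWrapIsHeapView() {
        byte[] bytes = new byte[4];
        MemorySegment heap = MemorySegment.ofArray(bytes);
        heap.set(ValueLayout.JAVA_BYTE, 0, (byte) 0xBA);
        return bytes[0] == (byte) 0xBA && !heap.isNative();
    }

    public static void main(String[] args) {
        System.out.println(segmentToBufferSharesMemory());
        System.out.println(bufferToSegmentSharesMemory());
        System.out.println(arrayWrapIsHeapView());
    }
}
```

Under these assumptions, a caller holding a `ByteBuffer` can reach a segment-based method with one wrapping call, which is why a separate `ByteBuffer` variant adds little beyond symmetry with `ClassLoader.defineClass(...)`.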
From maurizio.cimadamore at oracle.com Fri Mar 21 15:25:19 2025
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 21 Mar 2025 15:25:19 +0000
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References:
Message-ID: <14d200ae-768c-4014-a194-1e8188be5cae@oracle.com>

On 21/03/2025 15:17, David Lloyd wrote:
> Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd
> just have a heap memory segment; this is probably not what the user is
> looking for (otherwise they'd probably just build to a byte[] to begin
> with). By using the user-supplied allocator, I can (for example) write
> the class bytes directly to a memory-mapped file.

Ah! I now see that first allocator parameter -- which makes sense.

I'd suggest to turn that into SegmentAllocator -- note that Arena is
also a SegmentAllocator, so you can pass either your own allocating
lambda, or directly use an arena which is nice (I think).

ByteBuffers don't have the allocation abstraction -- so I wonder if we
really should let them go, and maybe only add the segment API with the
allocator parameter. Or maybe have a simpler API for ByteBuffer w/o
allocator parameter, which always allocates a heap buffer. Then if users
want something more complex (e.g. a mapped byte buffer) they can use the
MS API instead, and then wrap the MS into a BB after the fact.

Maurizio

>
> That said, the buildTo* variants exist because Brian suggested that
> the API might be unacceptably asymmetrical otherwise. Either way, the
> part that matters the most to me is the parsing side.
>
> The idea with the API is that, while the current impl would need to
> copy to a byte array, it is at least theoretically possible that in
> the future, that may change. As a user of the byte[]-based API, I am
> already copying from my direct buffer, so it's not worse than the
> status quo. By putting the copy into the JDK, if the JDK does get
> enhanced someday to use MemorySegment internally, or maybe Unsafe, or
> whatever, then I'll get the benefit of this change. If it doesn't,
> then I'm no worse off than I am today (slightly better off actually,
> because it saves me a step going in). Supporting this use case would
> be beneficial for the same reason that it is beneficial to be able to
> define classes out of direct buffers (which has been supported since
> JDK 1.5).
>
> The thing I'm mainly in doubt about is that the ability to parse from
> or generate to byte buffers is potentially redundant with respect to
> MemorySegment. It would just be a bit weird if I could define a class
> using an array or a byte buffer, but parsing classes used arrays or
> memory segments. Is it weird enough to justify the third API variant?
> I don't know.

From david.lloyd at redhat.com Fri Mar 21 15:44:36 2025
From: david.lloyd at redhat.com (David Lloyd)
Date: Fri, 21 Mar 2025 10:44:36 -0500
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To: <14d200ae-768c-4014-a194-1e8188be5cae@oracle.com>
References: <14d200ae-768c-4014-a194-1e8188be5cae@oracle.com>
Message-ID:

On Fri, Mar 21, 2025 at 10:25 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> On 21/03/2025 15:17, David Lloyd wrote:
>
> Yes, you *could* wrap a byte[] with a MemorySegment, but then you'd just
> have a heap memory segment; this is probably not what the user is looking
> for (otherwise they'd probably just build to a byte[] to begin with). By
> using the user-supplied allocator, I can (for example) write the class
> bytes directly to a memory-mapped file.
>
> Ah! I now see that first allocator parameter -- which makes sense.
>
> I'd suggest to turn that into SegmentAllocator -- note that Arena is also
> a SegmentAllocator, so you can pass either your own allocating lambda, or
> directly use an arena which is nice (I think).
>

Ah, that's an interesting idea. The idea with using the plain function is
that the user would explicitly be in charge of deciding all of the
characteristics of the buffer (including, for example, alignment).
I'm not quite sure how that shakes out (ergonomically speaking) with using a
`SegmentAllocator`, because either I'd have to send in the alignment (which
means, I guess, accepting it as a parameter in the `buildToMemorySegment`
methods), or else rely on the user to override that method in their
allocator. With a plain function, the user could always pass in
`mySegmentAllocator::allocate` or `size ->
mySegmentAllocator.allocate(size, 8)` or whatever. Also, with a plain
function, I could pass one in which easily yields a subsegment of a parent
segment e.g. `size -> parentSegment.asSlice(offset, size)`. Would it still
be easy to do this while accepting a `SegmentAllocator`?

> ByteBuffers don't have the allocation abstraction -- so I wonder if we
> really should let them go, and maybe only add the segment API with the
> allocator parameter. Or maybe have a simpler API for ByteBuffer w/o
> allocator parameter, which always allocates a heap buffer. Then if users
> want something more complex (e.g. a mapped byte buffer) they can use the MS
> API instead, and then wrap the MS into a BB after the fact.
>

Well, ByteBuffer has the same allocation abstraction that (for example)
arrays do when you're calling `Collection.toArray(generator)` - that is,
`IntFunction<ByteBuffer>`. Using this abstraction, one can basically get all
the same benefits described above - the ability to select direct or heap, the
ability to return a subslice of a parent buffer, etc. But I agree, it seems
that semantically the buffer stuff is pretty much exactly redundant with
respect to the MemorySegment variations (since we have
`MemorySegment.asByteBuffer()` and `MemorySegment.ofBuffer()`), so I could
see dropping it and just living with the minor asymmetry with
`ClassLoader.defineClass(...)`.

--
- DML • he/him

From maurizio.cimadamore at oracle.com Fri Mar 21 15:58:31 2025
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 21 Mar 2025 15:58:31 +0000
Subject: [External] : Re: Class files in ByteBuffer
In-Reply-To:
References: <14d200ae-768c-4014-a194-1e8188be5cae@oracle.com>
Message-ID: <70127bf6-d919-40f8-a232-804e488653f3@oracle.com>

On 21/03/2025 15:44, David Lloyd wrote:
>
> On Fri, Mar 21, 2025 at 10:25 AM Maurizio Cimadamore wrote:
>
>
> On 21/03/2025 15:17, David Lloyd wrote:
>> Yes, you *could* wrap a byte[] with a MemorySegment, but then
>> you'd just have a heap memory segment; this is probably not what
>> the user is looking for (otherwise they'd probably just build to
>> a byte[] to begin with). By using the user-supplied allocator, I
>> can (for example) write the class bytes directly to a
>> memory-mapped file.
>
> Ah! I now see that first allocator parameter -- which makes sense.
> > I'd suggest to turn that into SegmentAllocator -- note that Arena > is also a SegmentAllocator, so you can pass either your own > allocating lambda, or directly use an arena which is nice (I think). > > > Ah, that's an interesting idea. The idea with using the plain function > is that the user would explicitly be in charge of deciding all of the > characteristics of the buffer (including, for example, alignment). I'm > not quite sure how that shakes out (ergonomically speaking) with using > a `SegmentAllocator`, because either I'd have to send in the alignment > (which means, I guess, accepting it as a parameter in the > `buildToMemorySegment` methods), or else rely on the user to override > that method in their allocator. With a plain function, the user could > always pass in `mySegmentAllocator::allocate` or `size -> > mySegmentAllocator.allocate(size, 8)` or whatever. Also, with a plain > function, I could pass one in which easily yields a subsegment of a > parent segment e.g. `size -> parentSegment.asSlice(offset, size)`. > Would it still be easy to do this while accepting a `SegmentAllocator`? Note: SementAllocator is a functional interface. So you can always pass a lambda to implement it -- e.g. buildToMemorySegment((size, _) -> getMeASegment(size)); And, SegmentAllocator already suports sliced allocation - see SegmentAllocator::slicingAllocator (which creates a segment allocator from an existing segment and keeps slicing from it, at consecutive offsets until it runs out), or SegmentAllocator::prefixAllocator (which creates a segment allocator from an existing segment and keeps slicing from it from the start of the segment -- possibly overwriting each time) > > ByteBuffer don't have the allocation abstraction -- so I wonder if > we really should let them go, and maybe only add the segment API > with the allocator parameter. Or maybe have a simpler API for > ByteBuffer w/o allocator parameter, which always allocates a heap > buffer. 
Then if users want something more complex (e.g. a mapped > byte buffer) they can use the MS API instead, and then wrap the MS > into a BB after the fact. > > > Well, ByteBuffer has the same allocation abstraction that (for > example) arrays do when you're calling `Collection.toArray(generator)` > - that is, `IntFunction`. Using this abstraction, one can basically > get all the same benefits described above - the ability to select > direct or heap, the ability to return a subslice of a parent buffer, > etc. But I agree, it seems that semantically the buffer stuff is > pretty much exactly redundant with respect to the MemorySegment > variations (since we have `MemorySegment.asByteBuffer()` and > `MemorySegment.ofBuffer()`), so I could see dropping it and just > living with the minor asymmetry with `ClassLoader.defineClass(...)`. Yeah -- we could go both ways... I guess my sense is that since SegmentAllocator is an official thing (TM ;-) ), it has more right to be in the API than a "random" BB-generating function. Also, I guess what I'm saying is that, if you squint, the memory segment-accepting methods are really the primitve ones as all the others (including byte[]) can be derived from there. But, at the moment we know we can't go all the way down there... Maurizio > Maurizio > > >> >> That said, the buildTo* variants exist because Brian suggested >> that the API might be unacceptably asymmetrical otherwise. Either >> way, the part that matters the most to me is the parsing side. >> >> The idea with the API is that, while the current impl would need >> to copy to a byte array, it is at least theoretically possible >> that in the future, that may change. As a user of the >> byte[]-based API, I am already copying from my direct buffer, so >> it's not worse than the status quo. By putting the copy into the >> JDK, if the JDK does get enhanced someday to use MemorySegment >> internally, or maybe Unsafe, or whatever, then I'll get the >> benefit of this change. 
If it doesn't, then I'm no worse off than >> I am today (slightly better off actually, because it saves me a >> step going in). Supporting this use case would be beneficial for >> the same reason that it is beneficial to be able to define >> classes out of direct buffers (which has been supported since JDK >> 1.5). >> >> The thing I'm mainly in doubt about is that the ability to parse >> from or generate to byte buffers is potentially redundant with >> respect to MemorySegment. It would just be a bit weird if I could >> define a class using an array or a byte buffer, but parsing >> classes used arrays or memory segments. Is it weird enough to >> justify the third API variant? I don't know. >> >> On Fri, Mar 21, 2025 at 9:27?AM Maurizio Cimadamore >> wrote: >> >> I looked at the PR and I had a similar reaction as Adam. >> >> It seems like the buildToXYZ methods have one less copy, >> but... if the API returns a byte[], you can always wrap the >> byte[] as either a MemorySegment (MemorySegment::ofArrays) or >> a ByteBuffer (ByteBuffer::wrap). These methods do _not_ copy. >> >> The way I see it, is that this PR doesn't remove the need to >> copy on the way in (that would be a much more complex >> change), and it doesn't improve things significantly _on the >> way out_ (because, that side is already covered by existing >> API/views support). >> >> So, while I'm not strongly opposed to the changes in the PR, >> at the same time I don't see a lot of added value in the new >> methods. At the same time I'm a bit skeptical to commit to >> the new API in without doing the full exercise to see how >> these methods might be implemented. >> >> A good motto in API design is "when in doubt, leave it out" >> -- is this one of those cases? >> >> Maurizio >> >> >> >> On 21/03/2025 13:34, David Lloyd wrote: >>> The idea is that in the future, it may be possible to do >>> these things without the extra copy. 
In the PR, I found that >>> we can already build to memory segments and byte buffers >>> without more copies than what we're doing for arrays. On the >>> parsing side, we can already sometimes work without copying >>> in some cases that the user won't have access to (e.g. >>> accessing the backing array of a memory segment, even if >>> it's read-only). It's not hard to imagine that we could >>> possibly have a way to parse without the extra copy in the >>> native memory case in the future, without impacting current >>> performance on arrays. But without the API support, it can >>> never be possible. >>> >>> On Fri, Mar 21, 2025 at 8:26?AM Adam Sotona >>> wrote: >>> >>> I?m more thinking that the API already provides all the >>> important entries and conversion from and to >>> `MemorySegment` can be done by simple call of >>> `MemorySegment::toArray` and `MemorySegment::ofArray`. >>> >>> *From: *David Lloyd >>> *Date: *Friday, 21 March 2025 at 13:37 >>> *To: *Adam Sotona >>> *Cc: *classfile-api-dev at openjdk.org >>> >>> *Subject: *[External] : Re: Class files in ByteBuffer >>> >>> Please have a look at the PR. If you feel the API >>> surface has grown too much, maybe removing the >>> `ByteBuffer` variants is a logical step, since users can >>> always wrap a `ByteBuffer` with a `MemorySegment`? If >>> you could comment on the PR if you feel that to be the >>> case, I would appreciate it. >>> >>> Thanks. >>> >>> On Thu, Mar 20, 2025 at 4:09?PM Adam Sotona >>> wrote: >>> >>> I?m sorry to join the discussion a bit late. >>> >>> Here are the points to consider: >>> >>> * Class-File API is implementation is after many >>> rounds of performance optimizations purely based >>> on byte arrays. >>> * Internal use of ByteBuffer has been removed from >>> the implementation, as it caused significant JDK >>> bootstrap performance regression. >>> * Enormous amount of work has been spent on the >>> API surface reduction and removal of all >>> unnecessary ?conveniences?. 
>>>
>>> Adam
>>>
>>> *From:* classfile-api-dev on behalf of David Lloyd
>>> *Date:* Thursday, 20 March 2025 at 21:11
>>> *To:* classfile-api-dev at openjdk.org
>>> *Subject:* Re: Class files in ByteBuffer
>>>
>>> I've opened a bug [1] and pull request [2] incorporating this discussion (more or less). I've implemented support for both `MemorySegment` and `ByteBuffer`, but this could be revisited if it doesn't look OK. The implementation is not terribly invasive for now, only grabbing a few low-hanging optimizations.
>>>
>>> [1] https://bugs.openjdk.org/browse/JDK-8352536
>>> [2] https://github.com/openjdk/jdk/pull/24139
>>>
>>> On Mon, Mar 10, 2025 at 12:38 PM David Lloyd wrote:
>>>
>>> When defining a class in the JDK, one may either use a byte array or a byte buffer to hold the contents of the class. The latter is useful when (for example) a JAR file containing uncompressed classes is mapped into memory. Thus, some class loaders depend on this form of the API for class definition.
>>>
>>> If I were to supplement such a class loader with a class transformation step based on the class file API, I would have to copy the bytes of each class on to the heap as a byte[] before I could begin parsing it. This is potentially expensive, and definitely awkward.
>>>
>>> After transformation, it doesn't really matter if you have a byte[] or ByteBuffer because either way, the class can be defined directly.
>>>
>>> It would be nice if the class file parser could accept either a byte[] or a ByteBuffer. I did a quick bit of exploratory work and it looks like porting the code to read from a ByteBuffer instead of a byte[] (using ByteBuffer.wrap() for the array case) would be largely straightforward *except* for the code which parses UTF-8 constants into strings.
>>> Also there could be some small performance differences (maybe positive, maybe negative) depending on how the buffer is accessed.
>>>
>>> Is this something that might be considered?
>>>
>>> --
>>> - DML • he/him
>>>
>>> --
>>> - DML • he/him
>>>
>>> --
>>> - DML • he/him
>>>
>>> --
>>> - DML • he/him
>>
>> --
>> - DML • he/him
>
> --
> - DML • he/him
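
For readers following the thread, the two copy costs under discussion can be illustrated with a minimal sketch that uses only `java.nio.ByteBuffer`. It is not code from the PR: the class name `BufferViews` and the helpers `wrapOutput` and `copyToHeap` are hypothetical, chosen for illustration. The first helper shows the "way out" case Maurizio describes, where `ByteBuffer.wrap` yields a view over an existing `byte[]` without copying (`MemorySegment.ofArray` behaves analogously). The second shows the "way in" copy David wants to avoid, where the bytes of a buffer (for example a direct or mapped one) must be moved onto the heap before a `byte[]`-based parser can read them.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of the JDK or of the PR.
class BufferViews {
    /** Way out: wrap generated class bytes as a buffer. No copy is made. */
    public static ByteBuffer wrapOutput(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes);
    }

    /** Way in: copy a buffer's remaining bytes onto the heap for a byte[]-based parser. */
    public static byte[] copyToHeap(ByteBuffer source) {
        // duplicate() shares content but has its own position and limit,
        // so the caller's buffer is left untouched.
        ByteBuffer dup = source.duplicate();
        byte[] copy = new byte[dup.remaining()];
        dup.get(copy);
        return copy;
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};

        // The wrapped buffer is a view: a write to the array shows through it.
        ByteBuffer view = wrapOutput(bytes);
        bytes[0] = 0;
        System.out.println("view sees array write: " + (view.get(0) == 0));

        // A direct buffer stands in for a memory-mapped JAR entry here.
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes).flip();
        byte[] heapCopy = copyToHeap(direct);

        // The heap copy is independent of the source buffer.
        direct.put(0, (byte) 1);
        System.out.println("copy is independent: " + (heapCopy[0] == 0));
    }
}
```

The asymmetry is the point of the disagreement: wrapping on the way out is already free, so new build-to-buffer methods add little, while the copy on the way in can only be removed by changing what the parser reads from.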