From John.Rose at Sun.COM Fri Mar 14 16:47:37 2008
From: John.Rose at Sun.COM (John Rose)
Date: Fri, 14 Mar 2008 16:47:37 -0700
Subject: Hello, and other things
In-Reply-To: <47C8A8EE.8000809@sun.com>
References: <47C8A8EE.8000809@sun.com>
Message-ID: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>

On Feb 29, 2008, at 4:53 PM, Jason Fordham wrote:
> I started thinking about targeting GCC for the JVM last week.

That's a neat project!

I have heard of JVMs being used to simulate very small assembly-level
systems, on the order of 16-bit computers. The challenges with this come
from building in a second level of virtualization. The execution of the
simulated unsafe CPU is hard to integrate with the JVM's libraries.

> It quickly became clear that the JVM instruction set is designed to make
> the C programming model difficult: the separation of bytecodes, stacks,
> frames, and object space, and the generally unconvertible addressType
> quickly led me to a model where the JVM stacks are ignored except for
> primitive operations, while memory - for data, bss and heap - is modeled
> in a large array. In order to model C's function calls by pointer, I
> figured a handle pair, class and method, hashing the strings, with a
> linking stage after compilation to perform fixup - much as I imagine
> slide 17 in the LangNet presentation implies.

I agree that method handles will help with this sort of thing.

The hard part, though, is the essentially untyped nature of C memory.
I've seen C implementations that run over typed heaps, but they
are artful compromises, rather than simple ports to a new backend.
Centerline C and Zeta-C come to mind. (Both are old projects that
may pre-date the Google cache. I don't have references handy.)

The latter was a C compiler for the Symbolics Lisp Machine which
used ordered pairs (cons cells) for all C pointers, to represent the
combination of a base address and an arbitrary offset.
A similar product was Bounds-Check C, which widened
pointers into little 3-tuples (min, max, cur). The idea is
that a tuple-based pointer will never be allowed to "reach
beyond" the heap object it was created for; such operations
are always indeterminate, since there is no guaranteed
distance (or ordering) of heap objects, from one instruction
to the next, in a system like the Symbolics with a powerful GC.

That would work very nicely on the JVM also. You could use
the sun.misc.Unsafe API (with great care!) to handle punning
among memory-resident primitive types. You must avoid
using Unsafe to pun between primitives and references, because
there is absolutely no way to control when the GC might want
to move things around underneath your code.

> The key obstacles I see are that the instruction set makes implementing
> a C-like stack expensive: there are no neat push and pop operations for
> this memory model, it feels like microcoding. Though I understand the
> motivation, which is to protect the bytecodes from malicious or lazy use
> of buffer overflows, and other mechanisms for executing data.

The stack is really just a shorthand for operand renaming.
Feel free to generate code to a register-to-register machine,
and map your virtual registers to JVM locals.

> I like the method handle mechanism, for a variety of reasons, and I
> would like to see some easing up on where the stack is located so that
> operations which index into the stack are more flexible, and fast. Is
> this possible?

If you need a memory-resident stack, you can just build an array
to hold it, can't you? I'm not sure where the pain point is here, yet.
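The array-resident stack suggested above might look roughly like the
following. This is only a sketch under stated assumptions: memory is a
word-addressed int[], the stack grows downward, and every name here
(ArrayStack, enter, leave, addressOfLocal) is hypothetical and does not
come from any project mentioned in this thread.

```java
// Hypothetical sketch: a C-style call stack kept in a plain int[] "memory".
final class ArrayStack {
    private final int[] mem; // untyped memory, one 32-bit word per element (assumption)
    private int sp;          // stack pointer: index of the lowest used word
    private int bp;          // base (frame) pointer of the current frame

    ArrayStack(int words) {
        mem = new int[words];
        sp = words;          // an empty stack starts just past the top of memory
        bp = words;
    }

    void push(int value) { mem[--sp] = value; }

    int pop() { return mem[sp++]; }

    // Function prologue: save the caller's frame pointer, reserve local slots.
    void enter(int localWords) {
        push(bp);
        bp = sp;
        sp -= localWords;
    }

    // Function epilogue: drop the locals, restore the caller's frame pointer.
    void leave() {
        sp = bp;
        bp = pop();
    }

    // "Address" of local slot i in the current frame. It is just an index
    // into mem, so it can be stored and passed around like a C pointer.
    int addressOfLocal(int i) { return bp - 1 - i; }

    int load(int address) { return mem[address]; }

    void store(int address, int value) { mem[address] = value; }
}
```

Because a local's address is an ordinary index into the same array, a
pointer to a stack object and a pointer to a heap object would be
interchangeable in such a model. Each push or pop is a few bytecodes plus
an array bounds check, and how well a JIT optimizes that is not something
this sketch can show.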
Best wishes, -- John From Kenneth.Russell at Sun.COM Fri Mar 14 16:53:36 2008 From: Kenneth.Russell at Sun.COM (Kenneth Russell) Date: Fri, 14 Mar 2008 16:53:36 -0700 Subject: Hello, and other things In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com> References: <47C8A8EE.8000809@sun.com> <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com> Message-ID: <47DB1000.7030806@sun.com> Quick pointer to a project a co-worker told me about a while back: http://www.xwt.org/mips2java/ http://www.thisiscool.com/mips2java.htm -Ken John Rose wrote: > On Feb 29, 2008, at 4:53 PM, Jason Fordham wrote: > >> I started thinking about targeting GCC for the JVM last week. > > That's a neat project! > > I have heard of JVMs being used to simulate very small assembly-level > systems, > on the order of 16-bit computers. The challenges with this come from > building > in a second level of virtualization. The execution of the simulated > unsafe > CPU is hard to integrate with the JVM's libraries. > >> It quickly became clear that the JVM instruction set is designed to >> make >> the C programming model difficult: the separation of bytecodes, >> stacks, >> frames, and object space, and the generally unconvertible addressType >> quickly led me to a model where the JVM stacks are ignored except for >> primitive operations, while memory - for data, bss and heap - is >> modeled >> in a large array. In order to model C's function calls by pointer, I >> figured a handle pair, class and method, hashing the strings, with a >> linking stage after compilation to perform fixup - much as I imagine >> slide 17 in the LangNet presentation implies. > > I agree that method handles will help with this sort of thing. > > The hard part, though, is the essentially untyped nature of C memory. > I've seen C implementations that run over typed heaps, but they > are artful compromises, rather than simple ports to a new backend. > Centerline C and Zeta-C come to mind. 
(Both are old projects, that > may pre-date the Google cache. I don't have references handy.) > > The latter was a C compiler for the Symbolic Lisp Machine which > used ordered pairs (cons cells) for all C pointers, to represent the > combination of a base address and an arbitrary offset. > A similar product was Bounds-Check C, which widened > pointers into little 3-tuples (min, max, cur). The idea is > that a tuple-based pointer will never be allowed to "reach > beyond" the heap object it was created for; such operations > are always indeterminate, since there is no guaranteed > distance (or ordering) of heap objects, from one instruction > to the next, in a system like the Symbolics with a powerful GC. > > That would work very nicely on the JVM also. You could use > the sun.misc.Unsafe API (with great care!) to handle punning > among memory-resident primitive types. You must avoid > using Unsafe to pun between primitives and references, because > there is absolutely no way to control when the GC might want > to move things around underneath your code. > >> The key obstacles I see are that the instruction set makes >> implementing >> a C-like stack expensive: there are no neat push and pop operations >> for >> this memory model, it feels like microcoding. Though I understand the >> motivation, which is to protect the bytecodes from malicious or >> lazy use >> of buffer overflows, and other mechanisms for executing data. > > The stack is really just a shorthand for operand renaming. > Feel free to generate code to a register-to-register machine, > and map your virtual registers to JVM locals. > >> I like the method handle mechanism, for a variety of reasons, and I >> would like to see some easing up on where the a stack is located so >> that >> operations which index into the stack are more flexible, and fast. Is >> this possible? > > If you need a memory-resident stack, you can just build an array > to hold it, can't you? 
I'm not sure where the pain point is here, yet. > > Best wishes, > -- John > > _______________________________________________ > mlvm-dev mailing list > mlvm-dev at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev From John.Rose at Sun.COM Fri Mar 14 17:39:00 2008 From: John.Rose at Sun.COM (John Rose) Date: Fri, 14 Mar 2008 17:39:00 -0700 Subject: Hello, and other things In-Reply-To: <47DB1000.7030806@sun.com> References: <47C8A8EE.8000809@sun.com> <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com> <47DB1000.7030806@sun.com> Message-ID: Thanks, Ken! The MIPS is the one I was trying to remember. I had forgotten (or my brain refused to remember) the startling fact that it was a 32-bit system, with a software page table. -- John On Mar 14, 2008, at 4:53 PM, Kenneth Russell wrote: > Quick pointer to a project a co-worker told me about a while back: > > http://www.xwt.org/mips2java/ > http://www.thisiscool.com/mips2java.htm > > -Ken > > John Rose wrote: >> I have heard of JVMs being used to simulate very small assembly-level >> systems, >> on the order of 16-bit computers. From pdoubleya at gmail.com Sat Mar 15 01:52:08 2008 From: pdoubleya at gmail.com (Patrick Wright) Date: Sat, 15 Mar 2008 09:52:08 +0100 Subject: Hello, and other things In-Reply-To: <47DB1000.7030806@sun.com> References: <47C8A8EE.8000809@sun.com> <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com> <47DB1000.7030806@sun.com> Message-ID: <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com> I think I read mips2java is now NestedVM; http://nestedvm.ibex.org/ and http://wiki.brianweb.net/NestedVM/NestedVM. (actually, seems like the same guy works on a JVM backend for GHC called LambdaVM). There's also Cibyl, http://spel.bth.se/index.php/Cibyl, "Cibyl is a programming environment and binary translator that allows compiled C programs to execute on J2ME-capable phones. 
Cibyl uses GCC to compile the C programs to MIPS binaries, and these are
then recompiled into Java bytecode"

This page,
http://www.answers.com/topic/c-to-java-byte-code-compiler?cat=technology,
has some links, including one to a research paper from Dartmouth on the
topic.

Regards
Patrick

From pdoubleya at gmail.com Sat Mar 15 02:20:23 2008
From: pdoubleya at gmail.com (Patrick Wright)
Date: Sat, 15 Mar 2008 10:20:23 +0100
Subject: Hello, and other things
In-Reply-To: <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
References: <47C8A8EE.8000809@sun.com>
 <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
 <47DB1000.7030806@sun.com>
 <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
Message-ID: <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>

And here's a research paper on optimizations done in Cibyl, which might
be interesting to you as well:
http://www.ipd.bth.se/ska/phd-cibyl-performance.pdf

From the abstract,

"This paper presents the optimization framework used by Cibyl to provide
compact and well-performing translated code. Cibyl optimizes expensive
multiplications/divisions, floating point support, function co-location
to Java methods and provides a peephole optimizer. The paper also
evaluates Cibyl performance both in a real-world GPS navigation
application where the optimizations increase display update frequency
with around 15% and a comparison against native Java and the NestedVM
binary translator where we show that Cibyl can provide significant
advantages for common code patterns [...] high-level Java code) might
not be a good match for the compiler structure. The general design of
Cibyl has been described in an earlier paper [8], and this paper focuses
on optimizations made to reduce the size and improve the performance of
the translated binaries.
The optimizations we employ for Cibyl share some similarities with
regular compiler optimizations, e.g., use of function inlining and
constant propagation, but is also significantly different. Since the GCC
compiler has already optimized the high-level C code, the goal of the
Cibyl binary translator is to make the translation into Java bytecode".

Sorry if this is off-topic for the list, seems related to Jason's
original question on the thread.

Regards
Patrick

> There's also Cibyl, http://spel.bth.se/index.php/Cibyl, "Cibyl is a
> programming environment and binary translator that allows compiled C
> programs to execute on J2ME-capable phones. Cibyl uses GCC to compile
> the C programs to MIPS binaries, and these are then recompiled into
> Java bytecode"

From John.Rose at Sun.COM Sat Mar 15 11:20:14 2008
From: John.Rose at Sun.COM (John Rose)
Date: Sat, 15 Mar 2008 11:20:14 -0700
Subject: Hello, and other things
In-Reply-To: <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>
References: <47C8A8EE.8000809@sun.com>
 <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
 <47DB1000.7030806@sun.com>
 <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
 <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>
Message-ID: <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com>

Thanks for the excellent references.
Since this list is archived[1], they are now
bookmarked for us.

On Mar 15, 2008, at 2:20 AM, Patrick Wright wrote:
> Sorry if this is off-topic for the list, seems related to Jason's
> original question on the thread.

It's on-topic because such work may
expose pain points[2] in the JVM for compiler
back ends in general. E.g., a botched JIT
optimization forcing back end complexity
for a C compiler would probably count as
a pain point. From a quick scan of your first
reference, I don't see any yet. They probably
have a lot more work to do moving their
backend output closer to the JVM.
For example, most C pointers can probably
be rendered as offsets plus a base of a
Java object or array. This requires a
big pointer analysis, plus oracular advice
from the user, but I think it would pay off.

For a low-level account of JIT optimizations,
see (and as you make discoveries contribute to)
this wiki:
http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques
http://wikis.sun.com/display/HotSpotInternals/

Best wishes,
-- John

[1] http://mail.openjdk.java.net/pipermail/mlvm-dev/
[2] http://openjdk.java.net/projects/mlvm/pdf/LangNet20080128.pdf

From Jason.Fordham at Sun.COM Sun Mar 16 08:54:19 2008
From: Jason.Fordham at Sun.COM (Jason Fordham)
Date: Sun, 16 Mar 2008 08:54:19 -0700
Subject: Hello, and other things
In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
References: <47C8A8EE.8000809@sun.com>
 <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
Message-ID: <47DD42AB.9050602@sun.com>

Hi John,

On 3/14/2008 4:47 PM, John Rose wrote:
> The hard part, though, is the essentially untyped nature of C memory.
> I've seen C implementations that run over typed heaps, but they
> are artful compromises, rather than simple ports to a new backend.
> Centerline C and Zeta-C come to mind. (Both are old projects that
> may pre-date the Google cache. I don't have references handy.)
>
It seems to me that the ability of (machineRadix *) pointers to overrun
- above and below - the arrays they were based on is a feature of C. The
memory model I'm proposing makes it possible to leverage the existing
code generation models, and the libraries.

> The latter was a C compiler for the Symbolics Lisp Machine which
> used ordered pairs (cons cells) for all C pointers, to represent the
> combination of a base address and an arbitrary offset.
> A similar product was Bounds-Check C, which widened > pointers into little 3-tuples (min, max, cur). The idea is > that a tuple-based pointer will never be allowed to "reach > beyond" the heap object it was created for; such operations > are always indeterminate, since there is no guaranteed > distance (or ordering) of heap objects, from one instruction > to the next, in a system like the Symbolics with a powerful GC. > > While I understand that many C programmers have a secret wish that the GC in GCC could stand for Garbage Collection, it doesn't: I think that it's OK to avoid the Java GC; philosophically, I regard the ability to leave malloced objects on the heap without references to them as a C "feature", just like buffer over/underruns. > That would work very nicely on the JVM also. You could use > the sun.misc.Unsafe API (with great care!) to handle punning > among memory-resident primitive types. You must avoid > using Unsafe to pun between primitives and references, because > there is absolutely no way to control when the GC might want > to move things around underneath your code. > > I hadn't come across this before, and it doesn't seem to have any documentation! Given your limited description of the features, it sounds as though it would be very easy to leave a gap where the compiler could be used to break Java protection, which I would not want to do. >> The key obstacles I see are that the instruction set makes >> implementing >> a C-like stack expensive: there are no neat push and pop operations >> for >> this memory model, it feels like microcoding. Though I understand the >> motivation, which is to protect the bytecodes from malicious or >> lazy use >> of buffer overflows, and other mechanisms for executing data. >> > > The stack is really just a shorthand for operand renaming. > Feel free to generate code to a register-to-register machine, > and map your virtual registers to JVM locals. 
> > Again, I'm inclined to retain the classic stack-based calling pragma in the memory model, because it makes it trivial to construct and manipulate pointers to C objects allocated in the local frame - they're the same as pointers to objects on the heap, because they're in the same untyped array - machineRadix[] memory. >> I like the method handle mechanism, for a variety of reasons, and I >> would like to see some easing up on where the a stack is located so >> that >> operations which index into the stack are more flexible, and fast. Is >> this possible? >> > > If you need a memory-resident stack, you can just build an array > to hold it, can't you? I'm not sure where the pain point is here, yet. > > Stack operations - manipulating and indexing the BP and SP - will be frequent multi-bytecode operations. I don't know how well the JIT compiler will work out what's going on. Jason From Jason.Fordham at Sun.COM Sun Mar 16 09:28:17 2008 From: Jason.Fordham at Sun.COM (Jason Fordham) Date: Sun, 16 Mar 2008 09:28:17 -0700 Subject: Hello, and other things In-Reply-To: <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com> References: <47C8A8EE.8000809@sun.com> <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com> <47DB1000.7030806@sun.com> <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com> <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com> <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com> Message-ID: <47DD4AA1.8060709@sun.com> John, Thanks for the JIT information! The two things I need to come up with a design for now are how to get GCC to generate .class files, and what the runtime setup code needs to do. The former is a big task, because I need to have a fairly detailed design for the calling protocol. But that's all going to have to wait for later... Jason On 3/15/2008 11:20 AM, John Rose wrote: > Thanks for the excellent references. > Since this list is archived[1], they are now > bookmarked for us. 
> > On Mar 15, 2008, at 2:20 AM, Patrick Wright wrote: >> Sorry if this is off-topic for the list, seems related to Jason's >> original question on the thread. > > It's on-topic because of the such work may > expose pain points[2] in the JVM for compiler > back ends in general. E.g., a botched JIT > optimization forcing back end complexity > for a C compiler would probably count as > a point point. From a quick scan of your first > reference, I don't see any yet. They probably > have a lot more work to do moving their > backend output closer to the JVM. > For example, most C pointers can probably > be rendered as offsets plus a base of a > Java objects or array. This requires a > big pointer analysis, plus oracular advice > from the user, but I think it would pay off. > > For a low-level account of JIT optimizations, > see (and as you make discoveries contribute to) > this wiki: > http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques > http://wikis.sun.com/display/HotSpotInternals/ > > Best wishes, > -- John > > [1] http://mail.openjdk.java.net/pipermail/mlvm-dev/ > [2] http://openjdk.java.net/projects/mlvm/pdf/LangNet20080128.pdf > ------------------------------------------------------------------------ > > _______________________________________________ > mlvm-dev mailing list > mlvm-dev at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From forax at univ-mlv.fr Sun Mar 16 09:56:30 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sun, 16 Mar 2008 17:56:30 +0100
Subject: Hello, and other things
In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
References: <47C8A8EE.8000809@sun.com>
 <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
Message-ID: <47DD513E.3000701@univ-mlv.fr>

John Rose wrote:
...
> That would work very nicely on the JVM also. You could use
> the sun.misc.Unsafe API (with great care!) to handle punning
> among memory-resident primitive types. You must avoid
> using Unsafe to pun between primitives and references, because
> there is absolutely no way to control when the GC might want
> to move things around underneath your code.
>
Yep, the last time I tried to read a reference as an int (a kind of
fast hashcode), the construction of the JIT IR (which is typed) crashed
before the GC was triggered.

Rémi

From simon.kagstrom at gmail.com Sun Mar 16 11:56:51 2008
From: simon.kagstrom at gmail.com (Simon Kagstrom)
Date: Sun, 16 Mar 2008 19:56:51 +0100
Subject: Some words about Cibyl (MIPS to Java bytecode binary translation)
Message-ID: <20080316195651.0edc5fc7@gmail.com>

Hello!

I'm the author of Cibyl, which translates MIPS binaries into Java
bytecode. Patrick Wright pointed me to this list and the discussion
about compiling C into Java bytecode (thanks!), so I thought I'd share
some comments about how this is done in Cibyl. Most of it is also
applicable to NestedVM, which does essentially the same thing with a set
of implementation differences. NestedVM also predates Cibyl, so the
origin of the idea should be attributed to them. Cibyl targets
portability of C and C++ applications to J2ME devices, so it also
provides an interface to the MIDP API.

The translation is fairly straight-forward.
Cibyl depends on GCC to generate an ELF binary (with symbol and
relocation information intact), and the translation is done with a 1-1
mapping between C functions (call destinations in the ELF binary) and
static Java methods in a class. Most MIPS instructions can be translated
pretty much 1-1 to Java bytecode.

NestedVM does this a bit differently and does not have the 1-1 mapping.
Both methods have benefits and disadvantages. With the NestedVM
approach, it's easier to support e.g., longjmp, while the Cibyl approach
makes the class look more like a "real" Java class, for example in crash
dumps or profilers. From benchmarks I've made, the Cibyl approach also
seems easier to achieve good performance with, mostly because it always
uses Java local variables for the MIPS register representation
throughout.

So to the interesting part :-). While implementing the translation has
mostly been pretty straight-forward, there are two cases where Java
bytecode poses some problems:

* The 64KB method size limit, which is perhaps the largest issue. If
  the bytecode had not had this limitation, the translation would be
  done to a single method, which would improve performance and simplify
  the implementation quite a bit. Cibyl also allows co-locating
  multiple C functions in a single Java method, which can improve
  performance quite a bit.

  This is of course also a problem with very big C functions. In
  practice, it has only been a problem in one application so far (the
  fetch-and-decode loop of an emulator). Cibyl currently does not handle
  this situation automatically, and I guess this would also be an issue
  for a JBC compiler backend.

* Untyped memory, which I also saw you took up. In Cibyl, I've used a
  big int-array as the "memory" representation. This fits MIPS quite
  well, since unaligned memory access is limited to special
  instructions, and most accesses tend to be 32-bit accesses. However,
  when 8- or 16-bit loads and stores are done there is a significant
  performance hit because of this.
Since Cibyl targets embedded (J2ME) devices, it will just allocate a fixed amount of memory for the C program at startup (for stack/heap). NestedVM targets other systems and uses a two-level structure that allows a sparse memory layout. Obviously there are also some MIPS instructions which are a bit tricky to translate, but that's not really the fault of JBC. So if I could have one wish for Java bytecode, it would be to lift the 64KB method size limit (I'm pretty sure the NestedVM developers agree with this). I understand that the type-safety will not be lifted, so I guess that untyped memory will be a problem for any C backend. Sorry for the long mail :-). I'll follow Jason's work on a Java GCC backend, that would be quite nice to have. I guess you are also familiar with LLVM, which perhaps could be an easier starting point than plain GCC? -- // Simon From reachbach at gmail.com Sun Mar 16 19:50:08 2008 From: reachbach at gmail.com (Bharath Ravi Kumar) Date: Mon, 17 Mar 2008 08:20:08 +0530 Subject: mlvm-dev Digest, Vol 4, Issue 2 In-Reply-To: References: Message-ID: <76b5ba080803161950u10c20e28r3be3ce31d89e23a1@mail.gmail.com> John, Looks like the link to the hotspot compiler wiki is broken. I got a 404 - http://wikis.sun.com/display/HotSpotInternals/Compiler -Bharath Date: Sat, 15 Mar 2008 11:20:14 -0700 > From: John Rose > Subject: Re: Hello, and other things > To: Patrick Wright > Cc: mlvm-dev at openjdk.java.net > Message-ID: <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B at sun.com> > Content-Type: text/plain; charset="us-ascii" > > Thanks for the excellent references. > Since this list is archived[1], they are now > bookmarked for us. > > On Mar 15, 2008, at 2:20 AM, Patrick Wright wrote: > > Sorry if this is off-topic for the list, seems related to Jason's > > original question on the thread. > > It's on-topic because of the such work may > expose pain points[2] in the JVM for compiler > back ends in general. 
E.g., a botched JIT > optimization forcing back end complexity > for a C compiler would probably count as > a point point. From a quick scan of your first > reference, I don't see any yet. They probably > have a lot more work to do moving their > backend output closer to the JVM. > For example, most C pointers can probably > be rendered as offsets plus a base of a > Java objects or array. This requires a > big pointer analysis, plus oracular advice > from the user, but I think it would pay off. > > For a low-level account of JIT optimizations, > see (and as you make discoveries contribute to) > this wiki: > http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques > http://wikis.sun.com/display/HotSpotInternals/ > > Best wishes, > -- John > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/mlvm-dev/attachments/20080317/3eb57d80/attachment.html From John.Rose at Sun.COM Sun Mar 16 23:02:37 2008 From: John.Rose at Sun.COM (John Rose) Date: Sun, 16 Mar 2008 23:02:37 -0700 Subject: Some words about Cibyl (MIPS to Java bytecode binary translation) In-Reply-To: <20080316195651.0edc5fc7@gmail.com> References: <20080316195651.0edc5fc7@gmail.com> Message-ID: <570F7E62-9324-4B9C-9053-3781476D2A59@sun.com> On Mar 16, 2008, at 11:56 AM, Simon Kagstrom wrote: > ...in crash dumps or profilers. From benchmarks I've made, the Cibyl > approach also seems easier to achive good performance with, mostly > because it always uses Java local variables for the MIPS register > representation throughout. By the same token, there would be benefits to raising C structs where possible into Java objects. (It probably requires user advice, as I noted before.) The int-array memory model would be used only for "hard cases". I think there is probably some analysis that could be made, supported by user annotations, that for most C programs, would allow a majority of the data structures to go into real Java objects. 
The nested VM could keep a type profile at every indirection operation to direct its translation of base+offset to fields. > * The 64KB method size limit, which is perhaps the largest issue. If > the bytecode had not had this limitation, the translation would be > done to a single method, which would improve performance and > simplify > the implementation quite a bit. Cibyl also allows co-locating > multiple C functions in a single Java method, which can improve > performance quite a bit. This is an interesting problem. It would be a good MLVM project. The messy part is finding all the structs with 16-bit offsets and making 32-bit versions available. It's so messy, I think, that people have wanted to wait for a major format revision. (I think basing offsets on the Pack200 UNSIGNED5 format would make for a better revolutionary change, better than adding 32-bit twin structures. Regarding Twins--suddenly I think of DeVito and Schwarzenegger.) > * Untyped memory, which I also saw you took up. In Cibyl, I've used a > big int-array as the "memory" representation. This fits MIPS quite > well, since unaligned memory access is limited to special > instructions, and most accesses tend to be 32-bit accesses. However, > when 8- or 16-bit loads and stores are done there is a significant > performance hit because of this. Yes, that is a nice fit. It's an amazing application of a (very) RISC ISA, to an execution platform realized in software not silicon. 
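The sub-word cost mentioned in the quoted text can be illustrated with a
small sketch. It assumes little-endian byte order within each 32-bit
word and uses hypothetical names (IntMemory, loadByteUnsigned,
storeByte); it is not taken from Cibyl or NestedVM, whose generated code
may well differ.

```java
// Hypothetical sketch: byte-addressed "C memory" backed by an int[].
final class IntMemory {
    private final int[] words;

    IntMemory(int sizeInBytes) {
        words = new int[(sizeInBytes + 3) >>> 2]; // round up to whole words
    }

    // Aligned 32-bit access is a single array operation.
    int loadWord(int addr) { return words[addr >>> 2]; }

    void storeWord(int addr, int value) { words[addr >>> 2] = value; }

    // An 8-bit load needs an extra shift and mask
    // (assumption: little-endian layout inside each word).
    int loadByteUnsigned(int addr) {
        int shift = (addr & 3) << 3;
        return (words[addr >>> 2] >>> shift) & 0xFF;
    }

    // An 8-bit store is a read-modify-write of the containing word.
    void storeByte(int addr, int value) {
        int index = addr >>> 2;
        int shift = (addr & 3) << 3;
        int cleared = words[index] & ~(0xFF << shift);
        words[index] = cleared | ((value & 0xFF) << shift);
    }
}
```

A word access touches the array once, while a byte store has to read,
mask, merge and write back, which is one plausible source of the
performance hit described for 8- and 16-bit accesses.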
Best, -- John From simon.kagstrom at gmail.com Mon Mar 17 13:38:24 2008 From: simon.kagstrom at gmail.com (Simon Kagstrom) Date: Mon, 17 Mar 2008 21:38:24 +0100 Subject: Some words about Cibyl (MIPS to Java bytecode binary translation) In-Reply-To: <570F7E62-9324-4B9C-9053-3781476D2A59@sun.com> References: <20080316195651.0edc5fc7@gmail.com> <570F7E62-9324-4B9C-9053-3781476D2A59@sun.com> Message-ID: <20080317213824.1aa1e1be@lska2> Hi again, On Sun, 16 Mar 2008 23:02:37 -0700 John Rose wrote: > > * Untyped memory, which I also saw you took up. In Cibyl, I've used > > a big int-array as the "memory" representation. This fits MIPS > > quite well, since unaligned memory access is limited to special > > instructions, and most accesses tend to be 32-bit accesses. > > However, when 8- or 16-bit loads and stores are done there is a > > significant performance hit because of this. > Yes, that is a nice fit. It's an amazing application of a (very) > RISC ISA, to an execution platform realized in software not silicon. Yes, this was the main reason why I selected MIPS for this. It's a beautiful instruction set :-) I actually forgot one obstacle which requires some trickery in Cibyl: Register-indirect branches and calls. Since Java bytecode doesn't allow computed gotos, I use a generated "call table" for method calls and a method-local jump table for local computed gotos. 
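The "call table" idea described above could be sketched as follows. This
is an assumption about the general shape only: the function names, the
addresses and the callIndirect helper are hypothetical, and the code
Cibyl actually generates is not shown in this thread.

```java
// Hypothetical sketch: dispatching a register-indirect call through a
// generated table, since bytecode has no computed goto or call.
final class CallTable {
    // Stand-ins for translated C functions (one static method each).
    static int addOne(int a0) { return a0 + 1; }

    static int twice(int a0) { return a0 * 2; }

    // Made-up addresses the functions would have had in the ELF binary.
    static final int ADDR_ADD_ONE = 0x1000;
    static final int ADDR_TWICE = 0x1040;

    // The target address held in a register selects the Java method.
    static int callIndirect(int target, int a0) {
        switch (target) {
            case ADDR_ADD_ONE: return addOne(a0);
            case ADDR_TWICE:   return twice(a0);
            default:
                throw new IllegalStateException(
                    "no function at 0x" + Integer.toHexString(target));
        }
    }
}
```

A Java switch compiles to a tableswitch or lookupswitch instruction, so
the dispatch stays within verifiable bytecode. A method-local jump table
for computed gotos could presumably use the same pattern, with case
labels standing for branch targets inside one method.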
// Simon

From forax at univ-mlv.fr Wed Mar 19 14:49:16 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Wed, 19 Mar 2008 22:49:16 +0100
Subject: Bugs in Da Vinci patches
Message-ID: <47E18A5C.4000607@univ-mlv.fr>

Hi John, hi all,

When I tried to compile the patched VM on my laptop (Fedora Core 6),
the compiler found two problems:

a cut-and-paste problem in classFileParser.cpp:472

    case T_DOUBLE: cp->long_at_put(index, value.d); break;

should be:

    case T_DOUBLE: cp->double_at_put(index, value.d); break;
                       ^^^^^^

and in vm/oops/klass.cpp:492

    assert(strlen(result) == result_len, "");
    strcpy(result + result_len, hash_buf);
    assert(strlen(result) == result_len + hash_len, "");

the two asserts compare signed and unsigned ints, so I've changed them
to:

    assert((int)strlen(result) == result_len, "");
    strcpy(result + result_len, hash_buf);
    assert((int)strlen(result) == result_len + hash_len, "");

It seems to work, but I haven't developed in C for more than ten
years :)

cheers,
Rémi

From lukas.stadler at jku.at Wed
Mar 26 16:17:12 2008
From: lukas.stadler at jku.at (Lukas Stadler)
Date: Thu, 27 Mar 2008 00:17:12 +0100
Subject: stack manipulation APIs
Message-ID: <47EAD978.1030506@jku.at>

*Hi!*

I am currently working on APIs for the multi-language VM that would
allow Java code to access and manipulate its own stacks. The overall
scope is pretty general, ranging from call-with-current-continuation
(call/cc) for languages like Scheme and coroutine implementations to
dynamic recompilation for compilers (like what the JIT does, but
for ?-to-bytecode compilers). In the end it should be possible to
change, replace, remove, etc. stack frames or even synthesize a new
stack from scratch. *But*: It seems almost impossible to make this
secure - some useful higher-level APIs will be needed. How these could
be implemented under the covers - that's what I'm thinking about
right now.

I'm starting with the most simple use case for now - continuations. Not
really "stack manipulation", just saving/restoring stacks.

Some questions/remarks that crossed my mind: (most of this is "... am I
correct assuming that:")

    * From what I've understood a call/cc can be invoked even after the
      method in which it was created has returned. This could lead to
      all sorts of harmful behavior, like exiting a monitor twice, etc.
      Should this be possible, or will only a restricted case be
      implemented? There would have to be a big red sign (and possibly
      some kind of verifier) with all the things that aren't allowed in
      such code.
      I recently looked at the Apache Commons Javaflow library - they
      implement storing the current stack state using only bytecode
      instrumentation and a small and unintrusive runtime framework. (I
      can write a short summary of how they're doing this if anyone is
      interested.) I think that, as their implementation is inherently
      safe, we could partly adopt its behaviour.
    * I think that there are two variants to consider: one-shot
      continuations that really only qualify as nonlocal returns and
      full-fledged continuations that are invoked many times from
      everywhere.

I'm just now starting to explore the OpenJDK code, so any
remarks/pointers are very welcome - especially where to look for
examples on how to deal with stack frames (I thought about the
deoptimization code...)

Some more use cases for the stack-manipulation that came to my mind:

    * sophisticated error logs (stack traces with local variables etc.)
    * checkpointing in servers
    * transferring threads between servers / nomadic lightweight threads
      (Second Life's agent execution model)
    * script interpreters/compilers that can switch between interpreted
      and compiled mode, like the JVM

Thanks,
Lukas

From John.Rose at Sun.COM  Wed Mar 26 22:34:25 2008
From: John.Rose at Sun.COM (John Rose)
Date: Wed, 26 Mar 2008 22:34:25 -0700
Subject: stack manipulation APIs
In-Reply-To: <47EAD978.1030506@jku.at>
References: <47EAD978.1030506@jku.at>
Message-ID: <71A53C13-555C-4CB4-9A2B-77FECC27ED66@sun.com>

On Mar 26, 2008, at 4:17 PM, Lukas Stadler wrote:

> I am currently working on APIs for the multi-language VM that would
> allow Java code to access and manipulate its own stacks. The overall
> scope is pretty general, ranging from call-with-current-continuation
> (call/cc) for languages like Scheme and coroutine implementations to
> dynamic recompilation for compilers (like what the JIT does, but
> for ?-to-bytecode compilers). In the end it should be possible to
> change, replace, remove, etc. stack frames or even synthesize a new
> stack from scratch. *But*: It seems almost impossible to make this
> secure - some useful higher-level APIs will be needed. How these could
> be implemented under the covers - that's what I'm thinking about
> right now.
>
> I'm starting with the most simple use case for now - continuations. Not
> really "stack manipulation", just saving/restoring stacks.
>
> Some questions/remarks that crossed my mind: (most of this is "... am I
> correct assuming that:")
>
>     * From what I've understood a call/cc can be invoked even after the
>       method in which it was created has returned.

Yes.  From a low-level point of view, a copyStack operation can return
more than once.  The first time it returns, it produces a new snapshot
of (at least part of) the thread stack.  If that snapshot is then
passed to restoreStack, the thread makes a discontinuous jump back to
the state of affairs as of the corresponding copyStack operation, and
the copyStack operation returns a second time.

(By discontinuous, I mean that the control stack as of the restoreStack
call is at least partially irrelevant to the future of the computation.
In that sense it is like a throw.)

On this second return from the same call to copyStack, the control
stack is once again in the state it was when the snapshot was made.
For generality and convenience, the call to restoreStack should be able
to specify either a return value or a throwable with which to continue
(normally or with a throw) from the copyStack call.

(BTW, the method names and details are from some POC code, for which I
owe a blog and code review.)

So, yes, at least part of the call/cc computation can occur more than
once, because of those discontinuous restoreStack calls.

>       This could lead to all sorts of harmful behavior, like exiting a
>       monitor twice, etc. Should this be possible, or will only a
>       restricted case be implemented? There would have to be a big red
>       sign (and possibly some kind of verifier) with all the things that
>       aren't allowed in such code.

Yes.  This is probably the single biggest problem with the low-level
copyStack/restoreStack idea.

The basic idea is that if a method has bracket pairs that must be
properly matched, any copyStack that copies that method's stack frame
while one or more brackets are open must ensure that the brackets
continue to be matched properly.
(By bracket pairs I mean especially monitorenter vs. monitorexit and
the entry and exit of security states in doPrivileged, etc.  They can
also include anything that try/finally is used to clean up, such as
open/close of a file, or some sort of push/pop on a thread local
variable.)

Java programmers are used to putting try/finally in their code to make
sure a closing bracket gets executed.  But there's no corresponding
convention to make sure an opening bracket gets re-executed.  With
continuations the brackets are more symmetric.  Scheme's version of
try/finally has a sort of 'initially' clause which is reliably executed
before the main body is executed.  This is not the Scheme syntax, but
Java-ified it might be:

    initially { x.monitorenter(); }
    try { doSomethingWithXLocked(); }
    finally { x.monitorexit(); }

Both the initially and finally clauses may be executed more than once,
but they are always properly matched.  (If they were to print 'I' and
'F', then the output would always match the regular expression (IF)+.)

It's almost (but perhaps not quite) possible to recover the block
structure of 'synchronized' statements from bytecodes.  I think that's
like the verifier thing you are talking about.  However, that does not
guarantee that the code is likely to work properly if it is re-entered
even with the intended monitorenter instructions.

If there is doubt, I think it is better to require that methods
positively declare that they are re-entry safe, and provide the right
hooks for 'initially' actions, before restoreStack is allowed to
re-enter them.

The interesting question is whether there is some large class of
methods for which a positive declaration of safety is not needed,
because there isn't doubt.  I don't know the answer to this; I think
the answer will come by looking carefully at actual code and from
experience.
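
The 'initially'/'finally' pairing described above can be approximated in
today's Java as a rough sketch.  Everything named here (BracketDemo,
bracketed, run) is hypothetical and only for illustration; plain Java
cannot actually resume a captured stack, so re-entry is simulated by
simply calling the bracketed region several times, with one of the
entries exiting abnormally.  The point is only that every opening action
is paired with exactly one closing action, so the trace has the form
(IF)+.

```java
import java.util.regex.Pattern;

public class BracketDemo {
    // Records 'I' for each opening bracket and 'F' for each closing bracket.
    static final StringBuilder trace = new StringBuilder();

    // Hypothetical helper: runs the opening action, then the body, and always
    // runs the closing action, so each entry is paired with exactly one exit.
    static void bracketed(Runnable initially, Runnable body, Runnable finallyAction) {
        initially.run();
        try {
            body.run();
        } finally {
            finallyAction.run();
        }
    }

    // Simulates the bracketed region being entered `entries` times, standing in
    // for restoreStack-style re-entry. Every second entry throws from the body
    // to show that abnormal exits still close the bracket.
    static String run(int entries) {
        trace.setLength(0);
        for (int i = 0; i < entries; i++) {
            final boolean fail = (i % 2 == 1);
            try {
                bracketed(
                    () -> trace.append('I'),
                    () -> { if (fail) throw new IllegalStateException("body failed"); },
                    () -> trace.append('F'));
            } catch (IllegalStateException e) {
                // The closing bracket has already run by the time we get here.
            }
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        String out = run(3);
        System.out.println(out + " matches (IF)+: " + Pattern.matches("(IF)+", out));
    }
}
```

A real implementation would need the VM to re-run the 'initially'
action on each restoreStack into the frame; the helper above only shows
the invariant such a mechanism would have to preserve.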
>       I recently looked at the Apache Commons Javaflow library - they
>       implement storing the current stack state using only bytecode
>       instrumentation and a small and unintrusive runtime framework. (I
>       can write a short summary of how they're doing this if anyone is
>       interested.) I think that, as their implementation is inherently
>       safe, we could partly adopt its behaviour.

Yes, that is the sort of experience I'm hoping we can use.

>     * I think that there are two variants to consider: one-shot
>       continuations that really only qualify as nonlocal returns and
>       full-fledged continuations that are invoked many times from
>       everywhere.

There's another degree of freedom:  How deep is the stack captured by
copyStack?  (Relatedly, how many frames does the restoreStack operation
change?  There's a Hamming distance between stack traces.)

In the use case of a coroutine-like generator, each restoreStack will
not pop any frames, and will just push a frame or two of suspended
generator state.  (I'm not saying that copyStack/restoreStack is the
best way to implement generators, but I am suggesting that it is a good
way to experiment with them.)

In the use case of an application reoptimizing itself, restoreStack
operations will be infrequent but will replace most or all stack
frames.

> I'm just now starting to explore the OpenJDK code, so any
> remarks/pointers are very welcome - especially where to look for
> examples on how to deal with stack frames (I thought about the
> deoptimization code...)

Look at vframes and vframe arrays.  A vframe is a virtualized view onto
a stack frame.

> Some more use cases for the stack-manipulation that came to my mind:
>
>     * sophisticated error logs (stack traces with local variables
>       etc.)
>     * checkpointing in servers
>     * transferring threads between servers / nomadic lightweight
>       threads (Second Life's agent execution model)
>     * script interpreters/compilers that can switch between
>       interpreted and compiled mode, like the JVM

The Hotspot JVM switches back and forth between interpreted and
compiled mode, as it optimizes and deoptimizes code in response to
dynamically changing properties of the application (as measured by
changes to things like type profiles and the class hierarchy).

I believe future VM-based systems will provide something like vframes
to pluggable library-specific and application-specific optimization
frameworks, which will do similar tricks, based on their own profiles
and metrics.  Think on-the-fly profile-directed parallelization.
Continuations may be part of the optimization arsenal required to run
many-core systems efficiently.

It's great that you are looking at this stuff!

Best wishes,
-- John

From John.Rose at Sun.COM  Thu Mar 27 19:03:08 2008
From: John.Rose at Sun.COM (John Rose)
Date: Thu, 27 Mar 2008 19:03:08 -0700
Subject: Bugs in Da Vinci patches
In-Reply-To: <47E55A26.2010203@univ-mlv.fr>
References: <47E55A26.2010203@univ-mlv.fr>
Message-ID: <5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>

Thanks for the fixes, Rémi!  I can integrate them into the next respin
of the patch.

(I note with satisfaction that you have signed the OpenJDK contributor
agreement.
http://www.sun.com/software/opensource/contributor_agreement.jsp )

Getting up a repo. for the patches is an item on my (long) to-do list.
I'm going to be using an HG sub-repository for the patches themselves,
which (I hope) will simplify the task of developing multiple changes on
a moving target (the OpenJDK itself).

Would you like to be a test subject for the new repo. (when it exists)
and check in the changes yourself?
Best,
-- John

On Mar 22, 2008, at 12:12 PM, Rémi Forax wrote:

> Hi John, hi all,
>
> When I tried to compile the patched VM on my laptop (Fedora Core 6),
> the compiler found two problems:

From forax at univ-mlv.fr  Fri Mar 28 02:44:34 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Fri, 28 Mar 2008 10:44:34 +0100
Subject: Bugs in Da Vinci patches
In-Reply-To: <5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>
References: <47E55A26.2010203@univ-mlv.fr>
	<5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>
Message-ID: <47ECBE02.9010103@univ-mlv.fr>

John Rose wrote:
> Thanks for the fixes, Rémi!  I can integrate them into the next respin
> of the patch.

More will come: I've generified AnonymousClassLoader, found an infinite
loop in checkHostClass(), etc.

>
> (I note with satisfaction that you have signed the OpenJDK contributor
> agreement.
> http://www.sun.com/software/opensource/contributor_agreement.jsp )

Yes, I was even a JDK contributor before it went open source.

>
> Getting up a repo. for the patches is an item on my (long) to-do list.
> I'm going to be using an HG sub-repository for the patches themselves,
> which (I hope) will simplify the task of developing multiple changes on
> a moving target (the OpenJDK itself).

A sub-repository of the OpenJDK hotspot one would be cool.
I currently use a kijaro.dev.java.net branch (an SVN repository),
but it mixes my patches against the Da Vinci VM with the properties
runtime library that uses it.

>
> Would you like to be a test subject for the new repo. (when it exists)
> and check in the changes yourself?

Yes. After posting a patch to mlvm-dev, and if the patch is reviewed
(or at least nobody is against it), I can check it in myself.
Furthermore, I will be happy to review patches from other members of
this list and check them in too if the volume is low.
>
> Best,
> -- John

Rémi

>
> On Mar 22, 2008, at 12:12 PM, Rémi Forax wrote:
>
>> Hi John, hi all,
>>
>> When I tried to compile the patched VM on my laptop (Fedora Core 6),
>> the compiler found two problems: