From John.Rose at Sun.COM  Fri Mar 14 16:47:37 2008
From: John.Rose at Sun.COM (John Rose)
Date: Fri, 14 Mar 2008 16:47:37 -0700
Subject: Hello, and other things
In-Reply-To: <47C8A8EE.8000809@sun.com>
References: <47C8A8EE.8000809@sun.com>
Message-ID: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>

On Feb 29, 2008, at 4:53 PM, Jason Fordham wrote:

> I started thinking about targeting GCC for the JVM last week.

That's a neat project!

I have heard of JVMs being used to simulate very small assembly-level
systems, on the order of 16-bit computers.  The challenges with this
come from building in a second level of virtualization.  The execution
of the simulated unsafe CPU is hard to integrate with the JVM's
libraries.

> It quickly became clear that the JVM instruction set is designed to
> make the C programming model difficult: the separation of bytecodes,
> stacks, frames, and object space, and the generally unconvertible
> addressType quickly led me to a model where the JVM stacks are ignored
> except for primitive operations, while memory - for data, bss and
> heap - is modeled in a large array. In order to model C's function
> calls by pointer, I figured a handle pair, class and method, hashing
> the strings, with a linking stage after compilation to perform fixup -
> much as I imagine slide 17 in the LangNet presentation implies.

I agree that method handles will help with this sort of thing.
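
As a rough illustration of the "handle plus link step" idea, here is a
minimal sketch. It assumes the java.lang.invoke API as it later shipped
in the JDK; the names FunctionTable, link, call and CFunctions are
hypothetical, and every C function is assumed to have the signature
int f(int). A C function pointer becomes a small integer index into a
table that a link step fills with method handles.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: C function pointers are integer indices into a
// table of method handles that a link step resolves by class and name.
final class FunctionTable {
    private final List<MethodHandle> entries = new ArrayList<>();

    // "Link" one function: resolve (class, method name) to a handle and
    // return the integer that stands in for its address.
    int link(Class<?> owner, String name) {
        MethodType type = MethodType.methodType(int.class, int.class);
        try {
            entries.add(MethodHandles.lookup().findStatic(owner, name, type));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot link " + name, e);
        }
        return entries.size() - 1;
    }

    // Indirect call through a function "pointer", like (*fp)(arg) in C.
    int call(int fp, int arg) {
        try {
            return (int) entries.get(fp).invokeExact(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}

// Stand-ins for translated C functions (assumed for illustration).
final class CFunctions {
    static int twice(int x) { return 2 * x; }
    static int negate(int x) { return -x; }
}
```

With this, link(CFunctions.class, "twice") yields an integer that can
be stored in the simulated C memory like any other pointer, and
call(fp, 21) dispatches through it.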

The hard part, though, is the essentially untyped nature of C memory.
I've seen C implementations that run over typed heaps, but they
are artful compromises, rather than simple ports to a new backend.
Centerline C and Zeta-C come to mind.  (Both are old projects that
may predate the Google cache.  I don't have references handy.)

The latter was a C compiler for the Symbolics Lisp Machine which
used ordered pairs (cons cells) for all C pointers, to represent the
combination of a base address and an arbitrary offset.
A similar product was Bounds-Check C, which widened
pointers into little 3-tuples (min, max, cur).  The idea is
that a tuple-based pointer will never be allowed to "reach
beyond" the heap object it was created for; such operations
are always indeterminate, since there is no guaranteed
distance (or ordering) of heap objects, from one instruction
to the next, in a system like the Symbolics with a powerful GC.
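
A minimal sketch of such a tuple-based pointer on the JVM follows. The
class name FatPointer and its methods are hypothetical, and the pointee
type is assumed to be int. The pointer is a (base, offset) pair; the
JVM's own array bounds check plays the role of the (min, max) limits,
so a pointer can never reach beyond the object it was created for.

```java
// Hypothetical sketch of a "fat pointer": a base object plus an offset.
// Every dereference is bounds-checked by the JVM's array access.
final class FatPointer {
    private final int[] base;  // the heap object this pointer was created for
    private final int offset;  // current position, in elements

    FatPointer(int[] base, int offset) {
        this.base = base;
        this.offset = offset;
    }

    // C's p + n: arithmetic changes only the offset, never the base.
    FatPointer plus(int n) {
        return new FatPointer(base, offset + n);
    }

    // C's *p. Throws ArrayIndexOutOfBoundsException outside the object.
    int load() {
        return base[offset];
    }

    // C's *p = v.
    void store(int v) {
        base[offset] = v;
    }

    // C's p - q, which is only meaningful within one object.
    int minus(FatPointer other) {
        if (base != other.base) {
            throw new IllegalArgumentException("pointers into different objects");
        }
        return offset - other.offset;
    }
}
```

Forming an out-of-range pointer with plus() is allowed here, as it
often is in practice in C; only dereferencing it traps.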

That would work very nicely on the JVM also.  You could use
the sun.misc.Unsafe API (with great care!) to handle punning
among memory-resident primitive types.  You must avoid
using Unsafe to pun between primitives and references, because
there is absolutely no way to control when the GC might want
to move things around underneath your code.
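
For illustration, here is a sketch of punning among primitives only.
It deliberately uses java.nio.ByteBuffer rather than sun.misc.Unsafe,
since ByteBuffer is a documented, safe API; the class and method names
are hypothetical, and little-endian byte order is an assumption. No
references are ever stored in the buffer, so the GC concern above does
not arise.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: a byte-addressed memory in which the same bytes
// can be read back as different primitive types (never as references).
final class PrimitivePunning {
    private final ByteBuffer mem;

    PrimitivePunning(int sizeInBytes) {
        // Byte order is an assumption; a real port would match its target.
        mem = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    void storeFloat(int addr, float v) { mem.putFloat(addr, v); }
    float loadFloat(int addr)          { return mem.getFloat(addr); }

    void storeInt(int addr, int v)     { mem.putInt(addr, v); }
    int loadInt(int addr)              { return mem.getInt(addr); }

    // Unsigned byte load, like reading through an unsigned char pointer.
    int loadByte(int addr)             { return mem.get(addr) & 0xFF; }
}
```

Storing 1.0f and reloading the same address as an int gives the IEEE
754 bit pattern 0x3F800000, which is what a C union of float and int
would show on a typical machine.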

> The key obstacles I see are that the instruction set makes
> implementing a C-like stack expensive: there are no neat push and pop
> operations for this memory model, it feels like microcoding. Though I
> understand the motivation, which is to protect the bytecodes from
> malicious or lazy use of buffer overflows, and other mechanisms for
> executing data.

The stack is really just a shorthand for operand renaming.
Feel free to generate code to a register-to-register machine,
and map your virtual registers to JVM locals.

> I like the method handle mechanism, for a variety of reasons, and I
> would like to see some easing up on where the stack is located so
> that operations which index into the stack are more flexible, and
> fast. Is this possible?

If you need a memory-resident stack, you can just build an array
to hold it, can't you?  I'm not sure where the pain point is here, yet.

Best wishes,
-- John


From Kenneth.Russell at Sun.COM  Fri Mar 14 16:53:36 2008
From: Kenneth.Russell at Sun.COM (Kenneth Russell)
Date: Fri, 14 Mar 2008 16:53:36 -0700
Subject: Hello, and other things
In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
Message-ID: <47DB1000.7030806@sun.com>

Quick pointer to a project a co-worker told me about a while back:

http://www.xwt.org/mips2java/
http://www.thisiscool.com/mips2java.htm

-Ken

> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


From John.Rose at Sun.COM  Fri Mar 14 17:39:00 2008
From: John.Rose at Sun.COM (John Rose)
Date: Fri, 14 Mar 2008 17:39:00 -0700
Subject: Hello, and other things
In-Reply-To: <47DB1000.7030806@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
	<47DB1000.7030806@sun.com>
Message-ID: <C822DA3A-7862-4C0B-83F7-35B6FEDEB08C@Sun.COM>

Thanks, Ken!  The MIPS is the one I was trying to remember.
I had forgotten (or my brain refused to remember) the startling fact
that it was a 32-bit system, with a software page table.
-- John

On Mar 14, 2008, at 4:53 PM, Kenneth Russell wrote:

> Quick pointer to a project a co-worker told me about a while back:
>
> http://www.xwt.org/mips2java/
> http://www.thisiscool.com/mips2java.htm
>
> -Ken
>
> John Rose wrote:
>> I have heard of JVMs being used to simulate very small assembly-level
>> systems,
>> on the order of 16-bit computers.


From pdoubleya at gmail.com  Sat Mar 15 01:52:08 2008
From: pdoubleya at gmail.com (Patrick Wright)
Date: Sat, 15 Mar 2008 09:52:08 +0100
Subject: Hello, and other things
In-Reply-To: <47DB1000.7030806@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
	<47DB1000.7030806@sun.com>
Message-ID: <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>

I think I read mips2java is now NestedVM; http://nestedvm.ibex.org/
and http://wiki.brianweb.net/NestedVM/NestedVM. (actually, seems like
the same guy works on a JVM backend for GHC called LambdaVM).

There's also Cibyl, http://spel.bth.se/index.php/Cibyl, "Cibyl is a
programming environment and binary translator that allows compiled C
programs to execute on J2ME-capable phones. Cibyl uses GCC to compile
the C programs to MIPS binaries, and these are then recompiled into
Java bytecode"

This page, http://www.answers.com/topic/c-to-java-byte-code-compiler?cat=technology,
has some links, including one to a research paper from Dartmouth on
the topic.

Regards
Patrick


From pdoubleya at gmail.com  Sat Mar 15 02:20:23 2008
From: pdoubleya at gmail.com (Patrick Wright)
Date: Sat, 15 Mar 2008 10:20:23 +0100
Subject: Hello, and other things
In-Reply-To: <64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
	<47DB1000.7030806@sun.com>
	<64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
Message-ID: <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>

And here's a research paper on optimizations done in Cibyl, might be
interesting to you as well
 http://www.ipd.bth.se/ska/phd-cibyl-performance.pdf

From the abstract, "This paper presents the optimization framework
used by Cibyl to provide compact and well-performing translated code.
Cibyl optimizes expensive multiplications/divisions, floating point
support, function co-location to Java methods and provides a peephole
optimizer. The paper also evaluates Cibyl performance both in a
real-world GPS navigation application where the optimizations increase
display update frequency with around 15% and a comparison against
native Java and the NestedVM binary translator where we show that
Cibyl can provide significant advantages for common code patterns
[...] high-level Java code) might not be a good match for the compiler
structure. The general design of Cibyl has been described in an
earlier paper [8], and this paper focuses on optimizations made to
reduce the size and improve the performance of the translated
binaries. The optimizations we employ for Cibyl share some
similarities with regular compiler optimizations, e.g., use of
function inlining and constant propagation, but is also significantly
different. Since the GCC compiler has already optimized the high-level
C code, the goal of the Cibyl binary translator is to make the
translation into Java bytecode".

Sorry if this is off-topic for the list, seems related to Jason's
original question on the thread.

Regards
Patrick
>  There's also Cibyl, http://spel.bth.se/index.php/Cibyl, "Cibyl is a
>  programming environment and binary translator that allows compiled C
>  programs to execute on J2ME-capable phones. Cibyl uses GCC to compile
>  the C programs to MIPS binaries, and these are then recompiled into
>  Java bytecode"


From John.Rose at Sun.COM  Sat Mar 15 11:20:14 2008
From: John.Rose at Sun.COM (John Rose)
Date: Sat, 15 Mar 2008 11:20:14 -0700
Subject: Hello, and other things
In-Reply-To: <64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
	<47DB1000.7030806@sun.com>
	<64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
	<64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>
Message-ID: <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com>

Thanks for the excellent references.
Since this list is archived[1], they are now
bookmarked for us.

On Mar 15, 2008, at 2:20 AM, Patrick Wright wrote:
> Sorry if this is off-topic for the list, seems related to Jason's
> original question on the thread.

It's on-topic because such work may
expose pain points[2] in the JVM for compiler
back ends in general.  E.g., a botched JIT
optimization forcing back end complexity
for a C compiler would probably count as
a pain point.  From a quick scan of your first
reference, I don't see any yet.  They probably
have a lot more work to do moving their
backend output closer to the JVM.
For example, most C pointers can probably
be rendered as an offset plus a base that is a
Java object or array.  This requires a
big pointer analysis, plus oracular advice
from the user, but I think it would pay off.

For a low-level account of JIT optimizations,
see (and as you make discoveries contribute to)
this wiki:
   http://wikis.sun.com/display/HotSpotInternals/PerformanceTechniques
   http://wikis.sun.com/display/HotSpotInternals/

Best wishes,
-- John

[1] http://mail.openjdk.java.net/pipermail/mlvm-dev/
[2] http://openjdk.java.net/projects/mlvm/pdf/LangNet20080128.pdf

From Jason.Fordham at Sun.COM  Sun Mar 16 08:54:19 2008
From: Jason.Fordham at Sun.COM (Jason Fordham)
Date: Sun, 16 Mar 2008 08:54:19 -0700
Subject: Hello, and other things
In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
Message-ID: <47DD42AB.9050602@sun.com>

Hi John,

On 3/14/2008 4:47 PM, John Rose wrote:
> The hard part, though, is the essentially untyped nature of C memory.
> I've seen C implementations that run over typed heaps, but they
> are artful compromises, rather than simple ports to a new backend.
> Centerline C and Zeta-C come to mind.  (Both are old projects, that
> may pre-date the Google cache.  I don't have references handy.)
>
>   
It seems to me that the ability of (machineRadix *)pointers to overrun - 
above and below - the arrays they were based on is a feature of C. The 
memory model I'm proposing makes it possible to leverage the existing 
code generation models, and the libraries.
> The latter was a C compiler for the Symbolics Lisp Machine which
> used ordered pairs (cons cells) for all C pointers, to represent the
> combination of a base address and an arbitrary offset.
> A similar product was Bounds-Check C, which widened
> pointers into little 3-tuples (min, max, cur).  The idea is
> that a tuple-based pointer will never be allowed to "reach
> beyond" the heap object it was created for; such operations
> are always indeterminate, since there is no guaranteed
> distance (or ordering) of heap objects, from one instruction
> to the next, in a system like the Symbolics with a powerful GC.
>
>   

While I understand that many C programmers have a secret wish that the 
GC in GCC could stand for Garbage Collection, it doesn't: I think that 
it's OK to avoid the Java GC; philosophically, I regard the ability to 
leave malloced objects on the heap without references to them as a C 
"feature", just like buffer over/underruns.

> That would work very nicely on the JVM also.  You could use
> the sun.misc.Unsafe API (with great care!) to handle punning
> among memory-resident primitive types.  You must avoid
> using Unsafe to pun between primitives and references, because
> there is absolutely no way to control when the GC might want
> to move things around underneath your code.
>
>   

I hadn't come across this before, and it doesn't seem to have any 
documentation! Given your limited description of the features, it sounds 
as though it would be very easy to leave a gap where the compiler could 
be used to break Java protection, which I would not want to do.

>> The key obstacles I see are that the instruction set makes
>> implementing a C-like stack expensive: there are no neat push and pop
>> operations for this memory model, it feels like microcoding. Though I
>> understand the motivation, which is to protect the bytecodes from
>> malicious or lazy use of buffer overflows, and other mechanisms for
>> executing data.
>
> The stack is really just a shorthand for operand renaming.
> Feel free to generate code to a register-to-register machine,
> and map your virtual registers to JVM locals.
>
>   

Again, I'm inclined to retain the classic stack-based calling pragma in 
the memory model, because it makes it trivial to construct and 
manipulate pointers to C objects allocated in the local frame - they're 
the same as pointers to objects on the heap, because they're in the same 
untyped array - machineRadix[] memory.
>> I like the method handle mechanism, for a variety of reasons, and I
>> would like to see some easing up on where the stack is located so
>> that operations which index into the stack are more flexible, and
>> fast. Is this possible?
>
> If you need a memory-resident stack, you can just build an array
> to hold it, can't you?  I'm not sure where the pain point is here, yet.
>
>   

Stack operations - manipulating and indexing the BP and SP - will be 
frequent multi-bytecode operations. I don't know how well the JIT 
compiler will work out what's going on.
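
To make the cost concrete, here is a minimal sketch of the stack
discipline described above. The names CStack, enter, leave and
addressOfLocal are hypothetical, and word addressing is an assumption.
SP and BP are plain ints indexing the same array that holds the heap,
so the address of a local is just another index into memory.

```java
// Hypothetical sketch: a C-style stack kept in the same untyped array
// as the rest of memory, with explicit stack and base pointers.
final class CStack {
    final int[] memory;  // the single untyped memory array (in words)
    int sp;              // stack pointer: index of the lowest used word
    int bp;              // base (frame) pointer of the current function

    CStack(int words) {
        memory = new int[words];
        sp = words;      // the stack grows downward from the top
        bp = words;
    }

    void push(int v) { memory[--sp] = v; }
    int pop()        { return memory[sp++]; }

    // Function prologue: save the caller's BP, then reserve locals.
    void enter(int locals) {
        push(bp);
        bp = sp;
        sp -= locals;
    }

    // Function epilogue: drop the locals and restore the caller's BP.
    void leave() {
        sp = bp;
        bp = pop();
    }

    // &local[i]: an ordinary index into memory, like a heap pointer.
    int addressOfLocal(int i) {
        return bp - 1 - i;
    }
}
```

Each of these operations is a handful of bytecodes (load a field,
adjust it, index the array, store it back), which is the multi-bytecode
pattern the JIT would have to recognize.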

Jason


From Jason.Fordham at Sun.COM  Sun Mar 16 09:28:17 2008
From: Jason.Fordham at Sun.COM (Jason Fordham)
Date: Sun, 16 Mar 2008 09:28:17 -0700
Subject: Hello, and other things
In-Reply-To: <8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
	<47DB1000.7030806@sun.com>
	<64efa1ba0803150152n635d98f1tf500abcc50510592@mail.gmail.com>
	<64efa1ba0803150220h2f37ad0cy17af21302dc656b2@mail.gmail.com>
	<8AC5FDE2-74EE-4EE6-91A2-8A094CD1463B@sun.com>
Message-ID: <47DD4AA1.8060709@sun.com>

John,

Thanks for the JIT information!

The two things I need to come up with a design for now are how to get 
GCC to generate .class files, and what the runtime setup code needs to 
do. The former is a big task, because I need to have a fairly detailed 
design for the calling protocol. But that's all going to have to wait 
for later...

Jason



From forax at univ-mlv.fr  Sun Mar 16 09:56:30 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sun, 16 Mar 2008 17:56:30 +0100
Subject: Hello, and other things
In-Reply-To: <963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
References: <47C8A8EE.8000809@sun.com>
	<963ED873-CE2C-41EB-AC76-A6FACE460605@sun.com>
Message-ID: <47DD513E.3000701@univ-mlv.fr>

John Rose wrote:
...
> That would work very nicely on the JVM also.  You could use
> the sun.misc.Unsafe API (with great care!) to handle punning
> among memory-resident primitive types.  You must avoid
> using Unsafe to pun between primitives and references, because
> there is absolutely no way to control when the GC might want
> to move things around underneath your code.
>
>   
Yep, the last time I tried to read a reference as an int (as a kind of
fast hash code), the construction of the JIT IR (which is typed)
crashed before the GC was even triggered.

Rémi


From simon.kagstrom at gmail.com  Sun Mar 16 11:56:51 2008
From: simon.kagstrom at gmail.com (Simon Kagstrom)
Date: Sun, 16 Mar 2008 19:56:51 +0100
Subject: Some words about Cibyl (MIPS to Java bytecode binary translation)
Message-ID: <20080316195651.0edc5fc7@gmail.com>

Hello!

I'm the author of Cibyl, which translates MIPS binaries into Java
bytecode. Patrick Wright pointed me to this list and the discussion
about compiling C into Java bytecode (thanks!), so I thought I'd share
some comments about how this is done in Cibyl. Most of it is also
applicable to NestedVM, which does essentially the same thing with a
set of implementation differences. NestedVM also predates Cibyl, so the
origin of the idea should be attributed to them.

Cibyl targets portability of C and C++ applications to J2ME devices, so
it also provides an interface to the MIDP API. The translation is
fairly straight-forward. Cibyl depends on GCC to generate an ELF binary
(with symbol and relocation information intact), and the translation is
done with a 1-1 mapping between C functions (call destinations in the
ELF binary) and static Java methods in a class. Most MIPS instructions
can be translated pretty much 1-1 to Java bytecode.

NestedVM does this a bit differently and does not have the 1-1 mapping.
Both methods have benefits and disadvantages. With the NestedVM
approach, it's easier to support e.g. longjmp, while the Cibyl
approach makes the class look more like a "real" Java class, for example
in crash dumps or profilers. From benchmarks I've made, the Cibyl
approach also seems easier to achieve good performance with, mostly
because it always uses Java local variables for the MIPS register
representation throughout.
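
As a rough picture of the 1-1 mapping with registers as locals, here is
a hand-written Java rendering of a small translated function. This is
not actual Cibyl output (Cibyl emits bytecode directly); the class
name, the memory size and the MIPS listing in the comments are
assumptions for illustration.

```java
// Hypothetical sketch: one C function becomes one static Java method,
// MIPS registers become Java locals, and memory is a shared int array.
final class TranslatedCode {
    // Word-indexed memory; a byte address is shifted right by 2.
    static final int[] memory = new int[1024];

    // Roughly the C function:
    //   int sum(const int *a, int n)
    //   { int s = 0; while (n-- > 0) s += *a++; return s; }
    static int sum(int a0, int a1) {      // $a0 = a, $a1 = n
        int v0 = 0;                       // move  $v0, $zero
        int t0;                           // scratch register $t0
        while (a1 > 0) {                  // blez  $a1, done
            t0 = memory[a0 >>> 2];        // lw    $t0, 0($a0)
            v0 += t0;                     // addu  $v0, $v0, $t0
            a0 += 4;                      // addiu $a0, $a0, 4
            a1 -= 1;                      // addiu $a1, $a1, -1
        }
        return v0;                        // jr    $ra
    }
}
```

Because the registers are ordinary locals, the JVM's JIT can keep them
in machine registers, which is presumably where much of the performance
advantage comes from.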


So to the interesting part :-). While implementing the translation has
mostly been pretty straight-forward, there are two cases where Java
bytecode poses some problems:

* The 64KB method size limit, which is perhaps the largest issue. If
  the bytecode had not had this limitation, the translation would be
  done to a single method, which would improve performance and simplify
  the implementation quite a bit. Cibyl also allows co-locating
  multiple C functions in a single Java method, which can improve
  performance quite a bit.

  This is of course also a problem with very big C functions. In
  practice, it has only been a problem in one application so far (the
  fetch-and-decode loop of an emulator). Cibyl currently does not
  handle this situation automatically, and I guess this would also be
  an issue for a JBC compiler backend.

* Untyped memory, which I also saw you took up. In Cibyl, I've used a
  big int-array as the "memory" representation. This fits MIPS quite
  well, since unaligned memory access is limited to special
  instructions, and most accesses tend to be 32-bit accesses. However,
  when 8- or 16-bit loads and stores are done there is a significant
  performance hit because of this.

  Since Cibyl targets embedded (J2ME) devices, it will just allocate a
  fixed amount of memory for the C program at startup (for stack/heap).
  NestedVM targets other systems and uses a two-level structure that
  allows a sparse memory layout.
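
The cost of sub-word accesses in the int-array model can be seen in a
small sketch. The class and method names are hypothetical, and
big-endian byte order within a word is an assumption (MIPS supports
both orders). A word load is a single array access, while a byte store
is a read-modify-write with shifts and masks.

```java
// Hypothetical sketch: byte-addressed memory backed by an int array.
final class IntArrayMemory {
    private final int[] words;

    IntArrayMemory(int sizeInBytes) {
        words = new int[(sizeInBytes + 3) >>> 2];
    }

    // Aligned 32-bit access: one array operation.
    int loadWord(int addr)          { return words[addr >>> 2]; }
    void storeWord(int addr, int v) { words[addr >>> 2] = v; }

    // 8-bit unsigned load (like MIPS lbu): index, shift, then mask.
    int loadByteUnsigned(int addr) {
        int shift = (3 - (addr & 3)) << 3;  // byte 0 is the most significant
        return (words[addr >>> 2] >>> shift) & 0xFF;
    }

    // 8-bit store (like MIPS sb): read the word, replace one byte, write back.
    void storeByte(int addr, int v) {
        int shift = (3 - (addr & 3)) << 3;
        int index = addr >>> 2;
        words[index] = (words[index] & ~(0xFF << shift)) | ((v & 0xFF) << shift);
    }
}
```

A byte array would make the 8-bit case cheap but would turn every
32-bit access into four loads plus shifts, so the int array is the
better trade when most accesses are word-sized.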

Obviously there are also some MIPS instructions which are a bit tricky
to translate, but that's not really the fault of JBC. So if I could
have one wish for Java bytecode, it would be to lift the 64KB method
size limit (I'm pretty sure the NestedVM developers agree with this).

I understand that the type-safety will not be lifted, so I guess that
untyped memory will be a problem for any C backend.


Sorry for the long mail :-). I'll follow Jason's work on a Java GCC
backend, that would be quite nice to have. I guess you are also
familiar with LLVM, which perhaps could be an easier starting point
than plain GCC?

-- 
// Simon


From reachbach at gmail.com  Sun Mar 16 19:50:08 2008
From: reachbach at gmail.com (Bharath Ravi Kumar)
Date: Mon, 17 Mar 2008 08:20:08 +0530
Subject: mlvm-dev Digest, Vol 4, Issue 2
In-Reply-To: <mailman.31.1205607601.11667.mlvm-dev@openjdk.java.net>
References: <mailman.31.1205607601.11667.mlvm-dev@openjdk.java.net>
Message-ID: <76b5ba080803161950u10c20e28r3be3ce31d89e23a1@mail.gmail.com>

John,

Looks like the link to the hotspot compiler wiki is broken. I got a 404 -
http://wikis.sun.com/display/HotSpotInternals/Compiler

-Bharath


From John.Rose at Sun.COM  Sun Mar 16 23:02:37 2008
From: John.Rose at Sun.COM (John Rose)
Date: Sun, 16 Mar 2008 23:02:37 -0700
Subject: Some words about Cibyl (MIPS to Java bytecode binary translation)
In-Reply-To: <20080316195651.0edc5fc7@gmail.com>
References: <20080316195651.0edc5fc7@gmail.com>
Message-ID: <570F7E62-9324-4B9C-9053-3781476D2A59@sun.com>

On Mar 16, 2008, at 11:56 AM, Simon Kagstrom wrote:

> ...in crash dumps or profilers. From benchmarks I've made, the Cibyl
> approach also seems easier to achieve good performance with, mostly
> because it always uses Java local variables for the MIPS register
> representation throughout.

By the same token, there would be benefits to raising C structs
where possible into Java objects.  (It probably requires user advice,
as I noted before.)  The int-array memory model would be used
only for "hard cases".  I think there is probably some analysis
that could be made, supported by user annotations, that for most
C programs, would allow a majority of the data structures to go
into real Java objects.  The nested VM could keep a type profile
at every indirection operation to direct its translation of base+offset
to fields.

> * The 64KB method size limit, which is perhaps the largest issue. If
>   the bytecode had not had this limitation, the translation would be
>   done to a single method, which would improve performance and
>   simplify the implementation quite a bit. Cibyl also allows
>   co-locating multiple C functions in a single Java method, which can
>   improve performance quite a bit.

This is an interesting problem.  It would be a good MLVM project.
The messy part is finding all the structs with 16-bit offsets and making
32-bit versions available.  It's so messy, I think, that people have
wanted to wait for a major format revision.  (I think basing offsets
on the Pack200 UNSIGNED5 format would make for a better
revolutionary change, better than adding 32-bit twin structures.
Regarding Twins--suddenly I think of DeVito and Schwarzenegger.)

> * Untyped memory, which I also saw you took up. In Cibyl, I've used a
>   big int-array as the "memory" representation. This fits MIPS quite
>   well, since unaligned memory access is limited to special
>   instructions, and most accesses tend to be 32-bit accesses. However,
>   when 8- or 16-bit loads and stores are done there is a significant
>   performance hit because of this.
Yes, that is a nice fit.  It's an amazing application of a (very)
RISC ISA, to an execution platform realized in software not silicon.

Best,
-- John


From simon.kagstrom at gmail.com  Mon Mar 17 13:38:24 2008
From: simon.kagstrom at gmail.com (Simon Kagstrom)
Date: Mon, 17 Mar 2008 21:38:24 +0100
Subject: Some words about Cibyl (MIPS to Java bytecode binary translation)
In-Reply-To: <570F7E62-9324-4B9C-9053-3781476D2A59@sun.com>
References: <20080316195651.0edc5fc7@gmail.com>
	<570F7E62-9324-4B9C-9053-3781476D2A59@sun.com>
Message-ID: <20080317213824.1aa1e1be@lska2>

Hi again,

On Sun, 16 Mar 2008 23:02:37 -0700
John Rose <John.Rose at Sun.COM> wrote:

> > * Untyped memory, which I also saw you took up. In Cibyl, I've used
> >   a big int-array as the "memory" representation. This fits MIPS
> >   quite well, since unaligned memory access is limited to special
> >   instructions, and most accesses tend to be 32-bit accesses.
> >   However, when 8- or 16-bit loads and stores are done there is a
> >   significant performance hit because of this.
> Yes, that is a nice fit.  It's an amazing application of a (very)  
> RISC ISA, to an execution platform realized in software not silicon.

Yes, this was the main reason why I selected MIPS for this. It's a
beautiful instruction set :-)


I actually forgot one obstacle which requires some trickery in Cibyl:
Register-indirect branches and calls. Since Java bytecode doesn't allow
computed gotos, I use a generated "call table" for method calls and a
method-local jump table for local computed gotos.
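
As a rough illustration of the call-table idea (not the code Cibyl
actually generates), a register-indirect call can be lowered to a switch
over the simulated function address. The class name CallTable, the two
target methods, and the addresses 0x1000 and 0x1040 are made up for this
sketch.

```java
// Hypothetical sketch of a generated "call table".
// All names and addresses are invented for illustration.
final class CallTable {
    static int square(int a) { return a * a; }
    static int negate(int a) { return -a; }

    // Stands in for a register-indirect call: the "register" holds a
    // simulated address, and the switch picks the matching Java method.
    static int callIndirect(int functionAddress, int arg) {
        switch (functionAddress) {
            case 0x1000: return square(arg);
            case 0x1040: return negate(arg);
            default:
                throw new IllegalStateException(
                    "no function at address 0x" + Integer.toHexString(functionAddress));
        }
    }
}
```

A method-local jump table for computed gotos can take the same shape,
with the switch cases transferring control to labels inside one method
rather than calling other methods.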

// Simon


From forax at univ-mlv.fr  Wed Mar 19 14:49:16 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Wed, 19 Mar 2008 22:49:16 +0100
Subject: Bugs in Da Vinci patches
Message-ID: <47E18A5C.4000607@univ-mlv.fr>

Hi john, hi all,

When I tried to compile the patched VM on my laptop (Fedora Core 6),
the compiler found two problems:

a cut&paste problem in classFileParser.cpp:472
case T_DOUBLE: cp->long_at_put(index,  value.d); break;

should be:
case T_DOUBLE: cp->double_at_put(index,  value.d); break;
                   ^^^^^^

and in vm/oops/klass.cpp:492
assert(strlen(result) == result_len, "");
strcpy(result + result_len, hash_buf);
assert(strlen(result) == result_len + hash_len, "");

the two asserts compare signed and unsigned ints,
so I've changed them to:
assert((int)strlen(result) == result_len, "");
strcpy(result + result_len, hash_buf);
assert((int)strlen(result) == result_len + hash_len, "");

It seems to work, but I haven't developed in C for more than ten years :)

cheers,
Rémi


From forax at univ-mlv.fr  Sat Mar 22 12:12:38 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Sat, 22 Mar 2008 20:12:38 +0100
Subject: Bugs in Da Vinci patches
Message-ID: <47E55A26.2010203@univ-mlv.fr>

Hi john, hi all,

When I tried to compile the patched VM on my laptop (Fedora Core 6),
the compiler found two problems:

a cut&paste problem in classFileParser.cpp:472
case T_DOUBLE: cp->long_at_put(index,  value.d); break;

should be:
case T_DOUBLE: cp->double_at_put(index,  value.d); break;
                   ^^^^^^

and in vm/oops/klass.cpp:492
assert(strlen(result) == result_len, "");
strcpy(result + result_len, hash_buf);
assert(strlen(result) == result_len + hash_len, "");

the two asserts compare signed and unsigned ints,
so I've changed them to:
assert((int)strlen(result) == result_len, "");
strcpy(result + result_len, hash_buf);
assert((int)strlen(result) == result_len + hash_len, "");

It seems to work, but I haven't developed in C for more than ten years :)

cheers,
Rémi


From lukas.stadler at jku.at  Wed Mar 26 16:17:12 2008
From: lukas.stadler at jku.at (Lukas Stadler)
Date: Thu, 27 Mar 2008 00:17:12 +0100
Subject: stack manipulation APIs
Message-ID: <47EAD978.1030506@jku.at>

*Hi!*

I am currently working on APIs for the multi-language VM that would 
allow Java code to access and manipulate its own stacks. The overall 
scope is pretty general, ranging from call-with-current-continuation 
(call/cc) for languages like scheme and coroutine implementations to 
dynamic recompilation for compilers (like what the JIT does, but 
for ?-to-bytecode compilers). In the end it should be possible to 
change, replace, remove, etc. stack frames or even synthesize a new 
stack from scratch. *But*: It seems almost impossible to make this 
secure - some useful higher-level APIs will be needed. How these could 
be implemented under the covers - that's what I'm thinking about right now.

I'm starting with the most simple use case for now - continuations. Not 
really "stack manipulation", just saving/restoring stacks.

Some questions/remarks that crossed my mind: (most of this is "... am I 
correct assuming that:")

    *  From what I've understood a call/cc can be invoked even after the
      method in which it was created has returned.

    This could lead to all sorts of harmful behavior, like exiting a
    monitor twice, etc. Should this be possible, or will only a
    restricted case be implemented? There would have to be a big red
    sign (and possibly some kind of verifier) with all the things that
    aren't allowed in such code. 
    I recently looked at the apache commons Javaflow library - they
    implement storing the current stack state using only bytecode
    instrumentation and a small and unintrusive runtime framework. (I
    can write a short summary of how they're doing this if anyone is
    interested.) I think that, as their implementation is inherently
    safe, we could partly adopt its behaviour. 

    * I think that there are two variants to consider: one-shot
      continuations that really only qualify as nonlocal returns and
      full-fledged continuations that are invoked many times from
      everywhere.
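
The one-shot, nonlocal-return variant can already be approximated in
plain Java with an exception, which may help when comparing it to the
full-fledged kind. The sketch below is only an illustration: the class
OneShot, the method callWithEscape, and the example firstNegative are
hypothetical names, and nothing here touches real stack frames.

```java
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: an escape-only ("one-shot") continuation built on
// exceptions. It can only unwind to a frame that is still on the stack.
final class OneShot {
    // Carries the result back to the frame that created the escape.
    private static final class Escape extends RuntimeException {
        final Object owner;
        final Object value;

        Escape(Object owner, Object value) {
            super(null, null, false, false); // no message, no stack trace
            this.owner = owner;
            this.value = value;
        }
    }

    // Runs body, handing it an 'exit' function. Calling exit abandons the
    // rest of body and makes callWithEscape return the given value.
    static <T> T callWithEscape(Function<Consumer<T>, T> body) {
        Object token = new Object();
        try {
            return body.apply(v -> { throw new Escape(token, v); });
        } catch (Escape e) {
            if (e.owner != token) throw e; // belongs to an outer call
            @SuppressWarnings("unchecked")
            T result = (T) e.value;
            return result;
        }
    }

    // Example use: leave a loop early through the escape.
    static int firstNegative(int[] xs) {
        return OneShot.<Integer>callWithEscape(exit -> {
            for (int x : xs) {
                if (x < 0) exit.accept(x);
            }
            return 0; // reached only if no element was negative
        });
    }
}
```

Once callWithEscape has returned, calling the saved exit function just
throws an uncaught exception, which is exactly the limitation that a
full-fledged continuation would remove.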

I'm just now starting to explore the OpenJDK code, so any 
remarks/pointers are very welcome - especially where to look for 
examples on how to deal with stack frames (I thought about the 
deoptimization code...)

Some more use cases for the stack-manipulation that came to my mind:

    * sophisticated error logs (stack traces with local variables etc.)
    * checkpointing in servers
    * transferring threads between servers / nomadic lightweight threads
      (Second Life's agent execution model)
    * script interpreters/compilers that can switch between interpreted
      and compiled mode, like the JVM


Thanks,
 Lukas


From John.Rose at Sun.COM  Wed Mar 26 22:34:25 2008
From: John.Rose at Sun.COM (John Rose)
Date: Wed, 26 Mar 2008 22:34:25 -0700
Subject: stack manipulation APIs
In-Reply-To: <47EAD978.1030506@jku.at>
References: <47EAD978.1030506@jku.at>
Message-ID: <71A53C13-555C-4CB4-9A2B-77FECC27ED66@sun.com>

On Mar 26, 2008, at 4:17 PM, Lukas Stadler wrote:

> I am currently working on APIs for the multi-language VM that would
> allow Java code to access and manipulate its own stacks. The overall
> scope is pretty general, ranging from call-with-current-continuation
> (call/cc) for languages like scheme and coroutine implementations to
> dynamic recompilation for compilers (like what the JIT does, but
> for ?-to-bytecode compilers). In the end it should be possible to
> change, replace, remove, etc. stack frames or even synthesize a new
> stack from scratch. *But*: It seems almost impossible to make this
> secure - some useful higher-level APIs will be needed. How these could
> be implemented under the covers - that's what I'm thinking about right
> now.
>
> I'm starting with the most simple use case for now - continuations. Not
> really "stack manipulation", just saving/restoring stacks.
>
> Some questions/remarks that crossed my mind: (most of this is "... am I
> correct assuming that:")
>
>     *  From what I've understood a call/cc can be invoked even after the
>       method in which it was created has returned.

Yes.  From a low-level point of view, a copyStack operation can
return more than once.  The first time it returns, it produces a new
snapshot of (at least part of) the thread stack.  If that snapshot is
then passed to restoreStack, the thread makes a discontinuous
jump back to the state of affairs as of the corresponding copyStack
operation, and the copyStack operation returns a second time.

(By discontinuous, I mean that the control stack as of the
restoreStack call is at least partially irrelevant to the future
of the computation.  In that sense it is like a throw.)

On this second return from the same call to copyStack,
the control stack is once again in the state it was when
the snapshot was made.  For generality and convenience,
the call to restoreStack should be able to specify either
a return value or a throwable with which to continue
(normally or with a throw) from the copyStack call.

(BTW, the method names and details are from some
POC code, for which I owe a blog and code review.)

So, yes, at least part of the call/cc computation can occur more
than once, because of those discontinuous restoreStack calls.

>     This could lead to all sorts of harmful behavior, like exiting a
>     monitor twice, etc. Should this be possible, or will only a
>     restricted case be implemented? There would have to be a big red
>     sign (and possibly some kind of verifier) with all the things that
>     aren't allowed in such code.

Yes.  This is probably the single biggest problem with
the low-level copyStack/restoreStack idea.

The basic idea is that if a method has bracket pairs that must be
properly matched, any copyStack that copies that method's
stack frame while one or more brackets are open must ensure
that the brackets continue to be matched properly.

(By bracket pairs I mean especially monitorenter vs. monitorexit
and the entry and exit of security states in doPrivileged, etc.
They can also include anything that try/finally is used to
clean up, such as open/close of a file, or some sort of
push/pop on a thread local variable.)

Java programmers are used to putting try/finally in their
code to make sure a closing bracket gets executed.
But there's no corresponding convention to make sure
an opening bracket gets re-executed.  With continuations
the brackets are more symmetric.  Scheme's version
of try/finally has a sort of 'initially' clause which is
reliably executed before the main body is executed.
This is not the Scheme syntax, but Java-ified it might be:
	initially { x.monitorenter(); }
	try { doSomethingWithXLocked(); }
	finally { x.monitorexit(); }

Both the initially and finally clauses may be executed
more than once, but they are always properly matched.
(If they were to print  'I' and 'F', then the output would always
match the regular expression (IF)+.)
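
Here is a small Java sketch of that matching property. Java has no
'initially' clause and no restoreStack, so this is only a simulation
under stated assumptions: a loop stands in for repeated re-entry of the
region, and the class BracketDemo and its methods are hypothetical
names.

```java
import java.util.regex.Pattern;

// Hypothetical simulation: each loop iteration plays the role of one
// entry (or continuation re-entry) into a bracketed region.
final class BracketDemo {
    // Enters the region 'entries' times and returns the I/F log.
    static String simulate(int entries) {
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < entries; i++) {
            final boolean fail = (i % 2 == 1); // every second body throws
            try {
                log.append('I');               // 'initially': open the bracket
                try {
                    if (fail) throw new IllegalStateException("body failed");
                } finally {
                    log.append('F');           // 'finally': close the bracket
                }
            } catch (IllegalStateException e) {
                // Abnormal exit; the bracket was still closed above.
            }
        }
        return log.toString();
    }

    // True when every opening bracket is followed by its closing bracket.
    static boolean balanced(String log) {
        return Pattern.matches("(IF)+", log);
    }
}
```

The point of the simulation is that the opening action runs on every
entry, including entries after an abnormal exit, so the log stays in
the (IF)+ shape.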

It's almost (but perhaps not quite) possible to recover the
block structure of 'synchronized' statements from bytecodes.
I think that's like the verifier thing you are talking about.
However, that does not guarantee that the code is likely
to work properly if it is re-entered even with the intended
monitorenter instructions.

If there is doubt, I think it is better to require that methods
positively declare that they are re-entry safe, and provide
the right hooks for 'initially' actions, before restoreStack
is allowed to re-enter them.

The interesting question is whether there is some large
class of methods for which a positive declaration of safety
is not needed, because there isn't doubt.  I don't know the
answer to this; I think the answer will come by looking
carefully at actual code and from experience.

>     I recently looked at the apache commons Javaflow library - they
>     implement storing the current stack state using only bytecode
>     instrumentation and a small and unintrusive runtime framework. (I
>     can write a short summary of how they're doing this if anyone is
>     interested.) I think that, as their implementation is inherently
>     safe, we could partly adopt its behaviour.

Yes, that is the sort of experience I'm hoping we can use.

>     * I think that there are two variants to consider: one-shot
>       continuations that really only qualify as nonlocal returns and
>       full-fledged continuations that are invoked many times from
>       everywhere.

There's another degree of freedom:  How deep is the stack
captured by copyStack?  (Relatedly, how many frames does
the restoreStack operation change?  There's a Hamming
distance between stack traces.)

In the use case of a coroutine-like generator, each restoreStack
will not pop any frames, and will just push a frame or two of
suspended generator state.  (I'm not saying that copyStack/restoreStack
is the best way to implement generators, but I am suggesting that
it is a good way to experiment with them.)

In the use case of an application reoptimizing itself, restoreStack
operations will be infrequent but will replace most or all stack frames.

> I'm just now starting to explore the OpenJDK code, so any
> remarks/pointers are very welcome - especially where to look for
> examples on how to deal with stack frames (I thought about the
> deoptimization code...)

Look at vframes and vframe arrays.  A vframe is a virtualized view
onto a stack frame.

> Some more use cases for the stack-manipulation that came to my mind:
>
>     * sophisticated error logs (stack traces with local variables etc.)
>     * checkpointing in servers
>     * transferring threads between servers / nomadic lightweight threads
>       (Second Life's agent execution model)
>     * script interpreters/compilers that can switch between interpreted
>       and compiled mode, like the JVM

The Hotspot JVM switches back and forth between interpreted and compiled
mode, as it optimizes and deoptimizes code in response to dynamically
changing properties of the application (as measured by changes to things
like type profiles and the class hierarchy).  I believe future VM-based
systems will provide something like vframes to pluggable library-specific
and application-specific optimization frameworks, which will do similar
tricks, based on their own profiles and metrics.  Think on-the-fly
profile-directed parallelization.  Continuations may be part of the
optimization arsenal required to run many-core systems efficiently.

It's great that you are looking at this stuff!

Best wishes,
-- John


From John.Rose at Sun.COM  Thu Mar 27 19:03:08 2008
From: John.Rose at Sun.COM (John Rose)
Date: Thu, 27 Mar 2008 19:03:08 -0700
Subject: Bugs in Da Vinci patches
In-Reply-To: <47E55A26.2010203@univ-mlv.fr>
References: <47E55A26.2010203@univ-mlv.fr>
Message-ID: <5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>

Thanks for the fixes, Rémi!  I can integrate them into the next
respin of the patch.

(I note with satisfaction that you have signed the OpenJDK  
contributor agreement.
   http://www.sun.com/software/opensource/contributor_agreement.jsp )

Getting up a repo. for the patches is an item on my (long) to-do list.
I'm going to be using an HG sub-repository for the patches themselves,
which (I hope) will simplify the task of developing multiple changes on
a moving target (the OpenJDK itself).

Would you like to be a test subject for the new repo. (when it exists)
and check in the changes yourself?

Best,
-- John

On Mar 22, 2008, at 12:12 PM, Rémi Forax wrote:

> Hi john, hi all,
>
> When I tried to compile the patched VM on my laptop (Fedora Core 6),
> the compiler found two problems:


From forax at univ-mlv.fr  Fri Mar 28 02:44:34 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Fri, 28 Mar 2008 10:44:34 +0100
Subject: Bugs in Da Vinci patches
In-Reply-To: <5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>
References: <47E55A26.2010203@univ-mlv.fr>
	<5BF9A3B1-277E-4F53-902C-C3F9E47B2F75@sun.com>
Message-ID: <47ECBE02.9010103@univ-mlv.fr>

John Rose wrote:
> Thanks for the fixes, Rémi!  I can integrate them into the next respin 
> of the patch.
More will come, i've generified AnonymousClassLoader,
found an infinite loop in checkHostClass(), etc.
>
> (I note with satisfaction that you have signed the OpenJDK contributor 
> agreement.
>   http://www.sun.com/software/opensource/contributor_agreement.jsp )
Yes, I was even a JDK contributor before it went open source.

>
> Getting up a repo. for the patches is an item on my (long) to-do list.
> I'm going to be using an HG sub-repository for the patches themselves,
> which (I hope) will simplify the task of developing multiple changes on
> a moving target (the OpenJDK itself).
A sub-repository of the OpenJDK hotspot one would be cool.
I currently use a kijaro.dev.java.net branch
(an SVN repository), but it mixes my patches against the Da Vinci VM with
the properties runtime library that uses it.
>
> Would you like to be a test subject for the new repo. (when it exists)
> and check in the changes yourself?
Yes.
After posting a patch to mlvm-dev, and once the patch is reviewed
(or at least nobody is against it),
I can check it in myself.
Furthermore, I will be happy to review patches from other
members of this list and check them in too if the volume is low.

>
> Best,
> -- John
Rémi
>
> On Mar 22, 2008, at 12:12 PM, Rémi Forax wrote:
>
>> Hi john, hi all,
>>
>> When I tried to compile the patched VM on my laptop (Fedora Core 6),
>> the compiler found two problems: