Hello, and other things

Fri Mar 14 16:53:36 PDT 2008

Quick pointer to a project a co-worker told me about a while back:

http://www.xwt.org/mips2java/
http://www.thisiscool.com/mips2java.htm

-Ken

John Rose wrote:
> On Feb 29, 2008, at 4:53 PM, Jason Fordham wrote:
> 
>> I started thinking about targeting GCC for the JVM last week.
> 
> That's a neat project!
> 
> I have heard of JVMs being used to simulate very small assembly-level
> systems, on the order of 16-bit computers.  The challenges with this
> come from building in a second level of virtualization.  The execution
> of the simulated unsafe CPU is hard to integrate with the JVM's libraries.
> 
>> It quickly became clear that the JVM instruction set is designed to make
>> the C programming model difficult: the separation of bytecodes, stacks,
>> frames, and object space, and the generally unconvertible addressType
>> quickly led me to a model where the JVM stacks are ignored except for
>> primitive operations, while memory - for data, bss and heap - is modeled
>> in a large array. In order to model C's function calls by pointer, I
>> figured a handle pair, class and method, hashing the strings, with a
>> linking stage after compilation to perform fixup - much as I imagine
>> slide 17 in the LangNet presentation implies.
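
A minimal sketch of the "memory in a large array" model described above, assuming a little-endian byte layout and 32-bit simulated addresses. The class name FlatMemory and its methods are hypothetical and do not come from any existing GCC backend.

```java
// Hypothetical sketch: C data, bss and heap modeled as one flat byte array.
// Little-endian layout and int-sized addresses are assumptions.
final class FlatMemory {
    private final byte[] mem;

    FlatMemory(int sizeInBytes) {
        mem = new byte[sizeInBytes];
    }

    // Load a 32-bit value from a simulated C address.
    int loadInt(int addr) {
        return (mem[addr] & 0xFF)
             | (mem[addr + 1] & 0xFF) << 8
             | (mem[addr + 2] & 0xFF) << 16
             | (mem[addr + 3] & 0xFF) << 24;
    }

    // Store a 32-bit value at a simulated C address.
    void storeInt(int addr, int value) {
        mem[addr]     = (byte) value;
        mem[addr + 1] = (byte) (value >>> 8);
        mem[addr + 2] = (byte) (value >>> 16);
        mem[addr + 3] = (byte) (value >>> 24);
    }

    // Byte access gives C-style aliasing of the same storage.
    byte loadByte(int addr) {
        return mem[addr];
    }

    void storeByte(int addr, byte value) {
        mem[addr] = value;
    }
}
```

Because every access goes through a Java array, a stray simulated pointer raises ArrayIndexOutOfBoundsException instead of corrupting the JVM.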
> 
> I agree that method handles will help with this sort of thing.
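
One way to picture C function pointers on top of method handles is a table that maps small integers to handles. The sketch below uses the java.lang.invoke API as it exists in later JDKs (Java 7 and up), which postdates this thread. The FunctionTable class, the index-0-is-NULL convention, and the single (int)int signature are assumptions made for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a C function pointer is an int index into this table.
final class FunctionTable {
    private final List<MethodHandle> table = new ArrayList<>();

    FunctionTable() {
        table.add(null); // index 0 plays the role of the NULL function pointer
    }

    // The "fixup" step: bind a compiled method and hand back its pointer value.
    int register(MethodHandle target) {
        table.add(target);
        return table.size() - 1;
    }

    // Indirect call for the C type int (*)(int).
    int callIntInt(int fnPtr, int arg) {
        MethodHandle mh = table.get(fnPtr);
        if (mh == null) {
            throw new NullPointerException("call through NULL function pointer");
        }
        try {
            return (int) mh.invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Stand-in for a compiled C function.
    static int twice(int x) {
        return 2 * x;
    }

    static int demo() {
        try {
            FunctionTable ft = new FunctionTable();
            MethodHandle mh = MethodHandles.lookup().findStatic(
                FunctionTable.class, "twice",
                MethodType.methodType(int.class, int.class));
            int fp = ft.register(mh);
            return ft.callIntInt(fp, 21);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real backend would need one call helper per C function type, or a more general calling convention; that part is left out here.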
> 
> The hard part, though, is the essentially untyped nature of C memory.
> I've seen C implementations that run over typed heaps, but they
> are artful compromises, rather than simple ports to a new backend.
> Centerline C and Zeta-C come to mind.  (Both are old projects that
> may pre-date the Google cache.  I don't have references handy.)
> 
> The latter was a C compiler for the Symbolics Lisp Machine which
> used ordered pairs (cons cells) for all C pointers, to represent the
> combination of a base address and an arbitrary offset.
> A similar product was Bounds-Check C, which widened
> pointers into little 3-tuples (min, max, cur).  The idea is
> that a tuple-based pointer will never be allowed to "reach
> beyond" the heap object it was created for; such operations
> are always indeterminate, since there is no guaranteed
> distance (or ordering) of heap objects, from one instruction
> to the next, in a system like the Symbolics with a powerful GC.
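
The pair-based pointer idea can be sketched in Java roughly as follows. This is an illustration of the general technique, not a reconstruction of Zeta-C or Bounds-Check C; the FatPointer name and the choice of a byte[] as the heap object are assumptions.

```java
// Hypothetical sketch: a C pointer as (base object, offset).
final class FatPointer {
    final byte[] base; // the heap object this pointer was created for
    final int offset;  // arbitrary offset, possibly out of range

    FatPointer(byte[] base, int offset) {
        this.base = base;
        this.offset = offset;
    }

    // Pointer arithmetic never leaves the original object.
    FatPointer add(int delta) {
        return new FatPointer(base, offset + delta);
    }

    // Dereference: the JVM's array bounds check rejects out-of-range offsets.
    byte load() {
        return base[offset];
    }

    void store(byte value) {
        base[offset] = value;
    }

    // Pointer difference is only defined within one object.
    int minus(FatPointer other) {
        if (base != other.base) {
            throw new IllegalArgumentException("pointers into different objects");
        }
        return offset - other.offset;
    }
}
```

Here arithmetic past the end of an object is allowed, as in C, but dereferencing such a pointer traps, and subtracting pointers into different objects is rejected rather than yielding a meaningless distance.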
> 
> That would work very nicely on the JVM also.  You could use
> the sun.misc.Unsafe API (with great care!) to handle punning
> among memory-resident primitive types.  You must avoid
> using Unsafe to pun between primitives and references, because
> there is absolutely no way to control when the GC might want
> to move things around underneath your code.
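
Punning among primitive types can also be illustrated without Unsafe. The sketch below swaps in java.nio.ByteBuffer, a supported API, in place of sun.misc.Unsafe; the Punning class name, the buffer size, and the little-endian byte order are assumptions. It only touches primitives and never references, in line with the warning above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: the C idiom *(int *)&f, using ByteBuffer, not Unsafe.
final class Punning {
    // Write a float into "memory", then read the same four bytes as an int.
    static int floatBitsViaMemory(float f) {
        ByteBuffer mem = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        mem.putFloat(4, f);   // store at simulated address 4
        return mem.getInt(4); // reinterpret the same bytes
    }
}
```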
> 
>> The key obstacles I see are that the instruction set makes implementing
>> a C-like stack expensive: there are no neat push and pop operations for
>> this memory model; it feels like microcoding. Though I understand the
>> motivation, which is to protect the bytecodes from malicious or lazy use
>> of buffer overflows, and other mechanisms for executing data.
> 
> The stack is really just a shorthand for operand renaming.
> Feel free to generate code to a register-to-register machine,
> and map your virtual registers to JVM locals.
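
As a rough picture of the register-to-register suggestion, here is Java source written the way a compiler might emit it, with each virtual register held in a JVM local. The register names r0 to r2 and the pseudo-assembly in the comments are invented for illustration.

```java
// Hypothetical sketch: virtual registers mapped one-to-one onto JVM locals.
final class RegisterStyle {
    // C source: int gcd(int a, int b)
    //           { while (b != 0) { int t = a % b; a = b; b = t; } return a; }
    static int gcd(int r0, int r1) {
        int r2;
        while (r1 != 0) {   // loop: beqz r1, done
            r2 = r0 % r1;   //       rem  r2, r0, r1
            r0 = r1;        //       mov  r0, r1
            r1 = r2;        //       mov  r1, r2
        }
        return r0;          // done: ret  r0
    }
}
```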
> 
>> I like the method handle mechanism, for a variety of reasons, and I
>> would like to see some easing up on where the stack is located so that
>> operations which index into the stack are more flexible, and fast. Is
>> this possible?
> 
> If you need a memory-resident stack, you can just build an array
> to hold it, can't you?  I'm not sure where the pain point is here, yet.
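
A memory-resident stack held in an array might look like the sketch below. The ArrayStack name, the upward growth direction, and the int-sized slots are assumptions; indexed access by depth stands in for the "index into the stack" operations asked about.

```java
// Hypothetical sketch: a C-like stack kept in a plain int array.
final class ArrayStack {
    private final int[] slots;
    private int sp; // index of the next free slot; the stack grows upward

    ArrayStack(int capacity) {
        slots = new int[capacity];
    }

    void push(int value) {
        slots[sp++] = value;
    }

    int pop() {
        return slots[--sp];
    }

    // Indexed read relative to the top of stack (depth 0 is the top).
    int peek(int depth) {
        return slots[sp - 1 - depth];
    }

    // Indexed write relative to the top of stack.
    void poke(int depth, int value) {
        slots[sp - 1 - depth] = value;
    }
}
```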
> 
> Best wishes,
> -- John
> 
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev