DRAFT PROPOSAL - Porting the PyPy JIT to JVM and MLVM

Thu Feb 28 02:10:01 PST 2008

Charles Oliver Nutter wrote:

> I'm also interested in this sort of approach for JRuby, but given 
> limitations of code generation, classloading, and method handles under 
> JDK6- I've held off on continuing such work. In an ideal world, it would 
> be trivial and lightweight to iteratively generate call sites, method 
> handles, JITted method bodies, and more to create an increasingly more 
> adapted call pipeline. As it stands, JRuby only can do code generation 
> for method handles at startup and for JITted methods once at runtime, 
> already paying a fairly high permgen and class maintenance cost. The 
> features of the MLVM are expected to lessen this pain.

I suspect that the plain JVM version of the PyPy JIT will suffer from the 
same problems. I haven't run any benchmarks yet, but from what I've heard, 
loading methods is costly on current JVMs. That's why the short-term goal 
of the PyPy JIT is to generate fast code, not to generate code fast :-).

For JRuby, maybe there are ways to reuse the PyPy machinery to 
automatically produce a JIT from it (e.g. by writing a Java frontend to 
the translation toolchain), but of course this would be a veeeery 
long-term goal :-).

> Antonio: Do you have a feel for how much of this work would likely end 
> up producing idiomatic JVM bytecode and how much might require 
> modification to the JVM itself, or put differently how likely it would 
> be that the PyPy JIT would need to target JVM components other than its 
> bytecode interpreter and JIT? I'm growing more interested in the 
> possibility of expanding the capabilities of OpenJDK by making it 
> available in ways other than "good old Java bytecode". The proposal 
> earlier today by Mr. Hughes may also play in this direction...I admit I 
> have not read it yet.

I honestly have no clue yet. In an ideal world, it should be enough to 
produce "good old Java bytecode" and let the VM's JIT do all the needed 
optimizations, but we cannot know in advance, and I suspect it won't be 
enough, i.e. the code produced won't be optimal.

About the idiomatic JVM bytecode: I think the generated code will be more 
or less idiomatic apart from promotion_. By design, the PyPy JIT generates 
a lot of small chunks of code that need to be chained together. In 
particular, to implement promotion we need a "growable switch" to which 
we can add new cases at runtime. Since I don't think it is possible to 
dynamically modify the bytecode of an existing method, we need to think 
of an alternative way to implement it.
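
To make the idea concrete, here is a minimal sketch of one possible 
alternative, assuming the switch is replaced by a lookup table of small 
code chunks rather than a real switch instruction. Every name in it 
(GrowableSwitch, dispatch, the compiler callback) is hypothetical and is 
not part of PyPy or of any JVM API. The lambdas only stand in for chunks 
of generated code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch: a switch whose cases can be added at runtime. */
final class GrowableSwitch {
    // One entry per promoted value; each entry stands in for a chunk
    // of code specialized for that value.
    private final Map<Integer, IntUnaryOperator> cases = new ConcurrentHashMap<>();

    // Invoked on a miss; stands in for the JIT producing a new chunk.
    private final IntFunction<IntUnaryOperator> compiler;

    GrowableSwitch(IntFunction<IntUnaryOperator> compiler) {
        this.compiler = compiler;
    }

    /** Runs the case for the promoted value, creating it on first use. */
    int dispatch(int promoted, int arg) {
        IntUnaryOperator target =
            cases.computeIfAbsent(promoted, v -> compiler.apply(v));
        return target.applyAsInt(arg);
    }

    /** Number of cases added so far. */
    int caseCount() {
        return cases.size();
    }
}
```

With a callback such as v -> (x -> x * v), the first dispatch(3, 5) adds 
a case for 3 and later calls with the same promoted value reuse it. The 
obvious cost is that each case is reached through a map lookup and an 
interface call instead of a jump, and whether the VM's JIT can optimize 
that well is exactly the open question.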

Having a more direct interface to the underlying VM would probably help 
greatly, but I have to admit that I don't know much about what you could 
do with it. E.g., would it let you change the bytecode of an existing 
method?

.. _promotion: http://codespeak.net/pypy/dist/pypy/doc/jit.html#promotion

ciao Anto