DRAFT PROPOSAL - Porting the PyPy JIT to JVM and MLVM

Thu Feb 28 03:06:17 PST 2008

Antonio Cuni wrote:
> I suspect that the plain JVM version of PyPy JIT will suffer from the same 
> problems; I've not yet made benchmarks, but from what I heard it seems 
> that the process of loading methods is costly on the current JVMs. 
> That's why the short term goal of the PyPy JIT is to generate fast code, 
> not to generate code fast :-).

*Very* costly. When run in 100% AOT mode, JRuby takes 50-200% longer to 
start up most Rails apps. At least we don't have it as bad as the CLR 
folks...since their generated code has to also immediately JIT to native 
code, IronPython takes several *minutes* to start up for some apps. But 
I believe we can do a lot more to improve this on the JVM.

This is largely why we interpret at first (the other major reason being 
that I knew I couldn't write the whole compiler at once, so we released 
a partial compiler in JIT mode for 1.0).
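
A minimal sketch of the interpret-first idea in Java, assuming a
hypothetical MixedModeMethod wrapper, a made-up call-count threshold, and
a Supplier standing in for the real compiler (none of these names come
from JRuby). It only shows how the cost of generating and loading code
can be deferred until a method proves to be hot:

```java
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

// Hypothetical wrapper: runs a method through the interpreter until it
// has been called `threshold` times, then compiles it once and switches.
final class MixedModeMethod {
    private final IntUnaryOperator interpreted;          // slow path, no startup cost
    private final Supplier<IntUnaryOperator> compiler;   // stand-in for bytecode generation
    private final int threshold;                         // assumed tuning knob
    private IntUnaryOperator compiled;                   // null until the method is hot
    private int calls;

    MixedModeMethod(IntUnaryOperator interpreted,
                    Supplier<IntUnaryOperator> compiler,
                    int threshold) {
        this.interpreted = interpreted;
        this.compiler = compiler;
        this.threshold = threshold;
    }

    synchronized int call(int arg) {
        if (compiled != null) {
            return compiled.applyAsInt(arg);
        }
        if (++calls >= threshold) {
            // Pay the compile/load cost only now, and only once.
            compiled = compiler.get();
            return compiled.applyAsInt(arg);
        }
        return interpreted.applyAsInt(arg);
    }

    synchronized boolean isCompiled() {
        return compiled != null;
    }
}
```

Methods that run only a handful of times during startup never reach the
threshold, so they never pay for code generation at all.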

> For JRuby, maybe there are ways to reuse the PyPy machinery to 
> automatically produce a JIT from it (e.g. by writing a Java frontend to 
> the translation toolchain), but of course this would be a veeeery long 
> term goal :-).

Hopefully not too long term...at least if we can find ways to feed the 
same structures into your translation toolchain. We discussed what might 
be required when John Rose and I met with the other PyPy folks a few 
months ago.

> I honestly have no clue about it yet. In an ideal world, it should be 
> enough to produce "good old java bytecode" and let the JIT of the VM 
> do all the needed optimizations, but we cannot know in advance, and I 
> suspect it won't be enough, i.e. the code produced won't be optimal.

This may be the case...but I think we can do a lot for Python on the JVM 
with even a naive first attempt at this, especially since the naive 
first attempt with PyPy will probably be pretty good.

> About the idiomatic JVM bytecode: I think the code generated will be more 
> or less idiomatic apart from promotion_: by design, PyPy JIT generates a 
> lot of small chunks of code that need to be chained together; in 
> particular, to implement promotion we need a "growable switch" in which 
> we can add new cases at runtime: since I don't think it is possible to 
> dynamically modify the bytecode of an existing method, we need to think 
> of an alternative way to implement it.

The switch would need to be generated into its own method+class, and 
we'd throw it away and generate a new one whenever a case has to be 
added. Again, permgen hell (every regenerated class takes up permgen 
space until it is unloaded), but it can be managed.
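
A rough sketch of that replace-the-whole-switch approach in Java. This is
not PyPy or JRuby code: SwitchVersion, GrowableSwitch and compileCase are
hypothetical names, and an immutable lookup table stands in for the
generated class so the example stays self-contained. On a miss, the new
case is "compiled", a fresh switch containing all old cases plus the new
one is built, and the old switch is dropped:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;
import java.util.function.IntUnaryOperator;

// Hypothetical stand-in for one generated switch class: once built, its
// set of cases never changes.
final class SwitchVersion {
    private final Map<Integer, IntUnaryOperator> cases;

    SwitchVersion(Map<Integer, IntUnaryOperator> cases) {
        this.cases = Collections.unmodifiableMap(new HashMap<>(cases));
    }

    IntUnaryOperator lookup(int key) {
        return cases.get(key);
    }

    Map<Integer, IntUnaryOperator> copyCases() {
        return new HashMap<>(cases);
    }
}

// The "growable switch": callers always go through the current version,
// which is replaced wholesale whenever a new case is needed.
final class GrowableSwitch {
    // Assumption: stands in for the JIT producing code for a promoted value.
    private final IntFunction<IntUnaryOperator> compileCase;
    private volatile SwitchVersion current = new SwitchVersion(new HashMap<>());
    private int generations = 1; // how many switch versions have been built

    GrowableSwitch(IntFunction<IntUnaryOperator> compileCase) {
        this.compileCase = compileCase;
    }

    int dispatch(int key, int arg) {
        IntUnaryOperator target = current.lookup(key);
        if (target == null) {
            target = grow(key); // miss: build a new switch with this case
        }
        return target.applyAsInt(arg);
    }

    private synchronized IntUnaryOperator grow(int key) {
        IntUnaryOperator existing = current.lookup(key);
        if (existing != null) {
            return existing; // another thread added the case first
        }
        IntUnaryOperator body = compileCase.apply(key);
        Map<Integer, IntUnaryOperator> next = current.copyCases();
        next.put(key, body);
        current = new SwitchVersion(next); // old version becomes garbage
        generations++;
        return body;
    }

    synchronized int generations() {
        return generations;
    }
}
```

In a real implementation each version would be a freshly generated class,
presumably loaded through its own throwaway ClassLoader so that the
discarded versions can actually be unloaded.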

> Having a more direct interface to the underlying VM would probably help 
> greatly, but I have to admit that I don't know much of what you could do 
> with it. E.g., could it give you the possibility to change the bytecode 
> of an existing method?

I believe this is possible already through some of the debugging 
interfaces, and potentially could be made generally applicable with a 
bit of tweaking.

- Charlie