FORK

Fri Apr 25 02:22:10 UTC 2014

Thanks for moving the thread, John!

On Fri, Apr 25, 2014 at 3:38 AM, John Rose <john.r.rose at oracle.com> wrote:
>> * Dalvik has shown what you can do with a "larval" preforking setup.
>> This is a big reason why Android apps can run in such a small amount
>> of memory and start up so quickly.
>
> They had to do a lot of work to segregate sharable stuff from non-sharable.  I think it would be a useful exercise for us, but it is difficult.  We have pointers everywhere, and every pointer is a chance to break sharing (if it points to something that a process needs to move).

Perhaps I'll betray my limited knowledge of kernel-level memory
management here, but I'm confused by this. Wouldn't all such pointers
be indirected through the kernel's virtual memory tables? And wouldn't
any moves be done transparently behind those tables? I'm thinking in
terms of vfork here, of course.

I mean...I know Rubinius isn't going back and fixing up pointers after
a fork, and they've got plenty of direct memory references in both C++
and jitted code.
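
To make that concrete, here is a minimal sketch in Python rather than
anything taken from Rubinius or OpenJDK. It assumes CPython on a POSIX
system, where id() happens to return an object's address; the helper
name address_in_child is made up for illustration. The child does no
fix-up pass and still sees the object at the same virtual address:

```python
import os

def address_in_child(obj):
    """Fork, and return the address the child process sees for obj."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: nothing has been relocated or patched; just report and exit.
        os.close(read_fd)
        os.write(write_fd, str(id(obj)).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: read the child's answer, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return int(data)

if __name__ == "__main__":
    runtime_state = {"booted": True}
    print(id(runtime_state) == address_in_child(runtime_state))
```

This only shows that existing references stay valid in the child. It
says nothing about how many pages remain physically shared once either
side starts writing or moving objects.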

To be clear, I'm also not (initially) interested in copy-on-write
capabilities, though there are interesting opportunities there, nor am
I interested in forking as a means to have many small child processes
sharing a large amount of read-only memory. I just want to easily
carry bootstrapped runtime + jitted code to the children.
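
The shape of what I'm after can be sketched in a few lines of Python
(an illustration only, assuming a POSIX system; boot_runtime and
run_in_fork are hypothetical names, and the dictionary stands in for
a booted runtime plus jitted code). The parent pays the startup cost
once, and every child starts with that state already in memory:

```python
import os

def boot_runtime():
    """Stand-in for expensive startup work (hypothetical)."""
    return {n: n * n for n in range(100_000)}

def run_in_fork(runtime, key):
    """Fork a child that answers one lookup using the inherited runtime."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        # The child never calls boot_runtime(); it inherits the parent's memory.
        os.write(write_fd, str(runtime[key]).encode())
        os.close(write_fd)
        os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = reader.read()
    os.waitpid(pid, 0)
    return int(result)

if __name__ == "__main__":
    runtime = boot_runtime()        # pay the startup cost once
    for key in (3, 12, 99):         # each "command" gets its own process
        print(key, run_in_fork(runtime, key))
```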

> They also have the luxury of running on exactly one Unix OS, whose code they can adjust.  Java runs everywhere, so it cannot easily make deep demands on the OS, like "don't give performance surprises when we fork our VM".

I'm also not necessarily interested in this as a standard or official
Java/JVM feature. I am interested in exactly one case: being able to
preboot OpenJDK and fork it. I'd even be satisfied if it only worked
on a few platforms initially, because 99% of our users are on OS X,
Linux64, or both.

> But, two other reasons to work on this problem are data sharing (class and/or application) and AOT compilation; both win bigger to the extent they can work with untransposed data, directly out of a file (= shared RO memory).

Class data sharing and AOT address some portion of fork's use cases,
indeed. For my purposes, those would be enough. But the application
data sharing, even as a one-off fork, is a commonly-exploited case
among Rubyists, Pythonistas, UNIXers, and so on. Not being able to
fork *at all* seems more and more like going against the grain.

>> * Startup time! If we could fork an already-hot JVM, we could hit the
>> ground running with *every* command, *and* still have truly separate
>> processes.
>
> Yup.  How much does nailgun address this use case, already?  That is, what don't you get today with pre-warmed JVMs?

Addressed by Peter, but I'll add a few other nasty bits:

* Native-level code (e.g. FFI, file descriptors, ...) is impossible to
sort out under nailgun
* Nailgun clients don't (can't) propagate TTY to the pseudo-processes,
which rules out a large class of command-line uses...exactly why we
wanted Nailgun to begin with.
* Signal handling...totally effed.

A better alternative these days is Drip, which boots the next JVM
while the current command is running. Great return-on-investment, but
you *really* need to make sure that prebooting is accompanied by
something to warm up the JVM, and that's usually very app-specific.

>> * There's a lot of development and scaling patterns that depend on
>> forking, and we get constant questions about forking on JRuby.
>
> Over time, we have experimented at length with both forking and single-process multi-tasking (MVM).  Since MVM doesn't make as many demands on the operating system for complex operations, we are more comfortable with that approach.

MVM only gets as far as Nailgun does in this respect. I'd love to see
it for other reasons, but it has all the same problems.

>> * Rubinius -- a Ruby VM with partially-concurrent GC, a
>> signal-handling thread, JIT threads, and real parallel Ruby threads --
>> supports forking. They bring the threads to a safe point, fork, and
>> restart them on the other side. Color me jealous.
>
> Me too.  It's a cute move.  How many OS's does that trick work well on?

They test on, and have users on, several Linux and BSD variants, and
most Rubinius users run OS X for dev. Of course, the lack of issue
reports is not proof that there are no bugs in the approach, but I've
only ever seen people report mundane forking bugs (threads not getting
paused or restarted exactly right, etc.), rather than anything severe.
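
The pause/fork/restart dance can be sketched roughly like this in
Python. This is not Rubinius's implementation, just the general shape,
assuming a POSIX system; Ticker and fork_with_threads are hypothetical
names, and "parking" here simply means stopping and joining the thread
so that only the forking thread is running at the moment of the fork:

```python
import os
import threading

class Ticker:
    """A background thread that can be parked before a fork and restarted after."""

    def __init__(self):
        self.ticks = 0
        self._halt = threading.Event()
        self._thread = None

    def _run(self):
        # Tick until asked to halt.
        while not self._halt.wait(0.001):
            self.ticks += 1

    def start(self):
        self._halt.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def park(self):
        # Bring the thread to a known stopping point and wait for it to finish.
        self._halt.set()
        self._thread.join()
        self._thread = None

    def running(self):
        return self._thread is not None and self._thread.is_alive()

def fork_with_threads(ticker):
    """Park the thread, fork, restart it on both sides; return the child's report."""
    ticker.park()                   # no background thread is live across the fork
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        ticker.start()              # restart in the child
        os.write(write_fd, b"1" if ticker.running() else b"0")
        os.close(write_fd)
        os._exit(0)
    os.close(write_fd)
    ticker.start()                  # restart in the parent
    with os.fdopen(read_fd) as reader:
        child_ok = reader.read() == "1"
    os.waitpid(pid, 0)
    return child_ok
```

The mundane bugs mentioned above live in exactly the two steps this
sketch makes look trivial: getting every thread parked before the fork
and getting every one of them restarted afterwards.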

>> So...given that OpenJDK is rapidly expanding into smaller-profile
>> devices and new languages and development patterns, perhaps it's time
>> to make it fit into the UNIX philosophy. Where do we start?
>
> In my opinion, with CDS and AOT.
>
> And with squinting suspiciously at our data structures.  (But when I say that it scares folks; who knows where that will end!? :-) )
>
> But a forkable-VM experiment, as a patch within MLVM project, would be a lightweight and easy thing to try.  It might not produce a usable result, but would at least illuminate the problem areas.

Now I need to find a patsy -- er, hero -- to work on such a patch
under my dubiously helpful guidance :-)

- Charlie