VM thread pool (was: 3 questions)

Fri Oct 12 21:01:48 PDT 2007

Peter,

One answer, or more of a comment :), to #2.

There is no existing native thread pool in the VM. I'm sure it must have 
been considered though, ... which means there are probably some 
non-obvious "gotchas" waiting out there. Someone else may be able to 
chime in with more info there.

If you are going to try this then a couple of things you need to watch for:

1. Cleanup and reinitialization of thread-local state (at the native 
level, I mean, not Java ThreadLocal).

2. Removing the threads from the ThreadsList while they are idle and 
adding back when needed. (As you need the ThreadsLock for this you might 
as well use it to protect the queue of pool threads too.)

3. I'm not sure I see the need for a new entry point - as long as these 
are always going to execute Java threads. You just need to turn the 
existing entry into a loop that will return to waiting when a 
logical-thread has completed - though some of the initialization done 
when a new thread is created will have to be moved to the thread itself 
as part of the entry logic. I don't see that you would want to "jump" to 
a new entry rather than just calling it and returning normally.

Other than the TLS issue I think this would be a fairly straightforward 
exercise. The complexities come with all of the thread management 
policies you might want, and how to expose them - as with 
ThreadPoolExecutor. :)

Cheers,
David Holmes

PS. I'll be traveling over the next few days so if there's any follow-up 
I may not see it for a while.

Peter Helfer said the following on 13/10/07 05:51 AM:
> Hi all
> 
> 1) How far off is the public release of the disassembler, 
> specifically for x86?
> 
> 2) I'd like to allocate a pool of threads (JavaThreads) in the VM, and 
> keep them waiting until I figure out what entry point they should take.
> 
> Now the plan would be to keep the threads preallocated, and let them 
> wait on a condition variable to be released. It seems only 
> JavaThread(ThreadFunction entry_point, size_t stack_size = 0) is to be 
> used, the other constructor is only for the main thread & jni, right ?
> 
> Now I would provide a function matching 'typedef void 
> (*ThreadFunction)(JavaThread*, TRAPS)' to the constructor, add to my 
> pool (just an array so far), and invoke Thread::start():
> 
> 
> // assume JavaThread is extended by
> // - a monitor (runtime/mutex.hpp) '_sleepVar'
> // - an additional entry point of type address '_newentry' which should
> //   be either the entry point of the interpreter when jumping back from
> //   a method (assuming that bcp is correctly updated), or any instruction
> //   in a compiled version of a java method. I assume (for now) that the
> //   frame is correctly initialized to continue at that point.
> 
> while(true){
>       _sleepVar.wait(no_safepoint_check = false, timeout = 0, 
> as_suspend_equivalent = !_as_suspend_equivalent_flag); 
>       // what is that last flag doing ?
> 
>       if(_newentry != NULL){
>             // In GCC AT&T syntax: jump to _newentry (clobbers eax).
>             // Extended asm needs %%eax for the register name and a
>             // leading '*' for an indirect jump.
>             asm ("movl %0, %%eax; \n\t"
>                  "jmp *%%eax"
>                  :                 /* output: none */
>                  : "r"(_newentry)  /* input: _newentry */
>                  : "%eax"          /* clobbered register */
>                  );
>       }
> 
>       // return point of function
>       _newentry = NULL;
>      
>       // do some housecleaning
>       run_housecleaning();
> 
> }
> 
> 
> .. and some starting function:
> 
> bool start_entrypoint(address entrypoint){
>      assert(entrypoint != NULL, "entry point must not be NULL");
>      JavaThread* thread = _singleton_pool.getThread();
>      if(thread != NULL){
>         thread->set_new_entry(entrypoint);  // setter for entry point
>         thread->getSleepVar()->notify();    // getter for sleep var
>         return true;
>      }
>      return false;
> }
> 
> 
> Does this look feasible, or is there a better way to go? Is there a 
> thread pool around (apart from java.util.concurrent.Executor et al.)?
> 
> 
> 3)
> I know that the interpreter jumps away using jump_from(Method, temp) to 
> jump to either the compiled entry (_code->entry()) or again the 
> interpreter (_i2i_entry, _from_compiled initially). This entry 
> corresponds to the type of method (native, synchronized, accessors, 
> empty, intrinsic aka math functions, or zerolocals aka normal), and has 
> been determined at link time (methodOopDesc::link_method).
> 
> I also know that many return stubs are generated in order to jump 
> back into the interpreter and pick up where it left off, as described in
> AbstractInterpreterGenerator::generate_return_entry_for(TosState state, 
> int step) and stored into 'static Entrypoint 
> Interpreter::_return_entry[number_of_return_entries = 9]'.
> 
> If I'm not totally mistaken, the _return_entry[3] are for invokespecial, 
> static, virtual and [5] for invokeinterface, because of:
> 
> address AbstractInterpreter::return_entry(TosState state, int length) {
>   guarantee(0 <= length && length < 
> Interpreter::number_of_return_entries, "illegal length");
>   return _return_entry[length].entry(state);
> }
> 
> .. and in TemplateTable_i486.cpp, prepare_invoke:
> 
>   // compute return type
>   __ shrl(flags, ConstantPoolCacheEntry::tosBits);
>   // Make sure we don't need to mask flags for tosBits after the above shift
>   ConstantPoolCacheEntry::verify_tosBits();
>   // load return address
>   { const int table =
>       is_invokeinterface
>       ? (int)Interpreter::return_5_addrs_by_index_table()
>       : (int)Interpreter::return_3_addrs_by_index_table();
>     __ movl(flags, Address(noreg, flags, Address::times_4, table));
>   }
> 
>   // push return address
>   __ pushl(flags);
> 
>   // Restore flag value from the constant pool cache, and restore rsi
>   // for later null checks.  rsi is the bytecode pointer
>   if (save_flags) {
>     __ movl(flags, rsi);
>     __ restore_bcp();
>   }
> 
> 
> So this code determines by checking the TosBits in the child method, 
> what kind of return value it has to expect, computes the offset in 
> either return_X_addrs_by_index_table and pushes that value on the stack ?
> So this means that it expects the result of that method in RAX (+RDX for 
> long/double), regardless of whether the child method is compiled or 
> interpreted ?
> 
> 
> Now if I wanted to reroute the return call, I could change this pushed 
> return address to another stub, which would save the result (RAX/RDX), 
> do some freaky stuff like call the VM again, and finally return to the 
> entry that was exchanged beforehand ?
> 
> 
> Thanks, Peter
> 
> 
> 
> 
> 
> PS @Steve: your hint helped really well, thanks!
> To bring Steve's answer again to the list - I had to add the save of the 
> BCP before leaving it, otherwise the assertion would fail in 
> methodOop::bcp_from(int bci)
> 
> void InterpreterMacroAssembler::jump_from_interpreted(Register method, 
> Register temp) {
>   if (MyMagicEnabled) {
>     Label ignore;
> 
>     cmpw(Address(method, methodOopDesc::myFlag_offset()), myFlagValue);
>     jcc(Assembler::aboveEqual, ignore);
> 
>     // save the current BCP into the frame; this allows jumping
>     // into the VM
>     save_bcp();
>     call_VM(temp, CAST_FROM_FN_PTR(address, 
>             MyCode::setMyFlagValueRight), temp, true);
> 
>     bind(ignore);
>   }
>    // add the custom code BEFORE moving the last_sp into place
> 
>    // set sender sp
>   leal(rsi, Address(rsp, wordSize));
>   // record last_sp
>   movl(Address(rbp, frame::interpreter_frame_last_sp_offset * wordSize), 
> rsi);
> 
>   // here is the JVMTI code in between
> 
>   // finally jump!
>   jmp(Address(method, methodOopDesc::from_interpreted_offset()));
> }