Review request for 6977034 Thread.getState() very slow

Mandy Chung mandy.chung at oracle.com
Tue Dec 7 02:22:26 UTC 2010


  On 12/6/10 5:27 PM, Robert Lougher wrote:
> Hi Mandy,
>
> On 6 December 2010 19:26, Mandy Chung<mandy.chung at oracle.com>  wrote:
>> Remi, Eamonn, Brian, David, Doug,
>>
>> Thanks for the feedback.
>>
> I don't know if you welcome external feedback, but I'd like to point
> out (if you're not already aware) that this change modifies the VM
> interface.

Any feedback is welcome.   That's the intent to post this for the 
openjdk discussion.  Thanks for the information that this change will 
break CACAO of OpenJDK.
> While I'm cognisant of the reason for the change, my understanding of
> the existing map mechanism is that it makes the JDK class library
> independent of the underlying VM thread status values.  The value of
> Thread.threadStatus is opaque, with the mapping from VM thread status
> being determined by the following VM interface functions (see
> hotspot/src/share/vm/prims/jvm.h):
>
> ------------------
> /*
>   * Returns an array of the threadStatus values representing the
>   * given Java thread state.  Returns NULL if the VM version is
>   * incompatible with the JDK or doesn't support the given
>   * Java thread state.
>   */
> JNIEXPORT jintArray JNICALL
> JVM_GetThreadStateValues(JNIEnv* env, jint javaThreadState);
>
> /*
>   * Returns an array of the substate names representing the
>   * given Java thread state.  Returns NULL if the VM version is
>   * incompatible with the JDK or the VM doesn't support
>   * the given Java thread state.
>   * values must be the jintArray returned from JVM_GetThreadStateValues
>   * and javaThreadState.
>   */
> JNIEXPORT jobjectArray JNICALL
> JVM_GetThreadStateNames(JNIEnv* env, jint javaThreadState, jintArray values);
>
> ------------------
>
> These two functions are used by the native method
> sun.misc.VM.getThreadStateValues to setup the arrays which are then
> used to initialise the map.
>
> This change breaks this abstraction, and requires Thread.threadStatus
> to be a JVM TI thread state (which happens to match Hotspot's internal
> thread state).  This change will therefore break any VM which does not
> have an internal thread state based on JVM TI.
>

Indeed this fix will depend on this private interface between VM and JDK 
to set the threadStatus as JVM TI thread state.  The VM implementation 
of setting Thread.threadStatus hasn't been changed since 1.5 and 
performance required by the profiling tools was one of the reasons done 
in that way.

> As far as I'm aware, IKVM and CACAO are currently the only other users
> of OpenJDK (I'm also nearing completion of a port to JamVM).
> Unfortunately, from looking at CACAO I can see that this change will
> break it.  It may also break IKVM, but I haven't checked.  I, of
> course, can modify the internal thread states of JamVM to be
> compatible.
>
> I'm CC'ing CACAO's mailing list and GNU Classpath so that affected
> parties can be made aware of this change.

Thanks for doing that.

> As an aside, will there be
> any later clean-up of the native method implementation and VM
> interface?

I don't know of the project status.  I suggest you to post this question 
to the common VM interface openjdk project:
    http://openjdk.java.net/projects/cvmi/
    cvmi-dev at openjdk.java.net

Thanks
Mandy
> Thanks,
> Rob.
>
>> On 12/04/10 04:22, Eamonn McManus wrote:
>>
>> Hi Mandy,
>>
>> This test:
>>
>>          if ((threadStatus&  JVMTI_THREAD_STATE_RUNNABLE) == 1) {
>>
>> is always false, since JVMTI_THREAD_STATE_RUNNABLE is 4. (NetBeans 7.0
>> helpfully flags this; I'm not sure if earlier versions do.)
>>
>> Good catch.   This explains why the speed up for RUNNABLE was not as high in
>> the microbenchmark measurement.  Correcting it shows that Thread.getState()
>> gets 3.5X speed up on a thread in RUNNABLE state.
>>
>> But, once corrected, I think you could use this idea further to write a much
>> simpler and faster method, on these lines:
>>
>>      public static Thread.State toThreadState(int threadStatus) {
>>          if ((threadStatus&  JVMTI_THREAD_STATE_RUNNABLE) != 0) {
>>              return RUNNABLE;
>>          } else if ((threadStatus&
>> JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) != 0) {
>>              return BLOCKED;
>>          } else if ((threadStatus&  JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT)
>> != 0) {
>>              return TIMED_WAITING;
>>          } else if ((threadStatus&  JVMTI_THREAD_STATE_WAITING_INDEFINITELY)
>> != 0) {
>>              return WAITING;
>>          } else if ((threadStatus&  JVMTI_THREAD_STATE_TERMINATED) != 0) {
>>              return TERMINATED;
>>          } else {
>>              return NEW;
>>          }
>>      }
>>
>> I forgot to mention in the email that I implemented this simpler approach to
>> compare with the table lookup approach.   There were no significant
>> difference.  I now rerun with the corrected fix (checking != 0 rather than
>> == 1) and the table lookup approach is about 2-6% faster than the sequence
>> of tests approach.
>>
>> I am also for the simpler approach but I post the table lookup approach as a
>> proposed fix to get any opinion on the performance aspect with that
>> approach.
>>
>> Given that the Fork-Join framework doesn't depend on it, I will go for a
>> simpler approach (sequence of tests) and further tune its performance when
>> there is a use case requiring a perf improvement.
>>
>> New webrev:
>>     http://cr.openjdk.java.net/~mchung/6977034/webrev.01/
>>
>> Can you review this version?
>>
>> Thanks
>> Mandy
>>
>> You could tweak the order of the tests based on what might be the relative
>> frequency of the different states but it probably isn't worth it.
>>
>> Regards,
>>
>> Éamonn
>>
>> On 3/12/10 11:52 PM, Mandy Chung wrote:
>>
>> Fix for 6977034: Thread.getState() very slow
>>
>> Webrev at:
>>     http://cr.openjdk.java.net/~mchung/6977034/webrev.00/
>>
>> This is an improvement to map a Thread's threadStatus field to
>> Thread.State.  The VM updates the Thread.threadStatus field directly at
>> state transition with the value as defined in JVM TI [1].  The
>> java.lang.Thread.getState() implementation can directly access the
>> threadStatus value and do a direct lookup from an array of Thread.State.
>> The threadStatus value is a bit vector and we would have to create an array
>> of a minimum of 1061 (0x425) elements to do direct mapping.   I took the
>> approach to use the first highest order bit set to 1 in the masked
>> threadStatus value as the index to the Thread.State element and only caches
>> 32 elements (could be fewer).  I wrote a micro-benchmark measuring the
>> Thread.getState of a thread in different state that shows 1.7X to 6X speedup
>> (see below).  There is possibly some issue with my micro-benchmark that I
>> didn't observe the 14X speed up as Doug did in his experiment.  However, I'd
>> like to get this reviewed and pushed to the repository so that anyone can do
>> more experiment on the performance measurement.
>>
>> Thanks
>> Mandy
>> P.S. The discussion on this thread can be found at [2] [3].
>>
>> [1]
>> http://download.java.net/jdk7/docs/platform/jvmti/jvmti.html#GetThreadState
>> [2]
>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-July/004567.html
>> [3]
>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2010-August/004721.html
>>
>>
>> 	JDK 7 b120 (in ms)	With fix (in ms)	Speed up
>> main		46465	        22772			2.04
>> NEW		50676		29921			1.69
>> RUNNABLE	42202		14690			2.87
>> BLOCKED		72773		12296			5.92
>> WAITING		48811		13041			3.74
>> TIMED_WAITING	45737		12849			3.56
>> TERMINATED	40314		16376			2.46
>>
>>
>>




More information about the core-libs-dev mailing list