ThreadMXBean::getCurrentThreadAllocatedBytes

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Jul 13 20:46:12 UTC 2018


On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
> On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>> Hello,
>>
>> I am trying to use ThreadMXBean::getThreadAllocatedBytes 
>> (com.sun.management) to get the amount of allocated memory of the 
>> current thread in some performance critical code.
>>
>> Unfortunately, the current implementation can be rather slow and the 
>> duration of each call unpredictable. I ran a test in a JVM with 500 
>> threads. Depending on which thread was queried, 
>> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>>
>> The root cause of the problem is 
>> ThreadsList::find_JavaThread_from_java_tid which performs a linear 
>> scan through all Java threads in the current process. The more 
>> threads a JVM has, the slower it gets. In the worst case, the thread 
>> with the given TID is found as the last entry in the list.
>>
>> Before Java 10, the oldest thread is the slowest one to query.
>> Since Java 10, the youngest thread is the slowest one to query. I 
>> think this was a side effect of introducing "Thread Safe Memory 
>> Reclamation (Thread-SMR) support".
>>
>>              Oldest Thread   Youngest Thread
>> Java 8             8740 ns             76 ns
>> Java 10             109 ns           2485 ns
>
> It is good to see that the longest search is much faster. Erik and Robbin
> will be pleased since speeding up traversal of the ThreadsList was one
> of the things that we tried to do during the Thread-SMR project.
>
> A first step is to get a new bug filed that documents the issue with
> ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
> will take care of that.
>
> Dan
>
>
>> A common use case is to query the metric for the current thread (e.g. 
>> before and after performing some operation). This case can be 
>> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>>
>> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by 
>> using the new method I saw the following improvements in my test:
>>              Oldest Thread   Youngest Thread
>> Proposal             37 ns             37 ns
>>
>> This is a 60x improvement over the worst case of the current API. In 
>> the best case of the current API, the new method is still 3 times faster.
>>
>> // based on JVM_SetNativeThreadName in jvm.cpp.
>> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env,
>>                                                      jobject currentThread))
>>   // We don't use a ThreadsListHandle here because the current thread
>>   // must be alive.
>>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>>   if (thread == thr) {
>>     // only supported for the current thread
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> The proposed method also fixes the problem that 
>> getThreadAllocatedBytes itself allocates some memory on the current 
>> thread (two long arrays, 24 bytes) and therefore can slightly skew 
>> measurements. The new method, getCurrentThreadAllocatedBytes, returns 
>> exactly the same value if it is called twice without allocating any 
>> memory between those calls.
>>
>> I also built a variation of this method that could be used to query 
>> allocated memory more efficiently for anyone who already has a 
>> java.lang.Thread object:
>>
>> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env,
>>                                               jobject threadObj))
>>   // based on code proposed in threadSMR.hpp
>>   ThreadsListHandle tlh;
>>   JavaThread* thr = NULL;
>>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj,
>>                                                        &thr, NULL);
>>   if (is_alive) {
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> This method took 70 ns in my test, which is 85% slower 
>> than GetCurrentThreadAllocatedMemory but still 30% faster than the 
>> best case of the current API. I currently have no immediate need for 
>> this second method, but I think it would also be a valuable addition 
>> to the API.
>>
>> I attached a patch for getCurrentThreadAllocatedBytes. I can create a 
>> second patch for also adding 
>> getThreadAllocatedMemory(java.lang.Thread) to the API.
>>
>> I am a first time contributor and I am not 100% sure what process I 
>> must follow to get a change like this into OpenJDK. Can someone have 
>> a look at my proposal and help me through the process?
>>
>> Best regards,
>> Markus
>>
>
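
For reference, the before/after measurement pattern described in the quoted
message can be sketched against the existing com.sun.management API roughly
as follows. This is only an illustration: the class name AllocationProbe, the
helper allocatedDuring, and the sink field are made up for the example, and
the cast assumes a HotSpot-based JDK where the platform ThreadMXBean
implements the com.sun.management extension.

```java
import java.lang.management.ManagementFactory;

public class AllocationProbe {
    // Assumption: on HotSpot the platform bean implements the
    // com.sun.management.ThreadMXBean extension interface.
    private static final com.sun.management.ThreadMXBean THREADS =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    // Hypothetical sink that keeps the test allocation reachable.
    static volatile byte[] sink;

    /**
     * Returns the bytes allocated by the current thread while running the
     * task, or -1 if the JVM does not support or has disabled the metric.
     */
    static long allocatedDuring(Runnable task) {
        if (!THREADS.isThreadAllocatedMemorySupported()
                || !THREADS.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long tid = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(tid);
        task.run();
        long after = THREADS.getThreadAllocatedBytes(tid);
        // The difference may include a few bytes allocated by the query itself.
        return after - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedDuring(() -> sink = new byte[1 << 20]);
        System.out.println("allocated bytes: " + bytes);
    }
}
```

Each call to getThreadAllocatedBytes(long) here goes through the lookup by
thread ID discussed in this thread, so the cost of the two samples depends on
where the current thread sits in the list.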

I believe this is the code that's causing you grief:

open/src/hotspot/share/services/management.cpp:

// Gets an array containing the amount of memory allocated on the Java
// heap for a set of threads (in bytes).  Each element of the array is
// the amount of memory allocated for the thread ID specified in the
// corresponding entry in the given array of thread IDs; or -1 if the
// thread does not exist or has terminated.
JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
                                              jlongArray sizeArray))
   // Check if threads is null
   if (ids == NULL || sizeArray == NULL) {
     THROW(vmSymbols::java_lang_NullPointerException());
   }

   ResourceMark rm(THREAD);
   typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
   typeArrayHandle ids_ah(THREAD, ta);

   typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
   typeArrayHandle sizeArray_h(THREAD, sa);

   // validate the thread id array
   validate_thread_id_array(ids_ah, CHECK);

   // sizeArray must be of the same length as the given array of thread IDs
   int num_threads = ids_ah->length();
   if (num_threads != sizeArray_h->length()) {
     THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
               "The length of the given long array does not match the length of "
               "the given array of thread IDs");
   }

   ThreadsListHandle tlh;
   for (int i = 0; i < num_threads; i++) {
     JavaThread* java_thread =
       tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
     if (java_thread != NULL) {
       sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
     }
   }
JVM_END


Perhaps something like this above the "ThreadsListHandle tlh;" line:

   if (num_threads == 1 && THREAD->is_Java_thread()) {
     // Only asking for 1 thread so if we're a JavaThread, then
     // see if this request is for ourself.
     // THREAD is a Thread*, so cast after the is_Java_thread() check.
     JavaThread* jt = (JavaThread*)THREAD;
     oop tobj = jt->threadObj();

     if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
       // Return the info for ourself.
       sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
       return;
     }
   }

I haven't checked to see if this will even compile, but I
think you'll get the idea.
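
Outside of HotSpot, the same shape (check for "is this request about the
caller?" before falling back to the linear scan) can be sketched in plain
Java. Everything here is hypothetical and only models the idea: Entry stands
in for a JavaThread, find for the scan by thread ID, and fillAllocated for
the loop in jmm_GetThreadAllocatedMemory. The sketch assumes the caller
pre-fills the result array with -1.

```java
import java.util.ArrayList;
import java.util.List;

public class FastPathLookup {
    /** Hypothetical stand-in for a thread entry: an ID plus its allocation counter. */
    static final class Entry {
        final long tid;
        final long allocatedBytes;

        Entry(long tid, long allocatedBytes) {
            this.tid = tid;
            this.allocatedBytes = allocatedBytes;
        }
    }

    /** Linear scan by thread ID; cost grows with the number of entries. */
    static Entry find(List<Entry> threads, long tid) {
        for (Entry e : threads) {
            if (e.tid == tid) {
                return e;
            }
        }
        return null;
    }

    /** Fills sizes[i] for each ids[i]; entries for unknown IDs are left untouched. */
    static void fillAllocated(List<Entry> threads, Entry current,
                              long[] ids, long[] sizes) {
        if (ids.length != sizes.length) {
            throw new IllegalArgumentException("array lengths differ");
        }
        // Fast path: a single ID that belongs to the caller needs no scan.
        if (ids.length == 1 && current != null && ids[0] == current.tid) {
            sizes[0] = current.allocatedBytes;
            return;
        }
        for (int i = 0; i < ids.length; i++) {
            Entry e = find(threads, ids[i]);
            if (e != null) {
                sizes[i] = e.allocatedBytes;
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> threads = new ArrayList<>();
        for (long id = 1; id <= 500; id++) {
            threads.add(new Entry(id, id * 100));
        }
        Entry current = threads.get(499);
        long[] sizes = {-1};
        fillAllocated(threads, current, new long[] {500}, sizes);
        System.out.println("allocated bytes for current entry: " + sizes[0]);
    }
}
```

The fast path only helps the single-ID, current-thread case; any other
request still pays for the scan.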

Dan
