ThreadMXBean::getCurrentThreadAllocatedBytes

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Jul 16 19:39:22 UTC 2018


The new block needs to be just above the "ThreadsListHandle tlh;"
line in order to preserve all of the existing checks...

More below...


On 7/16/18 2:22 PM, Hohensee, Paul wrote:
>
> I believe you could move the code ahead of the call to 
> validate_thread_id_array() because that method just checks for thread 
> ids <= 0.
>
> *diff -r 3ddf41505d54 src/hotspot/share/services/management.cpp*
>
> *--- a/src/hotspot/share/services/management.cpp Sun Jun 03 23:33:00 
> 2018 -0700*
>
> *+++ b/src/hotspot/share/services/management.cpp Mon Jul 16 10:41:28 
> 2018 -0700*
>
> @@ -2084,11 +2083,19 @@
>
> typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
>
> typeArrayHandle sizeArray_h(THREAD, sa);
>
> +// Special-case current thread
>

The next line uses ids_ah, but validate_threads_id_array() has
not been called yet so you don't know whether ids_ah is valid
yet.

> +int num_threads = ids_ah->length();
>

The original code that I posted used the existing THREADS
variable rather than a call to JavaThread::current() which
can be expensive.

> +JavaThread* java_thread = JavaThread::current();
>
> +if (num_threads == 1 && sizeArray_h->length() == 1 &&
>

The next line uses ids_ah, but validate_threads_id_array() has
not been called yetso you don't know whether ids_ah is valid
yet.


> +ids_ah->long_at(0) == 
> java_lang_Thread::thread_id(java_thread->threadObj())) {
>
> +sizeArray_h->long_at_put(0, java_thread->cooked_allocated_bytes());
>
> +return;
>
> +}
>
> +
>
> // validate the thread id array
>
> validate_thread_id_array(ids_ah, CHECK);
>
> // sizeArray must be of the same length as the given array of thread IDs
>

It's not safe to move the next line before validate_thread_id_array().


> -int num_threads = ids_ah->length();
>
> if (num_threads != sizeArray_h->length()) {
>
> THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
>
> "The length of the given long array does not match the length of "
>
> If performance is good enough, and if you still want to add 
> getCurrentThreadAllocatedBytes() (imo a good idea, since 
> getCurrentThreadCpuTime() and getCurrentThreadUserTime() already 
> exist), you could implement it by 
> “getThreadAllocatedBytes(Thread::currentThread().getId())”. You might 
> want also want to add getCurrentThread* methods to com.sun.management 
> where they don’t currently exist: then we’d have a complete parallel 
> method set.
>
> Another approach to improving things is to fix the underlying problem 
> with find_JavaThread_from_java_tid(). 
> https://bugs.openjdk.java.net/browse/JDK-8185005 proposes doing that 
> in a different context. We came up with a patch for JDK8 that uses an 
> open addressed hashtable (one where the “bucket chain” is in the index 
> array, see https://en.wikipedia.org/wiki/Hash_table#Open_addressing 
> <https://en.wikipedia.org/wiki/Hash_table#Open_addressing>) to map 
> Java tids to JavaThread*s. I’ve forward ported it to JDK12, see 
> http://cr.openjdk.java.net/~phh/8185005/webrev.00/ 
> <http://cr.openjdk.java.net/%7Ephh/8185005/webrev.00/>. The main 
> disadvantage, of course, is that it’s yet another data structure that 
> takes up memory. It’s really fast though and speeds up our profilers 
> quite a bit. Perhaps we could replace the existing thread list with a 
> variation on this map, since it’s quick to just run through the 
> underlying array when you want to run through the threads.
>

Hmmm... That bug got closed as will-not-fix. I'm not sure why
the triage team decided that.

Dan


> Thanks,
>
> Paul
>
> *From: *serviceability-dev 
> <serviceability-dev-bounces at openjdk.java.net> on behalf of "Daniel D. 
> Daugherty" <daniel.daugherty at oracle.com>
> *Reply-To: *"daniel.daugherty at oracle.com" <daniel.daugherty at oracle.com>
> *Date: *Friday, July 13, 2018 at 1:53 PM
> *To: *Markus Gaisbauer <markus.gaisbauer at gmail.com>, 
> "serviceability-dev at openjdk.java.net" 
> <serviceability-dev at openjdk.java.net>, Erik Österlund 
> <erik.osterlund at oracle.com>, Robbin Ehn <robbin.ehn at oracle.com>
> *Subject: *Re: ThreadMXBean::getCurrentThreadAllocatedBytes
>
> Markus,
>
> I filed the following bug for you:
>
>     JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker 
> for self thread
> https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Dan
>
> On 7/13/18 4:46 PM, Daniel D. Daugherty wrote:
>
>     On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
>
>         On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>
>             Hello,
>
>             I am trying to use ThreadMXBean::getThreadAllocatedBytes
>             (com.sun.management) to get the amount of allocated memory
>             of the current thread in some performance critical code.
>
>             Unfortunately, the current implementation can be rather
>             slow and the duration of each call unpredictable. I ran a
>             test in a JVM with 500 threads. Depending on which thread
>             was queried, getThreadAllocatedBytes took between 100 ns
>             and 2500 ns.
>
>             The root cause of the problem is
>             ThreadsList::find_JavaThread_from_java_tid which performs
>             a linear scan through all Java threads in the current
>             process. The more threads a JVM has, the slower it gets.
>             In the worst case, the thread with the given TID is found
>             as the last entry in the list.
>
>             Before Java 10, the oldest thread is the slowest one to query.
>
>             Since Java 10, the youngest thread is the slowest one to
>             query. I think this was a side effect of introducing
>             "Thread Safe Memory Reclamation (Thread-SMR) support".
>
>                    Oldest Thread   Youngest Thread
>
>             Java 8             8740 ns             76 ns
>
>             Java 10             109 ns           2485 ns
>
>
>         It is good to see that longest search is much faster. Erik and
>         Robbin
>         will be pleased since speeding up traversal of the ThreadsList
>         was one
>         of the things that we tried to do during the Thread-SMR project.
>
>         A first step is get a new bug filed that documents the issue with
>         ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
>         will take care of that.
>
>         Dan
>
>
>
>             A common use case is to query the metric for the current
>             thread (e.g. before and after performing some operation).
>             This case can be optimized by introducing a new method:
>             getCurrentThreadAllocatedBytes.
>
>             I created a patch for http://hg.openjdk.java.net/jdk/jdk/
>             <http://hg.openjdk.java.net/jdk/jdk/> and by using the new
>             method I saw the following improvements in my test:
>
>                    Oldest Thread   Youngest Thread
>
>             Proposal            37 ns             37 ns
>
>             This is a 60x improvement over the worst case of the
>             current API. In the best case of the current API, the new
>             method is still 3 times faster.
>
>             // based on JVM_SetNativeThreadName in jvm.cpp.
>
>             JVM_ENTRY(jlong,
>             jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject
>             currentThread))
>
>               // We don't use a ThreadsListHandle here because the
>             current thread
>
>               // must be alive.
>
>               oop java_thread =
>             JNIHandles::resolve_non_null(currentThread);
>
>               JavaThread* thr = java_lang_Thread::thread(java_thread);
>
>               if (thread == thr) {
>
>                 // only supported for the current thread
>
>                 return thr->cooked_allocated_bytes();
>
>               }
>
>               return -1;
>
>             JVM_END
>
>             The proposed method also fixes the problem, that
>             getThreadAllocatedBytes itself allocates some memory on
>             the current thread (two long arrays, 24 bytes) and
>             therefore can slightly skew measurements. The new
>             method, getCurrentThreadAllocatedBytes, returns exactly
>             the same value if it is called twice without allocating
>             any memory between those calls.
>
>             I also built a variation of this method that could be used
>             to query allocated memory more efficiently for anyone who
>             already has a java.lang.Thread object:
>
>             JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env,
>             jobject threadObj))
>
>               // based on code proposed in threadSMR.hpp
>
>               ThreadsListHandle tlh;
>
>               JavaThread* thr = NULL;
>
>               bool is_alive =
>             tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
>
>               if (is_alive) {
>
>                 return thr->cooked_allocated_bytes();
>
>               }
>
>               return -1;
>
>             JVM_END
>
>             This method took 70 ns in my test, which is 85% slower
>             than GetCurrentThreadAllocatedMemory but still 30% faster
>             than the best case of the current API. I currently have no
>             immediate need for this second method, but I think it
>             would also be a valueable addition to the API.
>
>             I attached a patch for getCurrentThreadAllocatedBytes. I
>             can create a second patch for also adding
>             getThreadAllocatedMemory(java.lang.Thread) to the API.
>
>             I am a first time contributor and I am not 100% sure what
>             process I must follow to get a change like this into
>             OpenJDK. Can someone have a look at my proposal and help
>             me through the process?
>
>             Best regards,
>
>             Markus
>
>
>     I believe this is the code that's causing you grief:
>
>     open/src/hotspot/share/services/management.cpp:
>
>     // Gets an array containing the amount of memory allocated on the Java
>     // heap for a set of threads (in bytes).  Each element of the array is
>     // the amount of memory allocated for the thread ID specified in the
>     // corresponding entry in the given array of thread IDs; or -1 if the
>     // thread does not exist or has terminated.
>     JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env,
>     jlongArray ids,
>     jlongArray sizeArray))
>       // Check if threads is null
>       if (ids == NULL || sizeArray == NULL) {
>     THROW(vmSymbols::java_lang_NullPointerException());
>       }
>
>       ResourceMark rm(THREAD);
>       typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
>       typeArrayHandle ids_ah(THREAD, ta);
>
>       typeArrayOop sa =
>     typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
>       typeArrayHandle sizeArray_h(THREAD, sa);
>
>       // validate the thread id array
>       validate_thread_id_array(ids_ah, CHECK);
>
>       // sizeArray must be of the same length as the given array of
>     thread IDs
>       int num_threads = ids_ah->length();
>       if (num_threads != sizeArray_h->length()) {
>     THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
>                   "The length of the given long array does not match
>     the length of "
>                   "the given array of thread IDs");
>       }
>
>       ThreadsListHandle tlh;
>       for (int i = 0; i < num_threads; i++) {
>         JavaThread* java_thread =
>     tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
>         if (java_thread != NULL) {
>           sizeArray_h->long_at_put(i,
>     java_thread->cooked_allocated_bytes());
>         }
>       }
>     JVM_END
>
>
>     Perhaps something like this above the "ThreadsListHandle tlh;" line:
>
>       if (num_threads == 1 && THREAD->is_Java_thread()) {
>         // Only asking for 1 thread so if we're a JavaThread, then
>         // see if this request is for ourself.
>         JavaThread* jt = THREAD;
>         oop tobj = jt->threadObj();
>
>         if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
>           // Return the info for ourself.
>           sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
>           return;
>         }
>       }
>
>     I haven't checked to see if this will even compile, but I
>     think you'll get the idea.
>
>     Dan
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180716/23b3c874/attachment-0001.html>


More information about the serviceability-dev mailing list