RFR: 8130212: Thread::current() might access freed memory on Solaris

Thomas Stüfe thomas.stuefe at gmail.com
Mon Aug 3 15:38:18 UTC 2015


Hi David,

we added compiler-level TLS (__thread) for Linux a while ago to do exactly
what you are doing, but then had to remove it again. Bugs in the glibc
caused memory leaks - basically it looked like glibc was not cleaning TLS
related structures. Leak was small but it added up over time.

I know you implemented this for Solaris. Just thought I give you a warning,
maybe this is something to keep in mind.

Thanks, Thomas


On Wed, Jul 29, 2015 at 7:46 AM, David Holmes <david.holmes at oracle.com>
wrote:

> Summary: replace complex custom code for maintaining ThreadLocalStorage
> with compiler supported thread-local variables (Solaris only)
>
> This is a non-public bug so let me explain with some background, the bug,
> and then the fix - which involves lots of complex-code deletion and
> addition of some very simple code. :)
>
> webrev: http://cr.openjdk.java.net/~dholmes/8130212/webrev/
>
> In various parts of the runtime and in compiler generated code we need to
> get a reference to the (VM-level) Thread* of the currently executing
> thread. This is what Thread::current() returns. For performance reasons we
> also have a fast-path on 64-bit where the Thread* is stashed away in a
> register (g7 on sparc, r15 on x64).
>
> So Thread::current() is actually a slow-path mechanism and it delegates to
> ThreadLocalStorage::thread().
>
> On some systems ThreadLocalStorage::thread utilizes a caching mechanism to
> try and speed up access to the current thread. Otherwise it calls into yet
> another "slow" path which uses the available platform
> thread-specific-storage APIs.
>
> Compiled code also has a slow-path get_thread() method which uses assembly
> code to invoke the same platform thread-specific-storage APIs (in some
> cases - on sparc it simply calls ThreadLocalStorage::thread()).
>
> On Solaris 64-bit (which is all we support today) there is a simple
> 1-level thread cache which is an array of Thread*. If a thread doesn't find
> itself in the slot for the hash of its id it inserts itself there. As a
> thread terminates it clears out its ThreadLocalStorage values including any
> cached reference.
>
> The bug is that we have potential for a read-after-free error due to this
> code:
>
>   46   uintptr_t raw = pd_raw_thread_id();
>   47   int ix = pd_cache_index(raw);  // hashes id
>   48   Thread* candidate = ThreadLocalStorage::_get_thread_cache[ix];
>   49   if (candidate->self_raw_id() == raw) {
>   50     // hit
>   51     return candidate;
>   52   } else {
>   53     return ThreadLocalStorage::get_thread_via_cache_slowly(raw, ix);
>   54   }
>
> The problem is that the value read as candidate could be a thread that
> (after line 48) terminated and was freed. But line #49 then reads the raw
> id of that thread, which is then a read-after-free - which is a "Bad Thing
> (TM)".
>
> There's no simple fix for the caching code - you would need a completely
> different approach (or synchronization that would nullify the whole point
> of the cache).
>
> Now all this ThreadLocalStorage code is pretty old and was put in place to
> deal with inadequacies of the system provided thread-specific-storage API.
> In fact on Solaris we even by-pass the public API
> (thr_getspecific/thr_setspecific) when we can and implement our own version
> using lower-level APIs available in the T1/T2 threading libraries!
>
> In mid-2015 things have changed considerably and we have reliable and
> performant support for thread-local variables at the C+ language-level. So
> the way to maintain the current thread is simply using:
>
>  // Declaration of thread-local variable
>  static __thread Thread * _thr_current
>
>  inline Thread* ThreadLocalStorage::thread()  {
>    return _thr_current;
>  }
>
>  inline void ThreadLocalStorage::set_thread(Thread* thread) {
>    _thr_current = thread;
>  }
>
> And all the complex ThreadLocalStorage code with caching etc all vanishes!
>
> For my next trick I plan to try and remove the ThreadLocalStorage class
> completely by using language-based thread-locals on all platforms. But for
> now this is just Solaris and so we still need the ThreadLocalStorage API.
> However a lot of that API is not needed any more on Solaris so I have
> excluded it from there in the shared code (ifndef SOLARIS). But to avoid
> changing other shared-code callsites of ThreadLocalStorage I've kept part
> of the API with trivial implementations on Solaris.
>
> Testing: JPRT
>          All hotspot regression tests
>
> I'm happy to run more tests but the nice thing about such low-level code
> is that if it is broken, it is always broken :) Every use of
> Thread::current or MacroAssembler::get_thread now hits this code.
>
> Performance: I've run a basic set of benchmarks that is readily available
> to me on our performance testing system. The best way to describe the
> result is neutral. There are some slight wins, and some slight losses, with
> most showing no statistical difference. And even the "wins" and "losses"
> are within the natural variations of the benchmarks. So a lot of complex
> code has been replaced by simple code and we haven't lost any observable
> performance - which seems like a win to me.
>
> Also product mode x64 libjvm.so has shrunk by 921KB - which is a little
> surprising but very nice.
>
> Thanks,
> David
>


More information about the hotspot-runtime-dev mailing list