RFR: Reimplement TsanOopMap support using WeakHandle and ResizeableResourceHashtable [v6]

Man Cao manc at openjdk.org
Wed Jul 24 03:21:43 UTC 2024


On Wed, 24 Jul 2024 00:24:18 GMT, Jiangli Zhou <jiangli at openjdk.org> wrote:

>> The new implementation for TsanOopMap support in JDK 21 is partially modeled after [JvmtiTagMap](https://github.com/openjdk/jdk21u-dev/blob/cdbd94f8fa97847c02e1f0f4c6cd75f457a62e25/src/hotspot/share/prims/jvmtiTagMap.hpp#L37) and [JvmtiTagMapTable](https://github.com/openjdk/jdk21u-dev/blob/cdbd94f8fa97847c02e1f0f4c6cd75f457a62e25/src/hotspot/share/prims/jvmtiTagMapTable.hpp#L43):
>> 
>> - Use [WeakHandle](https://github.com/openjdk/jdk21u-dev/blob/cdbd94f8fa97847c02e1f0f4c6cd75f457a62e25/src/hotspot/share/oops/weakHandle.hpp#L42) for handling oops in the table entries;
>> - Use ResizeableResourceHashtable to for the underlying hash table. Table growing/resizing and removal of freed object entries are mostly handled within ResizeableResourceHashtable, except some small parts are done within TsanOopMapTable. That makes the new implementation simpler.
>> 
>> The TsanOopMapTable contains entries consisting of TsanOopMapTableKey:oop_size (key:value) pairs. The TsanOopMapTableKey contains:
>> 
>>  - A Weakhandle that holds a pointer to the oop;
>>  - The original (or updated) address of the object in Java heap;
>> 
>> For TsanOopMapTable support, I added a new weak OopStorage, which is created during tsan_init(). All WeakHandles used for tracking Tsan interested Java objects are created from that OopStorage. GC processes these WeakHandles in concurrent WeakProcessor worker threads.
>> 
>> To notify Tsan about object free/move in time, I added a call in WeakProcessor::Task::work to process the TsanOopMap:
>> 
>> - If an object is freed by GC, call __tsan_java_free to notify Tsan. Remove the entry from the table.
>> - If an object is moved by GC, update the raw address in the entry using the new oop address obtained from the WeakHandle and add the “move” to a GrowableArray.
>> - After all entries in TsanOopMap are processed, sort the “moves” in the GrowableArray and notify Tsan by calling __tsan_java_move for each object in the sorted array. This part is the existing implementation from JDK 11 for Tsan support.
>> 
>> Note all above operations occur during GC, before any of the mutators sees a moved or freed Java object.
>> 
>> As an alternative, I initially experimented with doing the above operations concurrently (to mutators) in ServiceThread after GC finished processing the WeakHandles. That could occur after mutators sees moved objects and resulted incorrect data being recorded in Tsan.
>> 
>> Suppressed following Tsan errors in JDK:
>> 1)  In `java.util.concurrent.locks`, for
>>...
>
> Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Sort moves for overlapping case only.

Changes requested by manc (Reviewer).

src/hotspot/share/tsan/tsanOopMap.cpp line 279:

> 277:   // We need object size when notify tsan about a freed object.
> 278:   // We cannot call size() for an object after it's freed, so we
> 279:   // need to save the size information in the table.

Could we move this comment about the size field to tsanOopMapTable.hpp? It was in tsanOopMap.hpp before.

src/hotspot/share/tsan/tsanOopMap.cpp line 284:

> 282: 
> 283: #ifdef ASSERT
> 284: bool TsanOopMap::exists(oopDesc *addr) {

Is `TsanOopMap::exists` dead code? If so, we can delete it and the `find`, `is_empty` methods in `tsanOppMapTable`.

src/hotspot/share/tsan/tsanOopMapTable.cpp line 49:

> 47: 
> 48: void TsanOopMapTableKey::update_obj() {
> 49:   oop obj = _wh.peek();

Since `_obj` is `oopDesc*` and `oop` could be an object, is it better to do `oopDesc* obj = _wh.peek()`?

src/hotspot/share/tsan/tsanOopMapTable.cpp line 134:

> 132:                                      _dest_low(dest_low), _dest_high(dest_high),
> 133:                                      _n_downward_moves(n_downward_moves) {}
> 134:     bool do_entry(TsanOopMapTableKey& entry, uintx size) {

`uintx` should also be `jlong`, right?

src/hotspot/share/tsan/tsanOopMapTable.hpp line 25:

> 23:  */
> 24: 
> 25: #ifndef SHARE_VM_PRIMS_TSAN_OOPMAPTABLE_HPP

The `VM_PRIMS_` part can be removed and it misses a `TSAN` prefix. It should be `SHARE_TSAN_TSANOOPMAPTABLE_HPP`.

src/hotspot/share/tsan/tsanOopMapTable.hpp line 67:

> 65: 
> 66:   void release_weak_handle() const;
> 67:   oop object_no_keepalive() const;

I'm a bit unsure about the use of `oop` type vs `oopDesc*` type. Could we keep using `oopDesc*` type in `object_no_keepalive()`, `obj()`, `equals()` and `update_obj()`? E.g., whenever calling `_wh.peek()`, immediately cast the return value to a `oopDesc*`.

They are the same type in a release build, but different in a debug build. I vaguely recall that TsanOopSizeMap used `oop` in the JDK7/8 era, but switched to `oopDesc*` due to errors in debug build.

src/hotspot/share/tsan/tsanOopMapTable.hpp line 74:

> 72:   static unsigned get_hash(const TsanOopMapTableKey& entry) {
> 73:     assert(entry._obj != nullptr, "sanity");
> 74:     return (unsigned int)entry._obj->identity_hash();

In theory `entry._obj` could be an invalid object (old object that is moved or freed), so `identity_hash()` may misbehave. In practice, this probably never happens due to the way we process the hash table in a GC pause. This may deserve a comment, or may be an assert such as `Metaspace::contains(entry._obj->klass())`. But feel free to do nothing.

src/hotspot/share/tsan/tsanOopMapTable.hpp line 78:

> 76: 
> 77:   static bool equals(const TsanOopMapTableKey& lhs, const TsanOopMapTableKey& rhs) {
> 78:     oop lhs_obj = lhs._obj != nullptr ? (oop)lhs._obj : lhs.object_no_keepalive();

I think `lhs._obj` and `rhs._obj` will never be `nullptr` in practice. This is because:
- We never add a null oop to the table.
- Once `_wh.peek()` returns null, the associated entry is immediately removed. We never update `_obj` to null.

Would it better to add an assert that `lhs._obj` and `rhs._obj` are both non-null and remove the call to `object_no_keepalive()`?

(I find it a bit hard to reason why we would resort to `_wh` for equality. I know `JvmtiTagMapKey::_obj` would be null most of the time in contrast.)

src/hotspot/share/tsan/tsanOopMapTable.hpp line 104:

> 102:   unsigned size() const { return _table.table_size(); };
> 103: 
> 104:   bool add_oop_with_size(oop obj, int size);

Type for `size` should be `jlong`, right? As `typedef ResizeableResourceHashtable` indicates.
Let's fix this problem also in `TsanOopMap::add_oop_with_size`.  According to https://github.com/microsoft/compiler-rt/blob/master/test/tsan/java.h, the tsan callbacks take `unsigned long` for size, so `jlong` should be fine.

(We have run into issues with overflowing size field for large objects (>4G), related to JVMTI SampledObjectAlloc. So this is a real bug that could manifest with large objects.)

-------------

PR Review: https://git.openjdk.org/tsan/pull/19#pullrequestreview-2195424296
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689001611
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689052679
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689013309
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689007134
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1688985804
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689018481
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689044556
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689033960
PR Review Comment: https://git.openjdk.org/tsan/pull/19#discussion_r1689004755


More information about the tsan-dev mailing list