RFR: JDK-8298908: Instrument Metaspace for ASan [v3]

Thomas Stuefe stuefe at openjdk.org
Wed Jan 4 17:08:52 UTC 2023


On Wed, 4 Jan 2023 13:55:15 GMT, Justin King <jcking at openjdk.org> wrote:

>> This change instruments Metaspace for ASan. Metaspace allocates memory using `mmap`/`munmap` which ASan is not aware of. Fortunately ASan supports applications [manually poisoning/unpoisoning memory](https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning). ASan is able to detect poisoned memory, similar to `use-after-free`, and will raise an error similarly called `use-after-poison`. This provides and extra layer of defense and confidence.
>> 
>> The header `sanitizers/address.h` defines macros for poisoning/unpoisoning memory regions. These macros can be used regardless of build mode. When ASan is not available, they are implemented using a NOOP approach which still compiles the arguments but does so such that they will be stripped out by the compiler due to being unreachable. This helps with maintenance.
>> 
>> This also has the added benefit of making [LSan](https://bugs.openjdk.org/browse/JDK-8298445) more accurate and deterministic, as LSan will not look for pointers to malloc memory in poisoned memory regions.
>> 
>> IMO the benefit of doing this greatly outweighs the cost.
>
> Justin King has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - Fix typo
>    
>    Signed-off-by: Justin King <jcking at google.com>
>  - Merge remote-tracking branch 'upstream/master' into jdk-8298908
>  - Exclude more zapping when ASan is in use
>    
>    Signed-off-by: Justin King <jcking at google.com>
>  - Instrument Metaspace for ASan
>    
>    Signed-off-by: Justin King <jcking at google.com>

Hi Justin,

I find this very interesting! Have several questions, though.

Does Asan assume 8-byte-alignment? I'm hesitant to set this alignment in stone since Metaspace could be allocated, at least now, with smaller alignments. We don't do this now, but I'd like to keep the option open. But see my proposal below, if you follow that, alignment should be no problem.

How is 32-bit handled?

Can you please explain the poisoning and unpoisening a bit? How is the use-after-free instrumentation done (I assume build time instrumentation?) and how fast is it, e.g. how fine granular can you go - can you have millions of micro-ranges, is that still feasible and reasonable? Should we poison before munmap?

----

Metaspace-wise, I think tracking *blocks* is not the best approach. Especially *dealloced* blocks, since there, you need to keep the header unpoisened, but the header is where most overwriters would happen. And block deallocation is a special rare case anyway. 

Metaspace allocation works like this:
- VSN allocates memory via mmap and hands out *chunks* to arenas. Chunks are 1k ... 4M in size.
- An arena then owns a series of chunks and hands out parts of these chunks as individual *blocks* to the user. Blocks are usually 12 bytes.. ~1K, with a heavy emphasis on tiny sizes.
- Arena is owned by a class loader. Typically individual blocks are not deallocated. Instead, if the arena is deleted, it releases all *chunks* to a pool (ChunkManager). Optionally these chunks are also temporarily uncommitted. These chunks can be reused by Arenas later, being committed again had they been uncommitted before.
- Block deallocation is a rare case that is only used if Blocks are released before the arena is released. This happens e.g., if a class is redefined and its old bytecode is let go, but class and classloader and hence arena live on.

I think Chunks are a better point for tracking:
- way less ranges to track since granularity is coarser. Less runtime costs?
- Alignment is no problem since chunk start addresses are aligned to chunk size (it is a buddy allocator), so at least 1k aligned.
- Chunks, in contrast to blocks, are maintained in a way that does not put metadata into them. Metadata is kept separate. So you can poison freed chunks at your heart's content with no header to tiptoe around. 
- You lose a bit of accuracy, but honestly, not that much. The difference between "memory in classloader-owned chunk" and "memory handed out to user" is typically small.

So I think it would be fine to unpoisen a whole chunk when given to an arena instead of unpoisening every individual allocation from the arena. And poison chunks when the Arena dies and puts the chunks back into the pool. So, if you instrument `ChunkManager::get_chunk()` (poisen) and `ChunkManager::return_chunk()` (unpoisen) in addition to your existing pre-poisoning in VSN, it would be sufficient. You could remove the code in binlist and blocktree, then.

Cheers, Thomas



 

Metaspace is an arena allocator, with arenas being bound to a class loader. It allocates chunks from a mmaped range, and upon class loader

Best thing would be:
- poisen the VSN (you do that already) upon creation
- unpoisen the blocks handed out from an arena (you do that too)
- poisen every *chunk* handed 
-
I think poisoning the VSN node is ok, and unpoisening the block, before handing it out, is fine too. 


But then

-------------

PR: https://git.openjdk.org/jdk/pull/11702



More information about the build-dev mailing list