RFR: 8331725: ubsan: pc may not always be the entry point for a VtableStub

Axel Boldt-Christmas aboldtch at openjdk.org
Tue Jul 2 19:15:18 UTC 2024


On Tue, 2 Jul 2024 17:23:06 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> `VtableStub* VtableStubs::entry_point(address pc)` has multiple issues causing undefined behaviour. Most of it stems from the fact that the pc sent in may sometimes not be the entry point of a `VtableStub`. We check this by doing a lookup in a hash table, but to generate the hash we need to extract data from the `VtableStub`. The original implementation simply cast the address to a `*VtableStub` and called the member function getters. This caused both unaligned accesses and bad bool values, all symptoms of the fact that we reinterpret_cast the memory as a `VtableStub`.
>> 
>> This change will instead look at the raw bytes when generating the hash which neither suffers from unaligned access nor bad reinterpret_casts. 
>> 
>> My initial patch did not create an `enum class` to represent the boolean value but instead looked at the object representation of the `bool` field. However this suffers from the fact that the object representation of `false` is implementation defined. I assumed it to be all 0 bits. It was also required unfortunate extra logic to handle the cases when `sizeof(bool) != 1`. Alt patch: https://github.com/openjdk/jdk/commit/b20140f49983b1903f9038e4e47def792c4493ac
>> 
>> There may be some better name than `unsafe_hash`. Suggestions?
>> 
>> Verified that this patch resolves the UB.
>> Running testing.
>> 
>> Side note:
>> There is still an unaligned access issue on aarch64 because of the way we construct `VtableStub`. We align the chunk where we create the `VtableStub` so that the `<VtableStub address> + sizeof(VtableStub)` is aligned to the platforms code entry alignment. On all platforms but `aarch64`  `<VtableStub address> + sizeof(VtableStub) % pd_code_alignment == 0` implies `<VtableStub address>  % alignof(VtableStub) == 0`. 
>> https://github.com/xmas92/jdk/blob/beea5305b071820e2b128a55c5ca384caf470fdd/src/hotspot/share/code/vtableStubs.hpp#L168-L171
>> 
>> Also note that there are already assumptions in the JVM about the object representation of bool. On `x86` `MacroAssembler::testbool` assumes that `false` is all 0 bits, and in addition to that `MacroAssembler::movbool` assumes that `1` is `true`.
>
> src/hotspot/share/code/vtableStubs.cpp line 261:
> 
>> 259:   // The entrypoint may or may not be a VtableStub. Generate a hash as if it was.
>> 260:   address vtable_stub_addr = entry_point - VtableStub::entry_offset();
>> 261:   assert(CodeCache::contains(vtable_stub_addr), "assumed to always be the case");
> 
> Why not `VtableStubs::contains()` ?

This is not trying to ascertain if `vtable_stub_addr` is in a `VtableStub`. It may not be, because we do not yet have a megamorphic call. This whole patch is to handle this case. However, whatever our call destination was it must have been in the CodeCache. And not only that, we must maintain the invariant that 16 bytes before our call destination is also in the code cache, or we would not be able to dereference that memory.

I am not sure exactly how we ensure this. But this patch does not change the behaviour, we still look at exactly the same memory as before, just in a more C++ friendly way.

> src/hotspot/share/code/vtableStubs.cpp line 294:
> 
>> 292:   // to generate the hash that would have been used if it was. The lookup in the
>> 293:   // _table will only succeed if there is a VtableStub with an entry point at
>> 294:   // the pc.
> 
> This is dangerous if it is not pointing to entry. How that can happened?

We only get a `VtableStub` after the call has become megamorphic. Before that it may point to the "resolver" (clean) or directly at a method entry point (monomorphic).

The way we check if this is megamorphic is to check if the destination is a VtableStub. We could do a linear scan through all the buckets (like `VtableStubs::contains()` does), but instead we do a hash lookup. But how do we get the hash? we need the `VtableStub` to generate the hash. So we assume that the destination is actually a `VtableStub`, generate the hash as if it was, and do a lookup in the hash table to see if any `VtableStub`'s entry point matches our call destination. In that case we are already megamorphic and nothing needs to be fixed. 

So it is by design.

@fisk is probably the goto guy if you want a more detailed description of how our `CompiledIC` works.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19968#discussion_r1663034486
PR Review Comment: https://git.openjdk.org/jdk/pull/19968#discussion_r1663034530


More information about the hotspot-compiler-dev mailing list