RFR: 8330174: Protection zone for easier detection of accidental zero-nKlass use [v6]
Ioi Lam
iklam at openjdk.org
Sat Feb 22 21:46:57 UTC 2025
On Fri, 21 Feb 2025 18:38:20 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> If we wrongly decode an nKlass of `0`, and the nKlass encoding base is not NULL (typical for most cases that run with CDS enabled), the resulting pointer points to the start of the Klass encoding range. That area is readable. If CDS is enabled, it will be at the start of the CDS metadata archive. If CDS is off, it is at the start of the class space.
>>
>> Now, both CDS and class space allocate a safety buffer at the start to prevent Klass structures from being located there. However, that memory is still readable, so we can read garbage data from that area. In the case of CDS, that area is just 16 bytes, after that come real data. Since Klass is large, most accesses will read beyond the 16-byte zone.
>>
>> We should protect the first page in the narrow Klass encoding range to make analysis of errors like this easier. Especially in release builds where decode_not_null does not assert. We already use a similar technique in the heap, and most OSes protect the zero page for the same reason.
>>
>> This patch does that. Now, decoding an `0` nKlass and then using the result `Klass` - calling virtual functions or accessing members - crashes right away.
>>
>> Additionally, the patch provides a helpful output in the register/stack section, e.g:
>>
>>
>> RDI=0x0000000800000000 points into nKlass protection zone
>>
>>
>>
>> Testing:
>> - GHAs.
>> - I tested the patch manually on x64 Linux for both CDS on, CDS off and zero-based encoding, CDS off and non-zero-based encoding.
>> - I tested manually on Windows x64
>> - I also prepared an automatic gtest, but that needs some preparatory work on the gtest suite first to work (see https://bugs.openjdk.org/browse/JDK-8348029)
>>
>> -- Update 2024-01-22 --
>> I added a jtreg test that is more thorough than a gtest (also scans the produced hs-err file)
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 14 additional commits since the last revision:
>
> - copyrights
> - Redo everything, including Iois changes
> - reverse everything
> - Merge branch 'master' into JDK-8330174-Protection-zone-for-easier-detection-of-accidental-zero-nKlass-use
> - fix whitespace error
> - fix test
> - test-fixes
> - fix bug found with jtreg test where metaspace buddy allocator would accidentally replace the protection mapping
> - add jtreg test; replaces the gtest
> - Merge branch 'master' into JDK-8330174-Protection-zone-for-easier-detection-of-accidental-zero-nKlass-use
> - ... and 4 more: https://git.openjdk.org/jdk/compare/c9d0b971...6d85e1fe
Thomas, thanks for making the changes.
I think we should amend this comment in filemap.hpp:
- char* _mapped_base_address; // Actual base address where archive is mapped.
+ char* _mapped_base_address; // Actual base address used for mapping the core
// regions. Note that the lowest core region
// (rw for the static archive) is mapped at offset
// MetaspaceShared::protection_zone_size() from this address
src/hotspot/share/cds/archiveBuilder.cpp line 163:
> 161: _mapped_static_archive_top(nullptr),
> 162: _buffer_to_requested_delta(0),
> 163: _pz_region("pz", MAX_SHARED_DELTA), // protection zone
To avoid confusion:
// protection zone -- used only during dumping; does NOT exist in cds archive.
src/hotspot/share/cds/archiveBuilder.hpp line 214:
> 212: // The "pz" region is used only during static dumps to reserve an unused space between SharedBaseAddress and
> 213: // the bottom of the rw region. During runtime, this space will be filled with a reserved area that disallows
> 214: // read/write/exec, so we can track for bad CompressedKlassPointers encoding.
Add:
// Note: this region does NOT exist in the cds archive.
src/hotspot/share/cds/metaspaceShared.cpp line 1259:
> 1257: *(mapped_base_address) = 'P';
> 1258: *(mapped_base_address + prot_zone_size - 1) = 'P';
> 1259: #endif
Need a comment about the purpose of this block of code.
src/hotspot/share/cds/metaspaceShared.cpp line 1378:
> 1376: //
> 1377: // In order for those IDs to still be valid, we need to dictate base and shift: base should be the
> 1378: // mapping start (including protection zone), shift the shift used at archive generation time.
- shift the shift used
+ shift should be the shift used
src/hotspot/share/memory/metaspace.cpp line 811:
> 809:
> 810: // After narrowKlass encoding scheme is decided: if the encoding base points to class space start,
> 811: // establish a protection zone.
Maybe add a comment like:
This way we can ensure that no valid compressed pointers will have the value of zero.
src/hotspot/share/memory/metaspace.cpp line 812:
> 810: // After narrowKlass encoding scheme is decided: if the encoding base points to class space start,
> 811: // establish a protection zone.
> 812: if (CompressedKlassPointers::base() == (address)rs.base()) {
What happens if this condition is not true? Do we need to check for encoded values of zero?
Or, is the following true?
} else {
assert((unitx)(CompressedKlassPointers::base()) == 0x0, "must be zero based");
}
Is the `else` part covered by your new test case?
-------------
PR Review: https://git.openjdk.org/jdk/pull/23190#pullrequestreview-2635295087
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966603821
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966603266
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966604504
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966604847
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966606537
PR Review Comment: https://git.openjdk.org/jdk/pull/23190#discussion_r1966606813
More information about the hotspot-dev
mailing list