RFR: JDK-8318119: Invalid narrow Klass base on aarch64 post 8312018
Andrew Haley
aph at openjdk.org
Mon Nov 6 18:07:03 UTC 2023
On Mon, 6 Nov 2023 15:11:51 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>>> This function is used just for one valid use case: verifying the
>>> user input for -XX:SharedBaseAddress. That happens at a point where
>>> we have not allocated anything yet, so we don't know the future
>>> encoding range.
>>
>> So this function is simply asking a question that cannot be answered
>> without actually trying it.
>>
>>> I am really considering removing this function. It is difficult to
>>> explain, of not much use, and also slightly wrong since it ignores
>>> the base==valid EOR immediate case.
>>
>> Please do! If there is to be an answer about whether a base is valid,
>> let it happen just after a failed attempted allocation.
>>
>>> Side note about non-zero immediate + non-zero shift:
>>>
>>> The Klass range does not have to start at the encoding base. For
>>> encoding base == zero it obviously does not. But also for non-zero
>>> encoding bases this could in theory make sense, at least on aarch64
>>> that has no fallback add-base mode.
>>>
>>> For example, given a Klass range starting just below 8 GB due to
>>> ASLR, this normally would lead to VM exit since such an address is
>>> unusable with MOVK, unless it fits EOR mode.
>>
>> Surely we'd scan upwards until a successful allocation. Why does it
>> matter where the Klass range starts? Or is the problem that we're
>> trying hard to honour a -XX: user request, for some reason? What would
>> be the purpose?
>
> We cannot scan upward because Oracle does not want us to do that for security reasons. We are to provide some form of ASLR. But we do scan the available address range (32-48 bits) for a suitable mapping point in a random fashion, see:
>
> https://github.com/openjdk/jdk/blob/b3126b6e441bf52058075fa1fc9dc800af774ca9/src/hotspot/share/memory/metaspace.cpp#L620-L622
>
> With this patch, as a fallback, we now let the system reserve anywhere but with overalignment, and then cut-align the memory ourselves. That way, we rely on the entropy of the system's ASLR at the cost of temporary over-allocation of address space. In theory this should work most of the time unless someone limits the vsize of the process.
>
> All of that may fail once in a blue moon. The address space may be heavily populated. The user space may be limited from below with vm.mmap_min_addr or by SELinux. The kernel itself may further limit at what addresses the user is allowed to map, though I did not find such a restriction for Arm64.
>
> Note that the error motivating this PR came from someone running the VM on an OrangePi SoC. I compared the orangepi linu...
Oh, wow. I never heard about "Oracle does not want us to do that for security reasons. We are to provide some form of ASLR." Sorry, this is all news to me.
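
For readers following along, the fallback described in the quoted text (let the system pick the address, over-allocate, then cut the reservation down to an aligned piece) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `align_up` and `reserve_aligned` are hypothetical names, the sketch assumes a POSIX `mmap`/`munmap`, and it assumes `size` and `alignment` are multiples of the page size with `alignment` a power of two.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round addr up to a power-of-two alignment.
static inline uintptr_t align_up(uintptr_t addr, uintptr_t alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical sketch, not the HotSpot implementation.
// Reserves `size` bytes of address space whose start is a multiple of
// `alignment`, letting the kernel (and thus the system's ASLR) pick the
// location. Returns nullptr if the reservation fails.
void* reserve_aligned(size_t size, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  // Over-allocate so that an aligned start must exist inside the mapping.
  const size_t padded = size + alignment;
  void* raw = mmap(nullptr, padded, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (raw == MAP_FAILED) {
    return nullptr;
  }

  const uintptr_t start   = reinterpret_cast<uintptr_t>(raw);
  const uintptr_t aligned = align_up(start, alignment);
  const uintptr_t end     = start + padded;

  // Give back the unaligned head, if there is one.
  if (aligned > start) {
    munmap(raw, aligned - start);
  }
  // Give back the tail beyond the aligned range.
  if (end > aligned + size) {
    munmap(reinterpret_cast<void*>(aligned + size), end - (aligned + size));
  }
  return reinterpret_cast<void*>(aligned);
}
```

Calling `reserve_aligned(1 << 20, 1 << 26)` would return an address that is a multiple of 64 MB, or `nullptr` on failure. The temporary cost is the extra `alignment` bytes of address space, which is why a limit on the process vsize could make this fail.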
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16215#discussion_r1383734507