RFR: 8276563: Undefined Behaviour in class Assembler

Andrew Haley aph at openjdk.java.net
Sat Nov 6 09:09:39 UTC 2021


On Fri, 5 Nov 2021 20:27:02 GMT, Mai Đặng Quân Anh <duke at openjdk.java.net> wrote:

>> The HotSpot code base contains a number of instances of Undefined Behavior, which can cause all manner of unpleasant surprises.
>> The UB to which this patch relates is in class `Assembler`, in which instances are pointers to (nonexistent) objects defined as, for example,
>> 
>> 
>>   typedef RegisterImpl *Register;
>>   const Register r10 = ((Register)10);
>> 
>> 
>> Registers have accessors, e.g.:
>> 
>> ` int RegisterImpl::encoding() const { return (intptr_t)this; }`
>> 
>> This works by an accident of implementation: it is not legal C++.
>> 
>> The most obvious way to this UB bug is to make instances of `Register` point to something, and to use pointer subtraction to find the encoding: (simplified for clarity)
>> 
>> 
>>   extern RegisterImpl all_Registers[num_Registers];
>>   int RegisterImpl::encoding() const   { return this - all_Registers; }
>> 
>> 
>> After this patch there is slightly more work to be done when assembling code but it's merely the subtraction of a constant in `encoding()` and the difference in execution time is so small (and the startup variance so large) that I have been unable to measure it, even after averaging 100 runs. It does lead to an increase of about 1% in the size of the stripped libjvm.so, but I think that can be recovered by a subsequent patch.
>> 
>> An alternative way to implement this would be to make the encoding a byte-wide field in `RegisterImpl` and define encoding() this way:
>> 
>> `  int RegisterImpl::encoding() const   { return _encoding; }`
>> 
>> This would result in smaller code, but I suspect slower.
>> 
>> If this change is accepted, I propose that all instances of this pattern in HotSpot be treated similarly.
>
> src/hotspot/cpu/aarch64/register_aarch64.hpp line 39:
> 
>> 37: #define REGISTER_IMPL_DECLARATION(type, name)                           \
>> 38: inline const type as_ ## type(int encoding) {                           \
>> 39:   assert(encoding <= name::number_of_declared_registers, "invalid register"); \
> 
> Should this be `<` instead of `<=`?

No, because `noreg` is a pointer one past the end of the array,

> src/hotspot/cpu/aarch64/register_aarch64.hpp line 76:
> 
>> 74: 
>> 75:   // derived registers, offsets, and addresses
>> 76:   Register successor() const     { return as_Register(encoding() + 1); }
> 
> Can we simply return `this + 1` here?
> Thank you very much.

Oh yeah, good point.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6280


More information about the hotspot-compiler-dev mailing list