current code location encoding doesn't work in aarch64
Jie He
Jie.He at arm.com
Fri May 8 03:09:33 UTC 2020
On Fri, May 8, 2020 at 01:40 AM Arthur Eubanks <aeubanks at google.com> wrote:
On Wed, May 6, 2020 at 10:41 PM Jie He <Jie.He at arm.com> wrote:
> Hi
>
> We know jdk/tsan uses a 64-bit value to identify code location, lowest
> 16-bit for bci, the next 44-bit for method id.
>
> In the current aarch64 tsan memory mapping, a method id allocated from the
> heap typically looks like 0xffffxxxxxxxx, which cannot be encoded directly
> in 47 bits.
I don't understand, doesn't 0xffffxxxxxxxx only require 32 bits to
represent?
Yes, but I’m not sure whether the heap could also be allocated in the Low or Mid memory ranges.
Also, x86 simply puts the method id directly into the code location packet, and I won't change that code.
> However, there are only 3 application memory regions, they are
>
> static const uptr kLoAppMemBeg = 0x0000000001000ull;
> static const uptr kLoAppMemEnd = 0x0000200000000ull;
> static const uptr kMidAppMemBeg = 0x0aaaa00000000ull;
> static const uptr kMidAppMemEnd = 0x0aaaf00000000ull;
> static const uptr kHiAppMemBeg = 0x0ffff00000000ull;
> static const uptr kHiAppMemEnd = 0x1000000000000ull;
>
> I think in aarch64, we don't need 47-bit to encode the method id, because
> highest 12 bits are fixed, and could be encoded by at most 2 bits.
> e.g. 00 means LoAppMemRange, 01 means MidRange, 10 means HighRange.
> Or simpler, just check the value of the highest 4 bits, 0x0 means LoRange,
> 0x2 means MidRange, 0x7 means HighRange.
>
You don't even need the high 2 bits right? Chopping them off and using the
3rd and 4th highest bits, you can use 00 for low, 10 for mid, and 11 for
high. But I suppose it doesn't really matter as long as we can fit things
in.
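
The two-bit tagging idea discussed above can be sketched as follows. This is only an illustration, assuming every method id is a 48-bit address inside one of the three aarch64 app ranges quoted in this thread; region_tag, compress_id and expand_id are hypothetical names, not jdk/tsan functions.

```cpp
#include <cstdint>

// Hypothetical helpers, not jdk/tsan code. Assumes a 48-bit address whose
// top 12 bits are 0x000 (low), 0xaaa (mid) or 0xfff (high).
const int kLowBits = 36;                          // bits below the fixed 12-bit prefix
const uint64_t kLowMask = (1ULL << kLowBits) - 1;

// Bits 45..44 (3rd and 4th highest of 48): 00 = low, 10 = mid, 11 = high.
inline uint64_t region_tag(uint64_t addr) { return (addr >> 44) & 0x3; }

// Pack the address into 38 bits: a 2-bit tag followed by the low 36 bits.
inline uint64_t compress_id(uint64_t addr) {
  return (region_tag(addr) << kLowBits) | (addr & kLowMask);
}

// Rebuild the 48-bit address by looking up the fixed prefix for the tag.
inline uint64_t expand_id(uint64_t packed) {
  // Tag 01 is unused in this scheme; it maps to prefix 0 here.
  static const uint64_t kPrefix[4] = {0x000, 0x000, 0xaaa, 0xfff};
  uint64_t tag = (packed >> kLowBits) & 0x3;
  return (kPrefix[tag] << kLowBits) | (packed & kLowMask);
}
```

Under these assumptions the packed value needs 38 bits, so it would fit in a 44-bit or 47-bit field with room to spare.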
> Like below
>
> static jmethodID tsan_method_id_from_code_location(u8 loc) {
>   u8 id = (u8)(
>       (loc & ~(tsan_fake_pc_bit | tsan_bci_mask)) >> tsan_method_id_shift);
>
> #ifdef AARCH64
>   // Restore the top bit of the 48-bit method id for mid/high range addresses.
>   u8 ms4bits = id >> 44;
>   if (ms4bits == 0x7ULL || ms4bits == 0x2ULL)
>     id = id | 0x800000000000ULL;
>
Doesn't this end up mixing high and mid?
An address in the high range looks like 0xffffxxxxxxxx, and one in the mid range looks like 0xaaaxxxxxxxxx.
With the existing code location encoding in tsan_code_location, the msb of the 48-bit address is overwritten,
so the id becomes 0x7fffxxxxxxxx and 0x2aaxxxxxxxxx respectively.
I then use id = id | 0x800000000000ULL; to restore the highest 4 bits of the method id to 0xf and 0xa, so the two ranges stay distinct.
> #endif
>
>   return (jmethodID)id;
> }
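
The restore step can be checked with a small standalone model. This is a simplified sketch, not the jdk/tsan encoding: it assumes a 16-bit bci followed by a 47-bit id field, and encode_location and decode_method_id are hypothetical stand-ins for tsan_code_location and tsan_method_id_from_code_location. The real shift and fake pc bit constants are not modelled.

```cpp
#include <cstdint>

// Simplified, hypothetical layout: bits 0..15 hold the bci and bits 16..62
// hold the low 47 bits of the method id, so bit 47 of the id is lost.
const int kBciBits = 16;
const uint64_t kBciMask = (1ULL << kBciBits) - 1;
const uint64_t kId47Mask = (1ULL << 47) - 1;
const uint64_t kIdTopBit = 0x800000000000ULL;  // bit 47 of a 48-bit address

// Stand-in for the encoder: drops bit 47 of the id when packing.
inline uint64_t encode_location(uint64_t method_id, uint16_t bci) {
  return ((method_id & kId47Mask) << kBciBits) | bci;
}

// Stand-in for the decoder with the aarch64 fix from this thread.
inline uint64_t decode_method_id(uint64_t loc) {
  uint64_t id = (loc >> kBciBits) & kId47Mask;
  uint64_t ms4bits = id >> 44;
  // 0x7 can only come from a high-range id (0xf...), 0x2 from a mid-range
  // id (0xa...); low-range ids fit in 33 bits and keep a top nibble of 0x0.
  if (ms4bits == 0x7ULL || ms4bits == 0x2ULL) {
    id |= kIdTopBit;
  }
  return id;
}

// Extract the bci from the low 16 bits.
inline uint16_t decode_bci(uint64_t loc) {
  return static_cast<uint16_t>(loc & kBciMask);
}
```

In this model, ids from all three ranges round-trip unchanged, and a truncated mid-range id (top nibble 0x2) is never confused with a truncated high-range one (top nibble 0x7).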
>
> What do you think?
>
> Thanks
> Jie He
>
>
>
>
>