current code location encoding doesn't work in aarch64

Mon May 11 16:18:07 UTC 2020

Ah I think I miscounted the bits, sounds good.

On Thu, May 7, 2020 at 8:09 PM Jie He <Jie.He at arm.com> wrote:

>
>
> On Fri, May 8, 2020 at 01:40 AM Arthur Eubanks <aeubanks at google.com>
> wrote:
>
> On Wed, May 6, 2020 at 10:41 PM Jie He <Jie.He at arm.com> wrote:
>
> > Hi
> >
> > jdk/tsan uses a 64-bit value to identify a code location: the lowest
> > 16 bits hold the bci and the next 44 bits hold the method id.
> >
> > With the current tsan memory mapping on aarch64, a method id allocated
> > from the heap typically looks like 0xffffxxxxxxxx, which cannot be
> > encoded directly in 47 bits.
>
> I don't understand. Doesn't 0xffffxxxxxxxx only require 32 bits to
> represent?
>
>
>
> Yes, but I'm not sure whether the heap could also be allocated in the
> Low or Mid memory ranges.
>
> X86 just puts the method id directly into the code location packet, and I
> won't change that code.
>
>
> > However, there are only 3 application memory regions:
> >
> > static const uptr kLoAppMemBeg   = 0x0000000001000ull;
> > static const uptr kLoAppMemEnd   = 0x0000200000000ull;
> > static const uptr kMidAppMemBeg  = 0x0aaaa00000000ull;
> > static const uptr kMidAppMemEnd  = 0x0aaaf00000000ull;
> > static const uptr kHiAppMemBeg   = 0x0ffff00000000ull;
> > static const uptr kHiAppMemEnd   = 0x1000000000000ull;
> >
> > I think on aarch64 we don't need 47 bits to encode the method id, because
> > the highest 12 bits are fixed and could be encoded in at most 2 bits,
> > e.g. 00 means LoAppMemRange, 01 means MidRange, 10 means HighRange.
> > Or, more simply, just check the value of the highest 4 bits: 0x0 means
> > LoRange, 0x2 means MidRange, 0x7 means HighRange.
> >
> You don't even need the high 2 bits right? Chopping them off and using the
> 3rd and 4th highest bits, you can use 00 for low, 10 for mid, and 11 for
> high. But I suppose it doesn't really matter as long as we can fit things
> in.
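
A minimal sketch of the 2-bit region tag discussed above, assuming 48-bit aarch64 addresses and the three application ranges listed earlier. The helper name region_tag is hypothetical and is not part of jdk/tsan; it only illustrates that bits 45..44 (the 3rd and 4th highest bits) are enough to tell the ranges apart.

```cpp
#include <cstdint>

// Application memory ranges as quoted in the thread (aarch64, 48-bit VA).
constexpr uint64_t kLoAppMemBeg  = 0x0000000001000ull;
constexpr uint64_t kLoAppMemEnd  = 0x0000200000000ull;
constexpr uint64_t kMidAppMemBeg = 0x0aaaa00000000ull;
constexpr uint64_t kMidAppMemEnd = 0x0aaaf00000000ull;
constexpr uint64_t kHiAppMemBeg  = 0x0ffff00000000ull;
constexpr uint64_t kHiAppMemEnd  = 0x1000000000000ull;

// Hypothetical helper: take bits 45..44 of the address as a region tag.
// Low range -> 0b00, mid range -> 0b10, high range -> 0b11.
constexpr unsigned region_tag(uint64_t addr) {
  return static_cast<unsigned>((addr >> 44) & 0x3);
}

// The tag is constant across each range, checked at both ends.
static_assert(region_tag(kLoAppMemBeg) == 0x0, "low range start");
static_assert(region_tag(kLoAppMemEnd - 1) == 0x0, "low range end");
static_assert(region_tag(kMidAppMemBeg) == 0x2, "mid range start");
static_assert(region_tag(kMidAppMemEnd - 1) == 0x2, "mid range end");
static_assert(region_tag(kHiAppMemBeg) == 0x3, "high range start");
static_assert(region_tag(kHiAppMemEnd - 1) == 0x3, "high range end");
```

Because the tag is the same at both ends of each range, the two highest bits carry no extra information for addresses that fall inside these ranges.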
>
> > Like below
> >
> > static jmethodID tsan_method_id_from_code_location(u8 loc) {
> >   u8 id = (u8)(
> >       (loc & ~(tsan_fake_pc_bit | tsan_bci_mask)) >> tsan_method_id_shift);
> >
> > #ifdef AARCH64
> >   u8 ms4bits = id >> 44;
> >   if (ms4bits == 0x7ULL || ms4bits == 0x2ULL)
> >     id = id | 0x800000000000ULL;
> >
> Doesn't this end up mixing high and mid?
>
>
>
> An address in the high range looks like 0xffffxxxxxxxx, and one in the mid
> range looks like 0xaaaxxxxxxxxx.
>
> The existing code location encoding in tsan_code_location overwrites the
> msb of the 48-bit address, so the id becomes 0x7fffxxxxxxxx and
> 0x2aaxxxxxxxxx respectively.
>
> I then use id = id | 0x800000000000ULL; to restore the highest 4 bits of
> the method id to 0xf and 0xa.
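
A small sketch of the round trip described above, assuming only what the thread states: bit 47 of the id is lost during encoding, and the AARCH64 branch puts it back when the top nibble reads 0x7 or 0x2. The function names drop_bit47 and restore_bit47 are hypothetical stand-ins, not jdk/tsan functions, and the real shift and mask handling in tsan_code_location is not modelled here.

```cpp
#include <cstdint>

// Bit 47, the msb of a 48-bit address (same constant as in the quoted patch).
static const uint64_t kBit47 = 0x800000000000ULL;

// Hypothetical stand-in for the effect of encoding: bit 47 of the id is lost.
inline uint64_t drop_bit47(uint64_t id) {
  return id & ~kBit47;
}

// Mirrors the proposed AARCH64 branch: if the top nibble of the 48-bit value
// is 0x7 (was 0xf, high range) or 0x2 (was 0xa, mid range), set bit 47 again.
// Low-range ids have a top nibble of 0x0 and are returned unchanged.
inline uint64_t restore_bit47(uint64_t id) {
  const uint64_t ms4bits = id >> 44;
  if (ms4bits == 0x7ULL || ms4bits == 0x2ULL) {
    id |= kBit47;
  }
  return id;
}
```

Under these assumptions, high and mid ids stay distinct after bit 47 is dropped, since bits 46..44 still differ (0x7 versus 0x2), and each gets back its own original top nibble.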
>
>
> > #endif
> >
> >   return (jmethodID)id;
> > }
> >
> > What do you think?
> >
> > Thanks
> > Jie He