RFR: Remove prefetch during mark

Thu Mar 1 00:52:05 UTC 2018

Hi all,

I just stumbled upon this thread, and thought I ought to chime in.

You may find our prefetch paper from 10 years ago useful. Or not! :-)

    http://users.cecs.anu.edu.au/~steveb/downloads/pdf/pf-ismm-2007.pdf

The short version is that there were a number of efforts to get prefetching working well in the past, but none were effective.  We did a pretty detailed study and managed to get some very nice results, with two important changes:

  *   FIFO front end to mark queue (without the FIFO the prefetch distance is unpredictable)
  *   Enqueue edges rather than nodes

Obviously, the situation is different here (concurrent, big change in uarch, etc.), but still there are some core ideas that you probably ought to know.
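
To make the two changes concrete, here is a minimal C++ sketch. It is an illustration under stated assumptions, not code from the paper or from HotSpot: `Obj`, `PrefetchingMarkQueue`, `mark_from`, and the FIFO depth of 4 are hypothetical names and values, and `__builtin_prefetch` is the GCC/Clang builtin. Entries are prefetched when they enter the FIFO and touched only when they leave it, and references are pushed without testing the target's mark (edge enqueuing), so the first access to each target happens after its prefetch has had time to complete.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical object layout, for illustration only (not HotSpot's oop).
struct Obj {
    bool marked = false;
    std::vector<Obj*> fields;  // outgoing references (edges)
};

// A mark stack with a small FIFO in front of it. An entry is prefetched
// when it moves from the stack into the FIFO and is handed out when it
// reaches the FIFO head, so once the FIFO is full the prefetch distance
// is N-1 pops no matter how the stack is reordered by pushes.
template <std::size_t N>
class PrefetchingMarkQueue {
    std::vector<Obj*> stack_;
    std::array<Obj*, N> fifo_{};
    std::size_t head_ = 0;   // index of the oldest FIFO entry
    std::size_t count_ = 0;  // number of entries currently in the FIFO

public:
    // Edge enqueuing: push the reference without touching the target.
    void push(Obj* target) {
        if (target != nullptr) stack_.push_back(target);
    }

    // Returns the next entry to process, or nullptr once everything drained.
    Obj* pop() {
        // Top up the FIFO from the stack, prefetching each entry on entry.
        while (count_ < N && !stack_.empty()) {
            Obj* o = stack_.back();
            stack_.pop_back();
#if defined(__GNUC__) || defined(__clang__)
            __builtin_prefetch(o, 1);  // 1 = prefetch for write (mark update)
#endif
            fifo_[(head_ + count_) % N] = o;
            ++count_;
        }
        if (count_ == 0) return nullptr;
        Obj* o = fifo_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return o;
    }
};

// Marks everything reachable from root; returns the number of objects
// newly marked by this call.
std::size_t mark_from(Obj* root) {
    PrefetchingMarkQueue<4> queue;  // depth 4 is an arbitrary choice here
    queue.push(root);
    std::size_t newly_marked = 0;
    while (Obj* o = queue.pop()) {
        if (o->marked) continue;  // mark test deferred until dequeue
        o->marked = true;
        ++newly_marked;
        for (Obj* f : o->fields) queue.push(f);  // no mark test on push
    }
    return newly_marked;
}
```

The cost of edge enqueuing is that already-marked targets can be pushed more than once; the sketch accepts that in exchange for never dereferencing a target at push time. The right FIFO depth depends on the machine and would need to be measured.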

The impatient may want to jump to sections 7.2 and 7.3. Note the last paragraph of 7.3: just adding the FIFO, with no software prefetch, may bring a win on some architectures.

Cheers,

--Steve

On 02/14/2018 05:23 PM, Wilkinson, Hugh wrote:
> I have been looking at this also.
>
> I find that if the prefetching occurs 3 popped entries ahead of the processing, then there is a worthwhile benefit.
>
> A bit of re-structuring is required to make this easy and efficient.
>
> I am prefetching 2 cache lines from the referenced object and also doing a PREFETCHW of the mark bitmap.  (Prefetch::write() requires modification for x86.)
>
> With the current code structure, removal of the Prefetch::read() probably makes sense; however, I would like to highlight that marking performance can be improved with sufficiently early software cache prefetches.
>
> I expect to share more details later.

Looking forward to that!

cheers,
Per