Umm, I don't think I said anything about cache effects in the passage you quoted below.<br>I said we should prefer using a small VA range, so a small set of physical pages would suffice.<br>You may have conflated an earlier reference in passing to caches (and its immediate rejection in the<br>
same breath) with the discussion about physical pages that followed. My reference to NUMA<br>was that the allocation code would likely be affected and we could expect to see some changes<br>as a result of it, which would be a good time to review the repeated reuse of VA range for young<br>
gen allocation. Anyway, hopefully we both know what we mean, and there is no misunderstanding.<br><br>best regards.<br>-- ramki<br><br><div class="gmail_quote">On Sat, Oct 27, 2012 at 12:42 PM, Thomas Schatzl <span dir="ltr"><<a href="mailto:thomas.schatzl@jku.at" target="_blank">thomas.schatzl@jku.at</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<div class="im"><br>
On Thu, 2012-10-25 at 20:40 -0700, Srinivas Ramakrishna wrote:<br>
> Thomas, while adding newly freed regions to the head of a common<br>
> global free region list probably achieves much reuse, I was<br>
> suggesting something even stronger and deliberate, in that a much<br>
> smaller set of regions be preferred for repeated reuse for Eden<br>
> allocation and for reuse as survivor regions, kind of<br>
> segregating that smaller subset completely from regions used for<br>
> tenuring objects.<br>
<br>
</div>What I mean is that, to exploit cache effects in the way you<br>
suggest, these areas must be much smaller than you propose; on<br>
multi-GB ranges of memory you likely won't benefit from them any<br>
more than now. NUMA awareness does not change that, especially<br>
because you typically use NUMA on machines because you want to<br>
use lots of memory.<br>
<br>
I.e. just a quick calculation: with NUMA awareness, the amount<br>
of memory touched per node/core is likely still too high to<br>
benefit a lot from caching. E.g. a 64GB eden divided across 8<br>
nodes still means you're touching 8GB/node with processors that<br>
have a few MB of cache each. In the best case.<br>
<br>
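To make the arithmetic above concrete, here is a tiny back-of-the-envelope check (the 8 MB L3 size per node is an assumed figure for illustration, not a measurement):<br>

```java
// Back-of-the-envelope check of the numbers above: a 64 GB eden split
// across 8 NUMA nodes still leaves 8 GB per node, roughly three orders
// of magnitude more than a few-MB last-level cache (8 MB assumed here).
public class EdenPerNode {
    public static void main(String[] args) {
        long edenBytes = 64L << 30;  // 64 GB eden
        int numaNodes = 8;           // 8 NUMA nodes
        long l3Bytes = 8L << 20;     // assumed 8 MB L3 cache per node

        long perNode = edenBytes / numaNodes;  // bytes touched per node
        long ratio = perNode / l3Bytes;        // working set vs. cache

        System.out.println(perNode >> 30); // GB per node: 8
        System.out.println(ratio);         // cache fits ~1024x into it
    }
}
```

Even in the best case, the per-node working set dwarfs the cache by a factor of about a thousand.<br>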
NUMA awareness does not primarily improve cacheability (it might<br>
as a secondary effect!); its main point is to improve access<br>
to memory beyond the caches, imo.<br>
<br>
The idea of exploiting cache effects by giving each<br>
thread/core a very small heap with eden (typically <= 1 MB,<br>
i.e. something that fits nicely into the cache) that is<br>
reused over and over again, as you suggest, is generally<br>
referred to as "thread local heaps".<br>
<br>
The problem is that the threshold to get improvements is<br>
much higher than with NUMA awareness. If each thread only<br>
has such a small eden, you can no longer stop all threads<br>
every time it fills up, because that would decrease<br>
throughput too much. I.e. it needs thread-local GCs (so that<br>
every thread can collect its own heap without stopping the<br>
others) and the associated complexity in the VM to work<br>
efficiently.<br>
<br>
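As a minimal illustration of the thread-local-heap idea described above (a hypothetical sketch with made-up names, not HotSpot code), each thread could bump-allocate from its own cache-sized buffer; running out of space is exactly the point where a thread-local GC would have to step in without stopping the other threads:<br>

```java
// Sketch of per-thread bump-pointer allocation into a small,
// cache-sized eden. All names here are illustrative assumptions.
public class ThreadLocalEden {
    static final int EDEN_SIZE = 1 << 20; // 1 MB per thread, fits in cache

    // Each thread lazily gets its own private eden.
    static final ThreadLocal<ThreadLocalEden> EDEN =
        ThreadLocal.withInitial(ThreadLocalEden::new);

    private final byte[] space = new byte[EDEN_SIZE];
    private int top = 0; // bump pointer

    // Returns the offset of the newly allocated chunk, or -1 when the
    // local eden is full -- the point where a thread-local GC would have
    // to collect this heap without a global stop-the-world pause.
    int allocate(int size) {
        if (top + size > EDEN_SIZE) {
            return -1;
        }
        int offset = top;
        top += size;
        return offset;
    }

    public static void main(String[] args) {
        ThreadLocalEden eden = EDEN.get();
        System.out.println(eden.allocate(64)); // 0
        System.out.println(eden.allocate(64)); // 64
    }
}
```

The bump-pointer fast path is cheap; the hard part, as the mail says, is the thread-local collection machinery needed once `allocate` returns -1.<br>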
This is not meant to discourage you about NUMA awareness, just<br>
an attempt to clear things up (if there ever was something to<br>
clear up). NUMA awareness does have its use and it improves<br>
performance, but I don't think it mainly does so because of<br>
improved cacheability.<br>
<span class="HOEnZb"><font color="#888888"><br>
Thomas<br>
<br>
<br>
</font></span></blockquote></div><br>