hotspot heap and L1 and L2 cache misses
Vitaly Davidovich
vitalyd at gmail.com
Mon Sep 17 13:40:15 PDT 2012
Andy,
A TLAB will satisfy the allocation requests in this case, so the object and
its arrays would be in that thread-local buffer, usually right next to each
other. However, objects that are too large for a TLAB will be allocated
straight out of the shared Eden space or even the tenured space. Also, once
a TLAB is exhausted, it's retired: the thread gets a fresh TLAB carved out
of Eden (possibly resized as well). As far as I know, the objects in the old
TLAB are not copied anywhere at that point, since a TLAB is itself just a
chunk of Eden. If your objects survive GC and get tenured, they'll end up in
the old gen. Once there, I don't think there's any guarantee that heap
compaction will try to keep the object graph close together in memory,
although maybe it ends up like that inadvertently. If your objects get
copied into a survivor space, I'm not sure there's any guarantee of
colocation either. Some of the above may be inaccurate, so check with the
GC devs using the alias in Chris' last email.
I'd also do as Chris suggested and re-run your benchmark on a recent
HotSpot build and recent hardware. You'd also need to use a profiler that
supports hardware performance counters so that you can attribute the
difference to cache misses. Otherwise it's hard to say for sure.
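To make the pointer-bump idea concrete, here is a minimal Java sketch of
how consecutive allocations from one buffer end up adjacent. It is only a
conceptual model: the real TLAB code lives inside the VM, not in Java, and
the class name, the 8-byte alignment, and the sizes are assumptions made
for illustration.

```java
// Conceptual model only. All names here are hypothetical and do not
// correspond to HotSpot internals.
public class BumpAllocatorSketch {
    /** A fixed-size buffer with a cursor, standing in for one thread's TLAB. */
    static final class Buffer {
        private final long end; // one past the last usable offset
        private long top;       // next free offset (the "bump pointer")

        Buffer(long start, long size) {
            this.top = start;
            this.end = start + size;
        }

        /**
         * Returns the start offset of the new block, or -1 if it does not
         * fit. A real VM would take a slower path here, for example by
         * handing the thread a fresh buffer.
         */
        long allocate(long size) {
            long aligned = (size + 7) & ~7L; // round up to 8 bytes (assumed)
            if (aligned > end - top) {
                return -1;
            }
            long result = top;
            top += aligned; // the whole fast path is one compare and one add
            return result;
        }
    }

    public static void main(String[] args) {
        Buffer tlab = new Buffer(0, 1024);
        long obj = tlab.allocate(24);    // the instance
        long arr1 = tlab.allocate(100);  // first array, padded to 104 bytes
        long arr2 = tlab.allocate(100);  // second array
        System.out.println(obj + " " + arr1 + " " + arr2); // prints "0 24 128"
        System.out.println(tlab.allocate(2048));           // prints "-1"
    }
}
```

The three blocks come out back to back, which is the colocation you get
for free at allocation time. What the sketch cannot show is what happens
later, when a collector moves the objects.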
HTH,
Vitaly
On Sep 17, 2012 3:50 PM, "Christian Thalinger" <
christian.thalinger at oracle.com> wrote:
>
> On Sep 17, 2012, at 12:20 PM, Andy Nuss <andrew_nuss at yahoo.com> wrote:
>
> > What about the case of a new class instance that creates and holds 2
> references to medium length arrays? Is the new instance and its 2 arrays
> in the same area of the heap?
>
> Depends on what you mean with "the same area". But these questions should
> better go to hotspot-gc-dev.
>
> -- Chris
>
> >
> > From: Christian Thalinger <christian.thalinger at oracle.com>
> > To: Andy Nuss <andrew_nuss at yahoo.com>
> > Cc: hotspot <hotspot-compiler-dev at openjdk.java.net>
> > Sent: Monday, September 17, 2012 11:39 AM
> > Subject: Re: hotspot heap and L1 and L2 cache misses
> >
> >
> > On Sep 15, 2012, at 12:03 PM, Andy Nuss <andrew_nuss at yahoo.com> wrote:
> >
> > > Hi,
> > >
> > > Let's say I have a function which mutates a finite automaton. It
> creates lots of small objects (my own link and double-link structures). It
> also does a lot of puts in my own maps. The objects and maps in turn have
> references to arrays and some immutable objects.
> > >
> > > My question is, for all these arrays and objects created in one function
> that has to do a ton of construction, are there any things to watch out for
> so that hotspot will try to create all the objects in this one
> function/thread colocated on the heap so that L1/L2 cache misses are
> reduced when the finite automaton is executed against data?
> > >
> > > Ideally, someone could tell me that when my class constructors in turn
> create new instances of other objects and arrays of various sizes, they
> are all colocated on the heap.
> > >
> > > Ideally, someone could tell me that when I have a looping function
> that creates a lot of very small linked-list objects in succession, again
> they are colocated.
> > >
> > > In general, how does hotspot try with creating new objects to help the
> L1/L2 caches?
> > >
> > > By the way, I did a test port of my automaton to C++ where for objects
> like the above, I had big memory chunks, and my in-place constructors just
> subdivided the memory chunk that they owned so that all the subobjects were
> absolutely as colocated as possible.
> > >
> > > This C++ port of the automaton out-performed my java version by 5x in
> execution against data. I also tested the construction-time cost of the
> automaton, comparing the hotspot new against my simple in-place C++ member
> functions, which basically just calculate the size of the object, return
> the current chunk cursor, and update the chunk cursor to point beyond the
> new object. In
> those cases I saw 25x performance differences (5 yrs ago).
> >
> > TLAB allocations do the same pointer-bump in HotSpot. Do the 5x really
> come from co-located data? Did you measure it? And maybe you should redo
> your 25x experiment. 5 years is a long time...
> >
> > -- Chris
> >
> > >
> > > Andy
> >
> >
> >
>
>