finite automata and L1 and L2 cache misses

Tue Sep 18 06:51:52 UTC 2012

Hi. I've implemented Daciuk-Mihov's algorithm for FSA construction
from sorted data once and it was a few times faster than the native
implementation in C++ by the original author. It's a matter of
tradeoffs, really. If you want speed you'll have to go with low-level
data structures. Simulate objects on (possibly parallel) arrays of
primitives or a bytebuffer. I realize it's a pain but the speedup
should be significant.

My implementation is part of the morfologik project here:
http://sourceforge.net/projects/morfologik/

There is another implementation, this time an FST, based on similar
principles (authored by many people from the Apache Lucene project),
look for FSTBuilder etc.:
http://lucene.apache.org/core/

I guess my point here is that the Java implementation comes with
tradeoffs -- if you're using objects and utilize GC heavily (the
objects are not statically allocated once, for example) then you'll
pay the price. If you're using low-level primitives then... well, it's
much like coding in C or assembly, really, so unless you have a strong
motivation to do it (a library) I'd rather consider going through JNI
or simply upgrading your hardware :)

Dawid

On Tue, Sep 18, 2012 at 12:34 AM, Andy Nuss <andrew_nuss at yahoo.com> wrote:
> Hi,
>
> My application creates and executes big finite automata, which are so costly
> to construct that it is worth caching them for reuse.  Thus they are likely
> to persist thru gc cycles.
>
> My automata consist of arrays of ints, arrays of objects, and various small
> objects of different class types.  What strategies can I take to ensure that
> these are co-located on the heap to avoid L1/L2 cache misses during automata
> execution?  If I succeed in this, how can I be confident that they will
> still be co-located AFTER gc compaction?
>
> One idea is to have the cache hold a wrapper object rather than the graph
> reference itself.  The wrapper object would hold references to all the
> individual components of the graph, as well as the graph.  Would this help
> the gc cycle continue to co-locate the element objects of the graph?
>
> Andy