Truffle: Uncommon Traps in Graal graph code, TimSort

Mon Dec 30 11:13:18 PST 2013

Hi Stefan,

I don't know exactly where the uncommon traps are coming from. Maybe
someone else can have a look at it? I am also not sure if this is in fact a
problem.

Regarding the inlining heuristic: at the moment the Truffle inlining
heuristic relies on re-profiling the CallTarget after each inlining
step. This requires compilation to be delayed for some iterations, which
often has very unfortunate effects on the timing of compilations. As
you said, it's propagating up the call tree, which is clearly not what we
want. As you mentioned, this effect may be even bigger for TruffleSOM
because of the number of methods used even for simple constructs. So far the
heuristic has focused primarily on peak performance, not on startup. I
also suspect that startup time may get better as soon as TruffleSOM
compiles more specialized code.
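
As a rough illustration of why re-profiling delays compilation, here is a
toy model in Java. It is not Truffle code: ToyCallTarget, COMPILE_THRESHOLD
and REPROFILE_CALLS are hypothetical names and numbers chosen only for this
sketch. It assumes that a call target waits for a fixed number of calls, and
that every inlining step restarts a shorter waiting period before the target
is finally compiled.

```java
// Toy model only; none of these names or numbers come from Truffle.
final class ToyCallTarget {
    static final int COMPILE_THRESHOLD = 1000; // hypothetical initial call threshold
    static final int REPROFILE_CALLS = 10;     // hypothetical re-profiling period

    private int pendingInlinings;
    private int callsUntilDecision = COMPILE_THRESHOLD;
    private boolean compiled;

    ToyCallTarget(int pendingInlinings) {
        this.pendingInlinings = pendingInlinings;
    }

    /** Simulates one call in the interpreter. */
    void call() {
        if (compiled) {
            return;
        }
        callsUntilDecision--;
        if (callsUntilDecision > 0) {
            return; // still profiling
        }
        if (pendingInlinings > 0) {
            // Perform one inlining step, then re-profile before deciding again.
            pendingInlinings--;
            callsUntilDecision = REPROFILE_CALLS;
        } else {
            compiled = true;
        }
    }

    /** Counts the calls needed until the toy target is compiled. */
    static int callsUntilCompiled(int inliningSteps) {
        ToyCallTarget target = new ToyCallTarget(inliningSteps);
        int calls = 0;
        while (!target.compiled) {
            target.call();
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        for (int steps : new int[] {0, 5, 50}) {
            System.out.println(steps + " inlining steps: "
                    + callsUntilCompiled(steps) + " calls until compiled");
        }
    }
}
```

In this model every additional inlining step pushes the compilation further
out, so a language with many small methods pays the delay many times over.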

We are currently working on solutions to improve inlining startup, and we
expect to improve it by factors (it's not tuned for that at all). We also
plan to expose some additional language-specific tuning opportunities for
the inlining heuristic. There are already some interesting options
available that you can experiment with. For a complete list, see the
TruffleCompilerOptions class.

To debug the inlining decisions you can put a breakpoint in
TruffleInliningImpl#InliningPolicy#isWorthInlining. It is also possible
that your implementation of the InlinableCallSite interface returns
wrong or incomplete data, which would also result in bad inlining decisions.

To improve this situation a bit, I recommend force-inlining core methods like
whileTrue (I think this is also what Chris Seaton is doing with Ruby; I don't
know whether TruffleSOM already does that). Please see the
inlineImmediatly flag in SL's FunctionRootNode for how forced inlining can be
done (see also CallNode#UninitializedCallNode#specialize).
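
The idea behind forced inlining can be sketched roughly as follows. This is
not the SL implementation: ToyFunction, ToyInliningPolicy, the forceInline
field and the threshold parameter are hypothetical names used only for
illustration. The sketch assumes that a call site normally waits for a call
count threshold, and that a per-function flag bypasses that wait.

```java
// Illustrative sketch only; these classes do not exist in Truffle or SL.
final class ToyFunction {
    final String name;
    final boolean forceInline; // hypothetical flag set for core methods

    ToyFunction(String name, boolean forceInline) {
        this.name = name;
        this.forceInline = forceInline;
    }
}

final class ToyInliningPolicy {
    /**
     * Flagged functions are inlined on first specialization, without waiting
     * for profiling; all others must reach the call count threshold first.
     */
    static boolean shouldInline(ToyFunction target, int callCount, int threshold) {
        return target.forceInline || callCount >= threshold;
    }

    public static void main(String[] args) {
        ToyFunction whileTrue = new ToyFunction("whileTrue", true);
        ToyFunction userMethod = new ToyFunction("userMethod", false);
        System.out.println(whileTrue.name + ": " + shouldInline(whileTrue, 0, 100));
        System.out.println(userMethod.name + ": " + shouldInline(userMethod, 0, 100));
    }
}
```

With such a flag, control-flow primitives like whileTrue never go through the
profiling delay, which should help most for code that uses them heavily.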

Cheers and a happy new year,

- Christian Humer

On Mon, Dec 30, 2013 at 6:41 PM, Stefan Marr <java at stefan-marr.de> wrote:

> Hi:
>
> I’ll report later on all the changes in TruffleSOM. So just briefly: it is
> slowly getting faster.
>
> My actual question is related to strange behavior I am seeing.
>
> On benchmarks like DeltaBlue, Richards, and also:
> ./mx.sh --vm server vm -G:+TraceTruffleExpansion
> -G:+TraceTruffleExpansionSource -XX:+TraceDeoptimization
> -G:-TruffleBackgroundCompilation -G:+TraceTruffleCompilationDetails
> -Xbootclasspath/a:../som/build/classes:../som/libs/truffle.jar
> som.vm.Universe -cp ../som/Smalltalk:../som/Examples/Benchmarks/Richards
> ../som/Examples/Benchmarks/BenchmarkHarness.som FieldLoop 10 10 6000
>
> [Please note in the command line, I adopted the truffle.jar, which changes
> the line slightly compared to previous examples posted.]
>
> I see a lot of:
>
> Uncommon trap occurred in com.oracle.graal.nodes.MergeNode$1::apply
> Uncommon trap occurred in
> com.oracle.graal.graph.iterators.NodePredicates$AndPredicate::apply
> Uncommon trap occurred in
> com.oracle.graal.virtual.phases.ea.PartialEscapeClosure$MergeProcessor::mergeObjectStates
> Uncommon trap occurred in com.oracle.graal.lir.LIRIntrospection::forEach
> Uncommon trap occurred in java.util.TimSort::sort
>
> and similar traps.
>
> This looks strange to me. Is this supposed to happen?
>
> One thing I feel might be the issue is that the benchmarks cause many
> recompilations because of long phases of specialization. Subjectively,
> the inlining in TruffleSOM is rather slow, i.e., it happens rather
> late. And it propagates up the call tree, which leads to many
> re-inlinings/re-specializations in the middle and at the bottom of the
> tree. So, reaching steady state takes pretty long. And I am talking here
> about many, many iterations before it stabilizes, which results in long
> warmup times on the order of minutes rather than seconds. Something as
> simple as the WhileLoop benchmark takes 30s.
>
> So, that’s perhaps unrelated to the original question of whether those
> uncommon traps in Graal code should happen. For the warmup, could it be
> that Smalltalk’s ‘make a method no longer than 7 lines’ guideline requires
> different thresholds for inlining etc.?
>
> Thanks and best regards
> Stefan
>
> --
> Stefan Marr
> Software Languages Lab
> Vrije Universiteit Brussel
> Pleinlaan 2 / B-1050 Brussels / Belgium
> http://soft.vub.ac.be/~smarr
> Phone: +32 2 629 2974
> Fax:   +32 2 629 3525
>
>