slow performance of loom continuations

seth lytle seth.lytle at gmail.com
Fri Sep 7 04:47:30 UTC 2018


i ported a Xorshift implementation from kilim to project loom, using the
prototype that was announced in august. the Continuation apis are similar;
the only changes are ctu.run() instead of run(), and
Continuation.yield(SCOPE) instead of kilim.Fiber.yield(). the code runs and
produces the same answers. however, performance with loom is on the order
of 500x slower. i tried both f46dc5c01b7d (2% faster) and 544bfe4ccd84.

kilim: 32.30      nanos/op,           -4971871656801550503
kilim: 36.99      nanos/op,            2357119851256129241
kilim: 32.60      nanos/op,            8340372355868382387

loom : 18303.18   nanos/op,           -4971871656801550503
loom : 18107.61   nanos/op,            2357119851256129241
loom : 18184.15   nanos/op,            8340372355868382387


with -XX:+UnlockExperimentalVMOptions -XX:+UseNewCode:

loom : 2072.82    nanos/op,           -4971871656801550503
loom : 1879.37    nanos/op,            2357119851256129241
loom : 1841.58    nanos/op,            8340372355868382387


am i using the api correctly? i do use exceptions in other continuations,
so UseNewCode isn't really an option for me.

i realize that this is an early prototype. do you have any idea what
performance you're ultimately aiming for with this sort of loop?

is there a tradeoff between performance in these simple cases and being
able to weave "most" production code?



here's the full example:


public class Xorshift {
    static final ContinuationScope SCOPE = new ContinuationScope() {};
    Continuation ctu = new Continuation(SCOPE,this::execute);
    long result;

    void warmup(long num) {
        long warmup = 5000000000L;
        final long start = System.nanoTime();
        long dummy = 0;
        while (System.nanoTime() - start < warmup)
            dummy = dummy ^ loop(num);
        System.out.println("warmup: " + dummy);
    }
    void cycle(long num) {
        final long start = System.nanoTime();
        long val = loop(num);
        long duration = System.nanoTime() - start;
        System.out.format("loom : %-10.2f nanos/op, %30d\n",
                1.0*duration/num, val);
    }
    public long loop(long num) {
        long val = 0;
        for (long ii=0; ii < num; ii++) {
            ctu.run();
            val = val ^ result;
        }
        return val;
    }
    public void execute() {
        long x, y, s0=103, s1=17;
        while (true) {
            x = s0;
            y = s1;
            s0 = y;
            x ^= (x << 23);
            s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
            result = (s1 + y);
            Continuation.yield(SCOPE);
        }
    }

    public static void main(String[] args) {

        long cycles = 200000;
        int reps = 10;
        if (args.length == 0) {
            System.out.println("args: number of cycles, number of repeats");
            System.out.format("\t no args provided using defaults: %d %d\n",
                    cycles, reps);
        }
        // a missing or malformed argument leaves the default in place
        try { cycles = Long.parseLong(args[0]); } catch (Exception ex) {}
        try { reps = Integer.parseInt(args[1]); } catch (Exception ex) {}

        new Xorshift().warmup(cycles);
        Xorshift xor = new Xorshift();

        for (int jj=0; jj < reps; jj++)
            xor.cycle(cycles);
    }
}
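
for comparison, here is a minimal sketch of the same generator as a plain method call, with no continuation involved. it is only meant to help separate the cost of the xorshift arithmetic from the cost of the run()/yield round trip. the names PlainXorshift and next() are made up for this sketch and are not part of kilim or loom, and unlike the listing above it does no warmup pass. since the arithmetic and seeds are copied unchanged, it should walk through the same sequence of results.

```java
// Baseline sketch: same arithmetic as Xorshift.execute(), but as an ordinary method.
// PlainXorshift and next() are illustrative names, not kilim or loom APIs.
class PlainXorshift {
    long s0 = 103, s1 = 17;   // same seeds as the continuation version

    // one generator step; corresponds to one pass through the while (true) body
    long next() {
        long x = s0;
        long y = s1;
        s0 = y;
        x ^= (x << 23);
        s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
        return s1 + y;
    }

    // XOR together num consecutive results, like Xorshift.loop()
    long loop(long num) {
        long val = 0;
        for (long ii = 0; ii < num; ii++)
            val ^= next();
        return val;
    }

    public static void main(String[] args) {
        long cycles = 200000;
        int reps = 10;
        PlainXorshift xor = new PlainXorshift();
        for (int jj = 0; jj < reps; jj++) {
            long start = System.nanoTime();
            long val = xor.loop(cycles);
            long duration = System.nanoTime() - start;
            System.out.format("plain: %-10.2f nanos/op, %30d\n",
                    1.0 * duration / cycles, val);
        }
    }
}
```

whatever this prints per op is a floor for the loop: the difference between it and the kilim or loom numbers is roughly what the continuation switch itself costs.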

