slow performance of loom continuations

Rickard Bäckman rickard.backman at oracle.com
Thu Sep 13 12:55:22 UTC 2018


Hi,

glad to see someone trying out Loom.
A couple of performance-related patches have gone in today.

To enable them all I think you need:
-XX:+UnlockDiagnosticVMOptions -XX:+UseNewCode -XX:+UseNewCode2 -XX:+UseNewCode3

-XX:-DetectLocksInCompiledFrames shouldn't be used anymore.

/R

On 09/07, seth lytle wrote:
> i ported a Xorshift implementation from kilim to project loom using the
> prototype that was announced in august. the Continuation apis are similar -
> the only changes are ctu.run() instead of run()
> and Continuation.yield(SCOPE) instead of kilim.Fiber.yield(), and the code
> runs and produces the same answers. however, performance with loom is on
> the order of 500x slower. i tried both f46dc5c01b7d (2% faster) and
> 544bfe4ccd84
> 
> kilim: 32.30      nanos/op,           -4971871656801550503
> kilim: 36.99      nanos/op,            2357119851256129241
> kilim: 32.60      nanos/op,            8340372355868382387
> 
> loom : 18303.18   nanos/op,           -4971871656801550503
> loom : 18107.61   nanos/op,            2357119851256129241
> loom : 18184.15   nanos/op,            8340372355868382387
> 
> 
> with -XX:+UnlockExperimentalVMOptions -XX:+UseNewCode
> 
> loom : 2072.82    nanos/op,           -4971871656801550503
> loom : 1879.37    nanos/op,            2357119851256129241
> loom : 1841.58    nanos/op,            8340372355868382387
> 
> 
> am i using the api correctly ? i do use exceptions in other continuations,
> so UseNewCode isn't really an option for me
> 
> i realize that this is an early prototype. do you have any idea what
> performance you're ultimately shooting for for this sort of loop ?
> 
> is there a tradeoff between performance in these simple cases and being
> able to weave "most" production code ?
> 
> 
> 
> here's the full example:
> 
> 
> public class Xorshift {
>     static final ContinuationScope SCOPE = new ContinuationScope() {};
>     Continuation ctu = new Continuation(SCOPE,this::execute);
>     long result;
> 
>     void warmup(long num) {
>         long warmup = 5000000000L;
>         final long start = System.nanoTime();
>         long dummy = 0;
>         while (System.nanoTime() - start < warmup)
>             dummy = dummy ^ loop(num);
>         System.out.println("warmup: " + dummy);
>     }
>     void cycle(long num) {
>         final long start = System.nanoTime();
>         long val = loop(num);
>         long duration = System.nanoTime() - start;
>         System.out.format("loom : %-10.2f nanos/op, %30d\n",
>                 1.0*duration/num, val);
>     }
>     public long loop(long num) {
>         long val = 0;
>         for (long ii=0; ii < num; ii++) {
>             ctu.run();
>             val = val ^ result;
>         }
>         return val;
>     }
>     public void execute() {
>         long x, y, s0=103, s1=17;
>         while (true) {
>             x = s0;
>             y = s1;
>             s0 = y;
>             x ^= (x << 23);
>             s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
>             result = (s1 + y);
>             Continuation.yield(SCOPE);
>         }
>     }
> 
>     public static void main(String[] args) {
> 
>         long cycles = 200000;
>         int reps = 10;
>         if (args.length == 0) {
>             System.out.println("args: number of cycles, number of repeats");
>             System.out.format("\t no args provided using defaults: %d %d\n",cycles,reps);
>         }
>         try { cycles = Long.parseLong(args[0]); } catch (Exception ex) {}
>         try { reps = Integer.parseInt(args[1]); } catch (Exception ex) {}
> 
>         new Xorshift().warmup(cycles);
>         Xorshift xor = new Xorshift();
> 
>         for (int jj=0; jj < reps; jj++)
>             xor.cycle(cycles);
>     }
> }
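
One way to put the figures above in context is to time the same generator
with no continuation at all, which gives a floor for nanos/op. A minimal
sketch in plain Java follows. The class name XorshiftBaseline and its next()
method are made up for illustration and are not part of kilim or Loom; the
arithmetic is copied from execute() in the quoted example.

```java
public class XorshiftBaseline {
    // Same seeds as the locals in execute() in the quoted example.
    long s0 = 103, s1 = 17;

    // One generator step; mirrors the body of the while loop in execute(),
    // returning the value instead of storing it and yielding.
    long next() {
        long x = s0;
        long y = s1;
        s0 = y;
        x ^= (x << 23);
        s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
        return s1 + y;
    }

    // XOR together num generated values, as loop() does in the example.
    long loop(long num) {
        long val = 0;
        for (long ii = 0; ii < num; ii++)
            val ^= next();
        return val;
    }

    public static void main(String[] args) {
        long cycles = 200000;
        XorshiftBaseline base = new XorshiftBaseline();
        final long start = System.nanoTime();
        long val = base.loop(cycles);
        long duration = System.nanoTime() - start;
        System.out.format("plain: %-10.2f nanos/op, %30d%n",
                1.0 * duration / cycles, val);
    }
}
```

Since it uses the same seeds and the same update step, the values it XORs
together should follow the same sequence as the continuation version, so the
two loops do comparable work apart from the run()/yield() pair. This sketch
has no warmup phase, so a single timing from it is only a rough indication.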

