slow performance of loom continuations

Fri Sep 7 18:52:14 UTC 2018

ah, one thing that i omitted - the Xorshift example has no dependencies
other than loom. so running it is as simple as saving it to a file (eg
"filename") and:

$loom/bin/java filename

JEP 330 is awesome :)
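
for anyone who wants a baseline to compare against, here's a minimal sketch
of the same generator step without continuations. it's an assumption about
what a "pure" java version could look like, not the code behind the kilim
numbers, and the names XorshiftPlain and next() are made up for illustration.
it runs the same way through the source launcher:

```java
// Baseline sketch (hypothetical, not from the original example): the same
// xorshift step as execute(), called as a plain method instead of being
// resumed through a continuation.
public class XorshiftPlain {
    long s0 = 103, s1 = 17;   // same seeds as the loom example

    // one generator step; mirrors the body of the while(true) loop in execute()
    public long next() {
        long x = s0, y = s1;
        s0 = y;
        x ^= (x << 23);
        s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
        return s1 + y;
    }

    // xor together num generator outputs, like loop() in the loom example
    public long loop(long num) {
        long val = 0;
        for (long ii = 0; ii < num; ii++)
            val ^= next();
        return val;
    }

    public static void main(String[] args) {
        long cycles = 200000;
        int reps = 10;
        XorshiftPlain xor = new XorshiftPlain();
        for (int jj = 0; jj < reps; jj++) {
            long start = System.nanoTime();
            long val = xor.loop(cycles);
            long duration = System.nanoTime() - start;
            System.out.format("plain: %-10.2f nanos/op, %30d%n",
                    1.0 * duration / cycles, val);
        }
    }
}
```

there's no warmup phase in this sketch, so the first few lines it prints
include interpreter and JIT compilation time.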

On Fri, Sep 7, 2018 at 2:42 PM, seth lytle <seth.lytle at gmail.com> wrote:

> > Could I recommend JMH?
>
> this isn't intended as a benchmark (kilim does have some but i haven't
> looked at them). but this example has been representative of how i use
> kilim (mostly simple state machines). performance of this example in kilim
> is roughly 25% slower than "pure" java, and that's what i see in my state
> machines too. and that's a price i'm happy to pay in exchange for scalable
> imperative code
>
> it seems premature to benchmark project loom at this point - it isn't
> feature complete yet (i'm guessing that tail calls will dramatically help
> this example). i assume that when it is, a suite of some sort will be
> developed. i'm just trying to verify that i'm using the api correctly, and
> to understand whether this is a use case that the team considers important
> long term
>
> On Fri, Sep 7, 2018 at 3:27 AM, ags <andrzej.grzesik at gmail.com> wrote:
>
>> Could I recommend JMH?
>>
>> On Fri, 7 Sep 2018 at 05:48, seth lytle <seth.lytle at gmail.com> wrote:
>>
>>> i ported a Xorshift implementation from kilim to project loom using the
>>> prototype that was announced in august. the Continuation apis are
>>> similar - the only changes are ctu.run() instead of run() and
>>> Continuation.yield(SCOPE) instead of kilim.Fiber.yield(), and the code
>>> runs and produces the same answers. however, performance with loom is on
>>> the order of 500x slower. i tried both f46dc5c01b7d (2% faster) and
>>> 544bfe4ccd84
>>>
>>> kilim: 32.30      nanos/op,           -4971871656801550503
>>> kilim: 36.99      nanos/op,            2357119851256129241
>>> kilim: 32.60      nanos/op,            8340372355868382387
>>>
>>> loom : 18303.18   nanos/op,           -4971871656801550503
>>> loom : 18107.61   nanos/op,            2357119851256129241
>>> loom : 18184.15   nanos/op,            8340372355868382387
>>>
>>>
>>> with -XX:+UnlockExperimentalVMOptions -XX:+UseNewCode
>>>
>>> loom : 2072.82    nanos/op,           -4971871656801550503
>>> loom : 1879.37    nanos/op,            2357119851256129241
>>> loom : 1841.58    nanos/op,            8340372355868382387
>>>
>>>
>>> am i using the api correctly ? i do use exceptions in other
>>> continuations, so UseNewCode isn't really an option for me
>>>
>>> i realize that this is an early prototype. do you have any idea what
>>> performance you're ultimately aiming for in this sort of loop ?
>>>
>>> is there a tradeoff between performance in these simple cases and being
>>> able to weave "most" production code ?
>>>
>>>
>>>
>>> here's the full example:
>>>
>>>
>>> public class Xorshift {
>>>     static final ContinuationScope SCOPE = new ContinuationScope() {};
>>>     Continuation ctu = new Continuation(SCOPE,this::execute);
>>>     long result;
>>>
>>>     void warmup(long num) {
>>>         long warmup = 5000000000L;
>>>         final long start = System.nanoTime();
>>>         long dummy = 0;
>>>         while (System.nanoTime() - start < warmup)
>>>             dummy = dummy ^ loop(num);
>>>         System.out.println("warmup: " + dummy);
>>>     }
>>>     void cycle(long num) {
>>>         final long start = System.nanoTime();
>>>         long val = loop(num);
>>>         long duration = System.nanoTime() - start;
>>>         System.out.format("loom : %-10.2f nanos/op, %30d\n",
>>>                 1.0*duration/num, val);
>>>     }
>>>     public long loop(long num) {
>>>         long val = 0;
>>>         for (int ii=0; ii < num; ii++) {
>>>             ctu.run();
>>>             val = val ^ result;
>>>         }
>>>         return val;
>>>     }
>>>     public void execute() {
>>>         long x, y, s0=103, s1=17;
>>>         while (true) {
>>>             x = s0;
>>>             y = s1;
>>>             s0 = y;
>>>             x ^= (x << 23);
>>>             s1 = x ^ y ^ (x >> 17) ^ (y >> 26);
>>>             result = (s1 + y);
>>>             Continuation.yield(SCOPE);
>>>         }
>>>     }
>>>
>>>     public static void main(String[] args) {
>>>
>>>         long cycles = 200000;
>>>         int reps = 10;
>>>         if (args.length == 0) {
>>>             System.out.println("args: number of cycles, number of repeats");
>>>             System.out.format("\t no args provided using defaults: %d %d\n",
>>>                     cycles, reps);
>>>         }
>>>         try { cycles = Long.parseLong(args[0]); } catch (Exception ex) {}
>>>         try { reps = Integer.parseInt(args[1]); } catch (Exception ex) {}
>>>
>>>         new Xorshift().warmup(cycles);
>>>         Xorshift xor = new Xorshift();
>>>
>>>         for (int jj=0; jj < reps; jj++)
>>>             xor.cycle(cycles);
>>>     }
>>> }
>>>
>> --
>> ags
>>
>
>