Nashorn performance regression from JDK8u5 to JDK8u25?

Thu Feb 19 03:01:38 UTC 2015

Hannes/Marcus,

We haven’t heard any updates on this in the past 2 months or so. Was there
some progress made during that time?

Regards,
Bernard Liang

On 12/7/14, 10:22 PM, "Bernard Liang" <bliang at linkedin.com> wrote:

>Hannes,
>
>I tried out your methodology with the basic template + json that I
>provided earlier, as well as with the substantially larger templates +
>json I mentioned. In this pure form of the test, I did see that in the
>long term, u40 (with -Dnashorn.fields.objects=true; without this it was
>indeed much worse, as you found) and u25 do seem to outperform u5 by a
>noticeable margin (with u40 slightly edging out u25), with all
>versions requiring 10,000-15,000 iterations to reach their peak
>performance. However, in the first roughly 8,000-10,000 iterations, they
>were up to 2x slower than u5 in terms of time taken, with the average
>closer to 1.5x. Is this expected, and could it be remedied somehow? As it
>stands, the later versions do seem to have superior long-term performance,
>but we would have to accept the penalty of relatively poorer performance
>during quite a long initial phase. You can reproduce the same results with
>the basic templates.
>
>Regards,
>Bernard Liang
>
>On 12/1/14, 6:41 AM, "Hannes Wallnoefer" <hannes.wallnoefer at oracle.com>
>wrote:
>
>>Hi Bernard,
>>
>>Sorry for sending this message a second time, first one was rejected by
>>nashorn-dev because of the attachment.
>>
>>I looked at performance of dust.js with various versions of Nashorn. I
>>didn't see a regression between 8u05 and 8u25, quite the contrary, it
>>seems that u25 is about 30% faster after warmup. However, it does look
>>like startup and warmup are a bit more inconsistent with u25. It
>>sometimes takes a bit longer to reach final speed.
>>
>>With 8u40 I noticed a pretty dramatic performance regression right away,
>>it's about 3-4x slower than 8u05 and 8u25. The culprit turns out to be
>>our dual fields representation which is turned on by default in u40.
>>When that feature is disabled, u40 is even a bit faster than u25, and
>>also warmup seems to be smoother again.
>>
>>You can test running with dual fields disabled by setting the
>>"nashorn.fields.objects" system property to "true", for example by
>>adding -Dnashorn.fields.objects=true to the command line.
>>
>>I'm pasting in the script I used for benchmarking below. You can run it
>>along with the templates you sent with the following command line:
>>
>> > jjs  -Dnashorn.fields.objects=true -scripting dust-bench.js
>>
>>Let me know if my benchmark does not measure what is relevant to your
>>application. It's possible, for example, that you're repeatedly calling
>>into nashorn script engines and there are some potential issues with that
>>(you should reuse the same script engine if possible).
>>
>>Regards,
>>Hannes
>>
>>===== begin dust-bench.js =====
>>
>>if (!this.readFully) readFully = read;
>>
>>load("dist/dust-full.js");
>>var template = readFully("basic.tl");
>>var compiled = dust.compile(template, "basic");
>>print("compiled");
>>dust.loadSource(compiled);
>>print("loaded");
>>var json = JSON.parse(readFully("basic.json"));
>>print("read json");
>>print();
>>
>>
>>function bench(n) {
>>   var start = Date.now();
>>   for (var i = 0; i < 2000; i++) {
>>     dust.render("basic", json, function(err, out) {
>>       if (err) throw err;
>>     });
>>   }
>>
>>   print(n, "done", Date.now() - start);
>>}
>>
>>for (var i = 0; i < 200; i++) {
>>   bench(i);
>>}
>>
>>===== end dust-bench.js =====
>>
>>On 2014-11-25 at 20:28, Bernard Liang wrote:
>>> Hannes,
>>>
>>> Sure, I've reproduced the data you requested below, as they are small
>>> enough to fit comfortably in a standard message. However, in addition I
>>> would still like to reiterate the questions from my previous reply,
>>> regarding across-build Nashorn-related changes and performance
>>> benchmarks.
>>>
>>> Your reply also suggests that you have some familiarity with Dust, but
>>> if that is not the case, I can certainly provide some more details.
>>>
>>> Regards,
>>> Bernard Liang
>>>
>>> ===== begin basic.tl =====
>>> <html>{~n}
>>> <head>{~n}
>>>    <title>{page_title}</title>{~n}
>>> </head>{~n}
>>> <body>{~n}
>>>    <ul>{~n}
>>>    {#names}
>>>      <li>{title} {name}</li>{~n}
>>>    {/names}
>>>    </ul>{~n}
>>> </body>{~n}
>>> </html>
>>>
>>> ===== end basic.tl =====
>>> ===== begin basic.json =====
>>> {
>>>    "page_title": "Benchmark",
>>>    "title": "Sir",
>>>    "names": [{
>>>      "name": "Moe"
>>>    },
>>>    {
>>>      "name": "Larry"
>>>    },
>>>    {
>>>      "name": "Curly"
>>>    }]
>>> }
>>> ===== end basic.json =====
>>>
>>>
>>> On 11/25/14, 2:30 AM, "Hannes Wallnoefer"
>>> <hannes.wallnoefer at oracle.com> wrote:
>>>
>>>> Hi Bernard,
>>>>
>>>> I'm trying to reproduce your problems. In your first mail to the list
>>>> you wrote you attached a Dust template and JSON data. I think you
>>>> either forgot to attach it, or it was stripped by the list software.
>>>> Can you please try to send it again, or put it somewhere we can
>>>> download it?
>>>>
>>>> Thanks,
>>>> Hannes
>>>>
>>>> On 2014-11-18 at 03:21, Bernard Liang wrote:
>>>>> Michel et al,
>>>>>
>>>>> I've run our local test battery using the link you provided, and
>>>>> while in some cases there is improvement, overall the performance
>>>>> still seems to be closer to u25 levels than u5 levels. For what it's
>>>>> worth, I did notice that the performance improvements from u25 to u40
>>>>> were generally better in pooled environments than ones where a single
>>>>> instance of the execution environment was running per thread. This
>>>>> lends itself to a few questions, some of which are reiterated from
>>>>> the original inquiry:
>>>>>
>>>>>     *   Is anyone familiar with (significant) specific changes in the
>>>>> Nashorn libraries from u5 => u25 => u40 that might be related to this
>>>>> regression and could explain the u25 and/or u40 changes in more
>>>>> detail (that might have led to the recommendation to use u40)?
>>>>>     *   Do you have any performance suites (internal or other) that
>>>>> test various Nashorn benchmarks across different releases (of JDK8,
>>>>> for instance)? Do the results of those correlate with our findings?
>>>>>
>>>>> Regards,
>>>>> Bernard Liang
>>>>>
>>>>> PS. The output of `java -version` most recently tested was as
>>>>> follows:
>>>>>
>>>>> java version "1.8.0_40-ea"
>>>>> Java(TM) SE Runtime Environment (build 1.8.0_40-ea-b12)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b16, mixed mode)
>>>>>
>>>>> Previous versions tested:
>>>>>
>>>>> java version "1.8.0_25"
>>>>> Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
>>>>>
>>>>> java version "1.8.0_05"
>>>>> Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
>>>>> Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
>>>>>
>>>>> From: Michel Trudeau <michel.trudeau at oracle.com>
>>>>> Date: Monday, November 17, 2014 at 1:12 PM
>>>>> To: Bernard Liang <bliang at linkedin.com>
>>>>> Cc: "nashorn-dev at openjdk.java.net" <nashorn-dev at openjdk.java.net>
>>>>> Subject: Re: Nashorn performance regression from JDK8u5 to JDK8u25?
>>>>>
>>>>> Bernard,
>>>>>
>>>>> It'd be great if you could try the latest 8u40 stable build. We are
>>>>> planning to release 8u40 early in the new year.
>>>>>
>>>>> https://jdk8.java.net/download.html
>>>>>
>>>>> We also have an optional optimizer in 8u40; enable it with the
>>>>> command line argument '-ot'.
>>>>>
>>>>> Thanks,
>>>>> Michel
>>>>>
>>>>> Bernard Liang wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> After running some performance tests on the Cartesian product of
>>>>> ([JDK8u5, JDK8u25] x [simple template, complex templates] x
>>>>> [all-or-nothing, streaming chunks] x [single dust instance per
>>>>> thread, pooled dust instances] x [blank Dust instances, Dust
>>>>> instances with templates preloaded]), we find that JDK8u25
>>>>> performance is consistently and considerably worse than JDK8u5 (by
>>>>> roughly 10-100%, with the average falling somewhere in between). The
>>>>> relevant code has been executed enough times (on the order of 10,000
>>>>> times) to reach reasonably warmed-up states. If some of the items on
>>>>> the axes of the Cartesian product don't make much sense, you can
>>>>> ignore the fuzzy parts of the detailed breakdown for now, with the
>>>>> general understanding that various different environments have been
>>>>> tested and shown to yield the same results.
>>>>>
>>>>> Some additional high-level context:
>>>>>
>>>>> Dust is basically a templating language used to render JSON data into
>>>>> HTML with compilable "templates": https://github.com/linkedin/dustjs
>>>>> (we are at the v2.4.2 tag)
>>>>>
>>>>> "simple template" = ~150 bytes each of one (precompiled) template +
>>>>> context JSON (attached)
>>>>> "complex templates" = ~350 compiled templates spanning ~245KB in
>>>>> compiled JS + ~75KB of JSON context (proprietary data)
>>>>>
>>>>> From
>>>>> https://blogs.oracle.com/nashorn/entry/latest_news_from_the_nashorn,
>>>>> it sounded like there were some recent updates made to Nashorn
>>>>> performance around the u20 mark, but that seems to have caused a
>>>>> regression rather than an improvement. Is this something that
>>>>> nashorn-dev is aware of? Is there any way we can help diagnose the
>>>>> issue further using publicly safe data? (If you're looking for a way
>>>>> to reproduce this, the attached basic Dust template + JSON context
>>>>> should be adequate under almost any environment.)
>>>>>
>>>>> Regards,
>>>>> Bernard Liang
>>>>>
>>
>
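
The warmup question raised in this thread (early batches up to 2x slower,
peak speed reached only after 10,000-15,000 iterations) can be put into
numbers by recording the per-batch times instead of only printing them.
What follows is a minimal sketch under stated assumptions, not code from
any of the messages above: collectTimings, summarizeWarmup, tailFraction
and tolerance are hypothetical names, and "steady state" is assumed to be
the mean of the last few batches. It sticks to ES5 so it should run under
jjs as well as other JavaScript engines.

```javascript
// Hypothetical helper: runs `renderOnce` in batches and records the
// elapsed milliseconds per batch, mirroring the loop in dust-bench.js.
function collectTimings(renderOnce, batches, perBatch) {
  var timings = [];
  for (var b = 0; b < batches; b++) {
    var start = Date.now();
    for (var i = 0; i < perBatch; i++) {
      renderOnce();
    }
    timings.push(Date.now() - start);
  }
  return timings;
}

// Hypothetical helper: splits a list of batch times into a warmup phase
// and a steady state.
//   tailFraction: share of the final batches averaged as "steady state"
//   tolerance:    relative slack, e.g. 0.1 accepts batches up to 10% slower
function summarizeWarmup(timings, tailFraction, tolerance) {
  if (!timings.length) throw new Error("no timings");

  // Steady state is assumed to be the mean of the last batches.
  var tailCount = Math.max(1, Math.floor(timings.length * tailFraction));
  var tailSum = 0;
  for (var i = timings.length - tailCount; i < timings.length; i++) {
    tailSum += timings[i];
  }
  var steadyMs = tailSum / tailCount;

  // Walk backwards from the end: warmup ends at the first batch of the
  // trailing run in which every batch stays within the tolerance.
  var limit = steadyMs * (1 + tolerance);
  var warmupBatches = timings.length;
  for (var j = timings.length - 1; j >= 0; j--) {
    if (timings[j] > limit) break;
    warmupBatches = j;
  }

  // Total time spent before the steady state was reached.
  var warmupMs = 0;
  for (var k = 0; k < warmupBatches; k++) {
    warmupMs += timings[k];
  }

  return { steadyMs: steadyMs, warmupBatches: warmupBatches, warmupMs: warmupMs };
}
```

With dust-bench.js, renderOnce would wrap the dust.render("basic", ...)
call, with batches = 200 and perBatch = 2000 to match the original loop.
As a worked example, summarizeWarmup([400, 300, 200, 100, 100, 100, 100,
100], 0.25, 0.1) yields steadyMs = 100, warmupBatches = 3 and warmupMs =
900. Comparing warmupBatches and warmupMs across u5, u25 and u40 would
give a single figure for the initial penalty on each build.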