Can't get Multithreaded Nashorn uses to Scale

Wed Dec 7 15:01:27 UTC 2016

Hi Jesus, 

I’m trying to reproduce the problem, and just want to make sure I get the missing pieces right. 

You already showed us how you’re setting up the engine and the JS code you’re running. I assume the JSON code you’re parsing is a simple array of objects? And you’re just calling Invocable.invokeFunction on the ScriptEngine from multiple threads in parallel, right?
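
To make sure we are talking about the same setup, here is a minimal sketch of what I am assuming. The script is the one you posted with the line breaks removed; the input shape produced by `sampleInput`, the class name, and the bounded thread and call counts are guesses of mine, not taken from your code. Workers catch `Exception` rather than only the two checked exceptions, so that a failing worker stays visible instead of disappearing into its `Future`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

class SharedInvocableRepro {
    // The transform script as posted, minus the line breaks inside the string.
    static final String SCRIPT =
          "function transform(input) {"
        + "  var result = JSON.parse(input);"
        + "  response = {};"
        + "  for (var i = 0; i < result.length; i++) {"
        + "    var summoner = {};"
        + "    summoner.id = result[i].id;"
        + "    summoner.name = result[i].name;"
        + "    summoner.profileIconId = result[i].profileIconId;"
        + "    summoner.revisionDate = result[i].revisionDate;"
        + "    summoner.summonerLevel = result[i].level;"
        + "    response[summoner.id] = summoner;"
        + "  }"
        + "  result = response;"
        + "  return JSON.stringify(result);"
        + "}";

    // Assumed input: a JSON array of flat objects (field values are made up).
    static String sampleInput(int n) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < n; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":").append(1000 + i)
              .append(",\"name\":\"s").append(i)
              .append("\",\"profileIconId\":1,\"revisionDate\":0,\"level\":30}");
        }
        return sb.append(']').toString();
    }

    // Returns the number of completed calls, or -1 if no Nashorn engine is installed.
    static long run(int threads, int callsPerThread) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return -1; // e.g. JDK 15 and later, where Nashorn is no longer bundled
        }
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            engine.eval(SCRIPT);
            Invocable invocable = (Invocable) engine; // one Invocable shared by all workers
            String input = sampleInput(10);
            Callable<Long> task = () -> {
                long done = 0;
                try {
                    for (int i = 0; i < callsPerThread; i++) {
                        invocable.invokeFunction("transform", input);
                        done++;
                    }
                } catch (Exception e) {
                    e.printStackTrace(); // stop this worker, keep what it completed
                }
                return done;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(executor.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed calls: " + run(4, 1000));
    }
}
```

If your real input or call pattern differs from this in any important way, please let me know.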

Thanks,
Hannes 

> On 07.12.2016 at 00:03, Jesus Luzon <jluzon at riotgames.com> wrote:
> 
> When we share one invocable across many threads and run invokeFunction it
> happens, such as this:
> 
>>        ExecutorService executor = Executors.newFixedThreadPool(50);
>>        Invocable invocable = generateInvocable(script);
>>        AtomicLong count = new AtomicLong();
>>        for (int i = 0; i < 50; i++) {
>>            executor.submit(new Runnable() {
>>                @Override
>>                public void run() {
>>                    try {
>>                        while (true) {
>>                            invocable.invokeFunction("transform", something);
>>                            count.incrementAndGet();
>>                        }
>>                    } catch (NoSuchMethodException | ScriptException e) {
>>                        e.printStackTrace();
>>                    }
>>                }
>>            });
>>        }
>> 
>> 
> 
> 
> On Tue, Dec 6, 2016 at 2:59 PM, Jim Laskey (Oracle)
> <james.laskey at oracle.com> wrote:
> 
>> Interesting.  The example you posted demonstrates this behaviour?  If so
>> I’ll file a bug and dig in.  It sounds like an object is being reused
>> across invocations and accumulating changes to the property map.
>> 
>> — Jim
>> 
>> 
>> On Dec 6, 2016, at 5:12 PM, Jesus Luzon <jluzon at riotgames.com> wrote:
>> 
>>> With more threads you are impacting the same 8 cores, so it will taper off
>>> after 8 threads.  If it’s a 2x4 core machine then I can see 4 being a
>>> threshold depending on System performance.  Transport: I meant if you were
>>> using sockets to provide the script.
>> 
>> This makes sense. This one's on me then.
>> 
>> 
>>> So you are using the same invocable instance for all threads?  If so,
>>> then you are probably good to go.  As far as leaks are concerned, not sure
>>> how you would get leaks from Nashorn.  The JSON object is written in Java,
>>> and little JavaScript involved.
>> 
>> 
>> 
>>> In your example, pull up Invocable invocable = generateInvocable(script);
>>> out of the loop and use the same invocable for all threads.
>> 
>> 
>> We were using one invocable across all threads and we were getting
>> slowdowns on execution, high CPU Usage and memory leaks that led to
>> OutOfMemory errors. I could trace the leak to
>> 
>> jdk.nashorn.internal.objects.Global -> *objectSpill* Object[8] ->
>> jdk.nashorn.internal.scripts.JO4 -> *arrayData*
>> jdk.nashorn.internal.runtime.arrays.SparseArrayData -> *underlying*
>> jdk.nashorn.internal.runtime.arrays.DeletedArrayFilter
>> 
>> which just keeps growing forever.
>> 
>> On Tue, Dec 6, 2016 at 6:30 AM, Jim Laskey (Oracle) <
>> james.laskey at oracle.com> wrote:
>> 
>>> 
>>> On Dec 6, 2016, at 9:56 AM, Jesus Luzon <jluzon at riotgames.com> wrote:
>>> 
>>>> The cost of creating a new engine is significant.  So share an engine
>>>> across threads but use eval(String script, ScriptContext context)
>>>> <https://docs.oracle.com/javase/7/docs/api/javax/script/ScriptEngine.html#eval(java.lang.String,%20javax.script.ScriptContext)>
>>>> instead, separate context per execution.  If your JavaScript
>>>> code does not modify globals you can get away with using the same engine,
>>>> same compiled script on each thread.
>>> 
>>> 
>>> I guess there's a few things here I don't understand. One thing I'm
>>> trying to do is share a CompiledScript (which is why I'm using
>>> Invocable). Also, what exactly does "modify globals" mean? All our filters
>>> do the same thing: define a function that takes a JSON string, parses it
>>> into an object, modifies it, and then stringifies it back. No other state
>>> is changed, but temporary vars are created inside the scope of the
>>> function. When we run this multithreaded, invokeFunction slows down
>>> significantly and we get crazy memory leaks.
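
On "modify globals": one thing I noticed in the posted script is that `response` is assigned without `var`. In non-strict JavaScript that creates a property on the global object instead of a function-local variable, so every caller of a shared engine writes to the same place. The sketch below only illustrates that point. The class name and the trimmed script are mine, and I have not verified whether this is related to the slowdown or the leak.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

class ImplicitGlobalDemo {
    // Trimmed-down variant of the posted script: "local" is declared with var,
    // "response" is not.
    static final String SCRIPT =
          "function transform(input) {"
        + "  var local = JSON.parse(input);"
        + "  response = {};"
        + "  response[local[0].id] = local[0];"
        + "  return JSON.stringify(response);"
        + "}";

    // Returns { responseVisibleAsGlobal, localVisibleAsGlobal },
    // or null if no Nashorn engine is installed.
    static boolean[] globalsAfterCall() {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return null;
        }
        try {
            engine.eval(SCRIPT);
            ((Invocable) engine).invokeFunction("transform", "[{\"id\":1}]");
            // engine.get looks the name up in the engine-scope bindings,
            // which Nashorn backs with the script's global object.
            return new boolean[] {
                engine.get("response") != null,
                engine.get("local") != null
            };
        } catch (ScriptException | NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] seen = globalsAfterCall();
        if (seen == null) {
            System.out.println("no Nashorn engine available");
        } else {
            System.out.println("response is global: " + seen[0]);
            System.out.println("local is global: " + seen[1]);
        }
    }
}
```

Declaring the variable with `var response = {};` keeps it local to each call.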
>>> 
>>> 
>>> So you are using the same invocable instance for all threads?  If so,
>>> then you are probably good to go.  As far as leaks are concerned, not sure
>>> how you would get leaks from Nashorn.  The JSON object is written in Java,
>>> and little JavaScript involved.
>>> 
>>> 
>>>> Of course there are many factors involved in performance.  How many cores
>>>> do you have on the test machine?  How much memory in the process?  What
>>>> transport are you using between threads?  That sort of thing.  Other than
>>>> constructing the engine and context, Nashorn performance should scale.
>>> 
>>> I'm using an 8-core machine to test, with 2.5 GB of RAM allocated to the
>>> process. Not sure what transport between threads means, but this is the
>>> code I'm benchmarking with. Increasing the number of threads makes it go
>>> faster up to about 4 threads; beyond that, adding more threads takes the
>>> same amount of time to reach 1000 counts, and after a certain point it is
>>> actually slower. Some of our filters need to be able to run over 1000
>>> times a second (across all threads), and the fastest time I could get
>>> with this was about 2.4 seconds per 1000 counts.
>>> 
>>>>        ExecutorService executor = Executors.newFixedThreadPool(50);
>>>>        AtomicLong count = new AtomicLong();
>>>>        for (int i = 0; i < 50; i++) {
>>>>            executor.submit(new Runnable() {
>>>>                @Override
>>>>                public void run() {
>>>>                    try {
>>>>                        Invocable invocable = generateInvocable(script);
>>>>                        while (true) {
>>>>                            invocable.invokeFunction("transform", something);
>>>>                            count.incrementAndGet();
>>>>                        }
>>>>                    } catch (NoSuchMethodException | ScriptException e) {
>>>>                        e.printStackTrace();
>>>>                    }
>>>>                }
>>>>            });
>>>>        }
>>>>        long lastTimestamp = System.currentTimeMillis();
>>>>        while (true) {
>>>>            if (count.get() > 1000) {
>>>>                count.getAndAdd(-1000);
>>>>                System.out.println((System.currentTimeMillis() - lastTimestamp) / 1000.0);
>>>>                lastTimestamp = System.currentTimeMillis();
>>>>            }
>>>>        }
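
A side note on the measurement loop in this benchmark: the main thread polls `count.get()` without pausing, so it occupies a core of its own while the workers run. A variant that sleeps for a fixed interval and then reads the counter once might look like the sketch below. The class name, the placeholder workload, and the interval are mine; `availableProcessors()` is used only to stop at the core count discussed in this thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class ThroughputProbe {
    // Runs `work` on `threads` workers for roughly durationMillis
    // and returns how many calls completed in that window.
    static long measure(Runnable work, int threads, long durationMillis) {
        AtomicLong count = new AtomicLong();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            executor.execute(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    work.run();
                    count.incrementAndGet();
                }
            });
        }
        try {
            Thread.sleep(durationMillis);   // sleep instead of spinning on the counter
            long completed = count.get();
            executor.shutdownNow();         // interrupts the workers
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return completed;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Placeholder workload; in the real benchmark this would be the invokeFunction call.
        Runnable placeholder = () -> Math.sqrt(System.nanoTime());
        for (int threads = 1; threads <= cores; threads *= 2) {
            long calls = measure(placeholder, threads, 1000);
            System.out.println(threads + " threads: " + calls + " calls in about 1 s");
        }
    }
}
```

Comparing the counts for 1, 2, 4 and 8 threads shows where throughput stops growing without the measuring thread competing for a core.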
>>>> 
>>>> 
>>> With more threads you are impacting the same 8 cores, so it will taper
>>> off after 8 threads.  If it’s a 2x4 core machine then I can see 4 being a
>>> threshold depending on System performance.  Transport: I meant if you were
>>> using sockets to provide the script.
>>> 
>>> In your example, pull up Invocable invocable = generateInvocable(script);
>>> out of the loop and use the same invocable for all threads.
>>> 
>>> - Jim
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Dec 6, 2016 at 5:31 AM, Jim Laskey (Oracle) <
>>> james.laskey at oracle.com> wrote:
>>> 
>>>> 
>>>> On Dec 6, 2016, at 9:19 AM, Jesus Luzon <jluzon at riotgames.com> wrote:
>>>> 
>>>> Hey Jim,
>>>> 
>>>> I looked at it and I will look into loadWithNewGlobal to see what
>>>> exactly it does since it could be relevant. As for the rest, for my use
>>>> case having threads in the JS would not help. We're using Nashorn to build
>>>> JSON filters in a Dynamic Proxy Service and need any of the threads
>>>> processing a request to be able to execute the script to filter.
>>>> 
>>>> 
>>>> The cost of creating a new engine is significant.  So share an engine
>>>> across threads but use eval(String script, ScriptContext context)
>>>> <https://docs.oracle.com/javase/7/docs/api/javax/script/ScriptEngine.html#eval(java.lang.String,%20javax.script.ScriptContext)>
>>>> instead, separate context per execution.  If your JavaScript
>>>> code does not modify globals you can get away with using the same engine,
>>>> same compiled script on each thread.
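
For reference, a minimal sketch of how I read this advice: one engine and one set of compiled scripts shared by all threads, with each thread evaluating them in its own `ScriptContext`. The class and method names, the simplified `transform` function, and passing the input through a context attribute named `input` are choices of mine, not something from Jim's talk or from your code.

```java
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import javax.script.SimpleScriptContext;

class PerContextEval {
    // Simplified stand-in for the real filter script.
    static final String DEFINE =
        "function transform(input) { return JSON.stringify(JSON.parse(input)); }";

    // Builds a fresh context with its own bindings and defines transform in it.
    static ScriptContext newContext(ScriptEngine engine, CompiledScript define)
            throws ScriptException {
        ScriptContext ctx = new SimpleScriptContext();
        ctx.setBindings(engine.createBindings(), ScriptContext.ENGINE_SCOPE);
        define.eval(ctx);
        return ctx;
    }

    // Calls transform inside the given context; the input travels as a context attribute.
    static String transform(CompiledScript call, ScriptContext ctx, String json)
            throws ScriptException {
        ctx.setAttribute("input", json, ScriptContext.ENGINE_SCOPE);
        return String.valueOf(call.eval(ctx));
    }

    // Runs two threads against one engine, each with its own context.
    // Returns the first thread's result, or null if no Nashorn engine is installed.
    static String demo(String json) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return null;
        }
        try {
            Compilable compilable = (Compilable) engine;
            CompiledScript define = compilable.compile(DEFINE);
            CompiledScript call = compilable.compile("transform(input)");
            String[] results = new String[2];
            Thread[] workers = new Thread[2];
            for (int t = 0; t < workers.length; t++) {
                final int slot = t;
                workers[t] = new Thread(() -> {
                    try {
                        ScriptContext ctx = newContext(engine, define); // one context per thread
                        results[slot] = transform(call, ctx, json);
                    } catch (ScriptException e) {
                        e.printStackTrace();
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) {
                w.join();
            }
            return results[0];
        } catch (ScriptException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("[{\"id\":1}]"));
    }
}
```

Because each context has its own bindings, a script that writes to an undeclared variable only affects the thread that owns that context.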
>>>> 
>>>> 
>>>> Also, when you say a new engine per thread is the worst case, what
>>>> exactly do you mean? I would expect an initial cost of compiling the script
>>>> on each thread and then each engine should be able to do its own thing, but
>>>> what I'm seeing is that when running with more than 10 threads all my
>>>> engines get slow at executing code, even though they are all completely
>>>> separate from each other.
>>>> 
>>>> 
>>>> Of course there are many factors involved in performance.  How many cores
>>>> do you have on the test machine?  How much memory in the process?  What
>>>> transport are you using between threads?  That sort of thing.  Other than
>>>> constructing the engine and context, Nashorn performance should scale.
>>>> 
>>>> 
>>>> On Tue, Dec 6, 2016 at 5:07 AM, Jim Laskey (Oracle) <
>>>> james.laskey at oracle.com> wrote:
>>>> 
>>>>> Jesus,
>>>>> 
>>>>> The most useful information is probably in this blog post.
>>>>> 
>>>>> https://blogs.oracle.com/nashorn/entry/nashorn_multi_threading_and_mt
>>>>> 
>>>>> This example uses Executors but threads would work as well.
>>>>> 
>>>>> I did a talk that looked at different methods to max out multithreading
>>>>> performance.  A new engine per thread is the worst case.  A new context per
>>>>> thread does much better.  A new global per thread is the best while
>>>>> remaining thread safe.
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> — Jim
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Dec 6, 2016, at 8:45 AM, Jesus Luzon <jluzon at riotgames.com> wrote:
>>>>> 
>>>>> Hey folks,
>>>>> 
>>>>> I've tried many different ways of using Nashorn multithreaded based on
>>>>> what I've found on the internet, and I still can't get a single one to
>>>>> scale. Even the most naive method of making many script engines with my
>>>>> script tends to bottleneck itself when I have more than 10 threads
>>>>> invoking functions.
>>>>> 
>>>>> I'm using the following code to compile my script and
>>>>> invocable.invokeFunction("transform", input) to execute:
>>>>> 
>>>>>   static Invocable generateInvocable(String script) throws ScriptException {
>>>>>       ScriptEngineManager manager = new ScriptEngineManager();
>>>>>       ScriptEngine engine = manager.getEngineByName(JAVASCRIPT_ENGINE_NAME);
>>>>>       Compilable compilable = (Compilable) engine;
>>>>>       final CompiledScript compiled = compilable.compile(script);
>>>>>       compiled.eval();
>>>>>       return (Invocable) engine;
>>>>>   }
>>>>> 
>>>>> 
>>>>> 
>>>>> The script I'm compiling is:
>>>>> 
>>>>>       String script = "function transform(input) {" +
>>>>>               "var result = JSON.parse(input);" +
>>>>>               "response = {};\n" +
>>>>>               "for (var i = 0; i < result.length; i++) {\n" +
>>>>>               "    var summoner = {};\n" +
>>>>>               "    summoner.id = result[i].id;\n" +
>>>>>               "    summoner.name = result[i].name;\n" +
>>>>>               "    summoner.profileIconId = result[i].profileIconId;\n" +
>>>>>               "    summoner.revisionDate = result[i].revisionDate;\n" +
>>>>>               "    summoner.summonerLevel = result[i].level;\n" +
>>>>>               "    response[summoner.id] = summoner;\n" +
>>>>>               "}\n" +
>>>>>               "result = response;" +
>>>>>               "return JSON.stringify(result);" +
>>>>>               "};";
>>>>> 
>>>>> 
>>>>> 
>>>>> I've also tried other, more scalable ways to work with scripts
>>>>> concurrently, but given that this is the most naive method, where
>>>>> everything is brand new, and I still get slowness calling them
>>>>> concurrently, I fear that maybe I'm overlooking something extremely
>>>>> basic in my code.
>>>>> 
>>>>> Thanks.
>>>>> -Jesus Luzon
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>>