More fun with scopes and ScriptObjectMirror

Wed Dec 11 09:43:26 PST 2013

So I will probably take the same path as you guys: a pure JS require 
that wraps each module in an anonymous function. Nice and easy.
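
A minimal sketch of that approach, for illustration only. The names
requireModule, moduleSources and moduleCache are hypothetical, and the
in-memory source table stands in for whatever file loading a real
implementation would do. It relies only on the standard Function
constructor, so it should be portable between engines:

```javascript
// Hypothetical in-memory module table standing in for files on disk.
var moduleSources = {
  greeter: 'var greeting = "hello";\n' +
           'module.exports = function (name) { return greeting + " " + name; };'
};

// Loaded modules, keyed by id, so each module body runs at most once.
var moduleCache = {};

function requireModule(id) {
  if (Object.prototype.hasOwnProperty.call(moduleCache, id)) {
    return moduleCache[id].exports;
  }
  if (!Object.prototype.hasOwnProperty.call(moduleSources, id)) {
    throw new Error("Cannot find module '" + id + "'");
  }
  var module = { id: id, exports: {} };
  // Cache before running, so a circular require sees the partial exports.
  moduleCache[id] = module;
  // The Function constructor compiles the source as a function body, so
  // top-level "var" declarations become locals of that function.
  var wrapper = new Function("require", "module", "exports", moduleSources[id]);
  wrapper(requireModule, module, module.exports);
  return module.exports;
}
```

Calling requireModule("greeter") returns the exported function, and the
module's "greeting" variable stays local to the wrapper.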

Thanks for all your help!

On 11/12/13 14:27, A. Sundararajan wrote:
> The avatar/js project passes the relevant node.js tests (tests that 
> check that no vars leak from module files). Apparently, none of the 
> modules have declarations that omit "var".
>
> -Sundar
>
> On Wednesday 11 December 2013 07:13 PM, Tim Fox wrote:
>> On 11/12/13 13:28, A. Sundararajan wrote:
>>>
>>> Our emails crossed (again!).
>>
>> Hehe, we really must stop doing that! ;)
>>
>>> I suggested that option based on avatar/js code.
>>
>> fwiw, this is also the approach used in rhino-require 
>> https://github.com/micmath/Rhino-Require
>>
>> It's actually the first thing I considered because it's so simple, 
>> and it's pure JS so portable between engines. However this approach 
>> is flawed as it doesn't prevent the leakage of globals not declared 
>> using var, i.e.
>>
>> someglobal = 3;
>>
>> Which is what led me down the more complex route with explicitly 
>> manipulating scopes at the engine level...
>>
>> Do you guys not consider the leakage of non-var globals a big 
>> issue? Personally I ruled out this approach because of that, but 
>> maybe I should reconsider it (?)
>>
>> In the long term though, I think it would be nice if Nashorn provided 
>> the mechanism to implement require() with real isolation.
>>
>>
>>
>>
>>>
>>> Sundar
>>>
>>> On Wednesday 11 December 2013 06:40 PM, Tim Fox wrote:
>>>> On 11/12/13 12:53, Attila Szegedi wrote:
>>>>> On Dec 11, 2013, at 1:13 PM, Tim Fox <timvolpe at gmail.com> wrote:
>>>>>
>>>>>> Confused...
>>>>>>
>>>>>> I assumed that if two scripts were run with their own script 
>>>>>> context, then they would already have separate globals, i.e. if I do
>>>>>>
>>>>>> myglobal = 1
>>>>>>
>>>>>> in module 1 that won't be visible in module 2.
>>>>> That's true, but then you also end up with the need for 
>>>>> ScriptObjectMirrors between them, and that was what I suggested 
>>>>> you try to avoid.
>>>>>
>>>>>> So I'm not sure really what --global-per-engine really means, if 
>>>>>> the modules have their own globals anyway. I guess my 
>>>>>> understanding must be wrong somewhere.
>>>>> Well, it will mean that modules won't have their own globals… 
>>>>> --global-per-engine will make it so that the Global object is 
>>>>> specific to the ScriptEngine, and not to the ScriptContext, e.g. 
>>>>> even if you replace the ScriptContext of the engine, when scripts 
>>>>> are run, they'll still see the same global object as before the 
>>>>> replacement. The gist of it is:
>>>>>
>>>>> a) without --global-per-engine, the Global object is specific to a 
>>>>> ScriptContext, each ScriptContext has its own. ENGINE_SCOPE 
>>>>> Bindings object of the context is actually a mirror of its Global.
>>>>> b) with --global-per-engine, the Global object lives in the 
>>>>> ScriptEngine. ENGINE_SCOPE Bindings object of the context is just 
>>>>> a vanilla SimpleBindings (or whatever you set it to), and Global 
>>>>> will delegate property getters for non-existent properties to it 
>>>>> (but it'll still receive property setters, so new global variables 
>>>>> created by one script will be visible to another; no isolation 
>>>>> there).
>>>>>
>>>>> What I was suggesting is that your module loading code would look 
>>>>> something like:
>>>>>
>>>>> // The engine that you use (NashornScriptEngineFactory is in
>>>>> // jdk.nashorn.api.scripting)
>>>>> ScriptEngine engine = new NashornScriptEngineFactory().getScriptEngine(new 
>>>>> String[] { "--global-per-engine" });
>>>>>
>>>>> ...
>>>>>
>>>>> // when loading a module
>>>>> Bindings moduleVars = new SimpleBindings();
>>>>> moduleVars.put("require", requireFn);
>>>>> moduleVars.put("module", moduleDescriptorObj);
>>>>> moduleVars.put("exports", exportsObj);
>>>>> Bindings prevBindings = engine.getBindings(ScriptContext.ENGINE_SCOPE);
>>>>> engine.setBindings(moduleVars, ScriptContext.ENGINE_SCOPE);
>>>>> try {
>>>>>     engine.eval(moduleSource);
>>>>> } finally {
>>>>>     engine.setBindings(prevBindings, ScriptContext.ENGINE_SCOPE);
>>>>> }
>>>>> return exportsObj;
>>>>>
>>>>> NB: your modules would _not_ run in isolated globals. I thought 
>>>>> they would, but I just spoke to Sundar and he explained the mechanism 
>>>>> to me so now I see they won't -- see above the case b).
>>>>>
>>>>> They could pollute each other's global namespace (since it's 
>>>>> shared). Hopefully they'd adhere to Modules/Secure recommendation 
>>>>> and refrain from doing so, but you can't really enforce it.
>>>>>
>>>>> My require() implementation in Rhino could provide real isolation, 
>>>>> but this is unfortunately impossible in Nashorn. Nashorn makes an 
>>>>> assumption that the global object during execution of a script is 
>>>>> an instance of jdk.nashorn.internal.objects.Global; in Rhino, it 
>>>>> could have been anything so there I was able to run a module in a 
>>>>> new scope that had the actual Global object as its prototype, so 
>>>>> it could catch all variable assignments in itself and essentially 
>>>>> make the Global read only (albeit objects in it mutable - e.g. a 
>>>>> module could still extend Array prototype etc.).
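
The prototype trick described for Rhino can be illustrated with plain
objects. This is only a sketch of the property-lookup behaviour, not
Rhino's scope API; sharedGlobal and moduleScope are hypothetical
stand-ins for the real Global and the per-module scope.

```javascript
// Stand-in for the shared Global object (contents are made up).
var sharedGlobal = { version: "1.0", registry: [] };

// Per-module scope whose prototype is the shared global: reads fall
// through to sharedGlobal, assignments create properties on moduleScope.
var moduleScope = Object.create(sharedGlobal);

// A "global" assignment is caught by the module scope.
moduleScope.someglobal = 3;

// Assigning to an inherited name shadows it instead of changing the global.
moduleScope.version = "2.0";

// Objects reachable through the global are still mutable, though.
moduleScope.registry.push("module1");
```

After this runs, sharedGlobal has no someglobal property and its version
is still "1.0", but its registry array now contains "module1". That
matches the "read only, albeit objects in it mutable" caveat above.
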
>>>>>
>>>>> In Nashorn, it's the other way round with "--global-per-engine" - 
>>>>> Global object is the one immediately visible to scripts, and 
>>>>> ENGINE_SCOPE Bindings object is used as source of properties that 
>>>>> aren't found in the Global. Here, Global catches variable 
>>>>> assignments and ENGINE_SCOPE Bindings object ends up being 
>>>>> immutable (although objects in it are obviously still mutable, so 
>>>>> the module can build up its "exports" object).
>>>>>
>>>>> Basically, you have a choice between having shared globals (no 
>>>>> isolation) without mirrors (and then you won't run into any issues 
>>>>> with mirrors), or separate globals (with real isolation), but then 
>>>>> also mirrors, and then you might run into limitations of mirrors 
>>>>> (e.g. they can't be automatically used as Runnable etc. callbacks 
>>>>> from Java and so on.)
>>>>
>>>> I also considered a third option - executing all modules in a 
>>>> single global, but wrapping the module code in a function to hide 
>>>> any top level globals declared as vars, e.g. if module is:
>>>>
>>>> var someglobal = "hello";
>>>> module.exports = someglobal;
>>>>
>>>> after wrapping it becomes:
>>>>
>>>> (function(module) {
>>>>   var someglobal = "hello";
>>>>   module.exports = someglobal;
>>>> })(module);
>>>>
>>>> which hides someglobal.
>>>>
>>>> However this doesn't work with modules that use globals by omitting 
>>>> var, e.g.
>>>>
>>>> someglobal = "hello";
>>>> module.exports = someglobal;
>>>>
>>>> Now, there are far fewer CommonJS modules that use globals 
>>>> without var (as it's bad practice), but still enough to make this 
>>>> not a good option either :(
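
The difference between the two module bodies above can be checked
directly. This is a small sketch, assuming non-strict code; runWrapped
is a hypothetical helper that wraps the source in a function via the
standard Function constructor, and the variable names hiddenVar and
leakedVar are chosen just for the demonstration.

```javascript
// Hypothetical helper: run module source inside a function wrapper.
function runWrapped(source) {
  var module = { exports: {} };
  new Function("module", source)(module);
  return module.exports;
}

// Declared with var: the variable is local to the wrapper function.
var first = runWrapped('var hiddenVar = "hello"; module.exports = hiddenVar;');

// Declared without var: in non-strict code the assignment creates a
// property on the global object, so the wrapper does not contain it.
var second = runWrapped('leakedVar = "hello"; module.exports = leakedVar;');

var hiddenVisible = typeof hiddenVar !== "undefined"; // false
var leakedVisible = typeof leakedVar !== "undefined"; // true
```

Both modules export "hello", but only the second leaves a global behind.
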
>>>>
>>>>
>>>>
>>>>>
>>>>> I'm not trying to justify or otherwise qualify any of the design 
>>>>> decisions here, just trying to help you understand its constraints.
>>>>>
>>>>> Attila.
>>>>>
>>>>
>>>
>>
>