Nashorn parser directives

Fri Nov 27 19:56:29 UTC 2015

Hi Jim, Sundararajan,

Thanks for your invitation and welcoming response. My spare time will be 
scarce in December, but I would love to contribute to improving Nashorn 
in the new year.

Kind regards,
Anthony

On 27/11/2015 17:50, Sundararajan Athijegannathan wrote:
> In addition, setting parser [or other] options locally inside a script 
> may not work - at least with the current model of option handling. 
> Currently, options are parsed and a class with final fields 
> (ScriptEnvironment) is initialized during engine initialization. This 
> allows for efficient option checks elsewhere - just checking final 
> boolean/int fields.  Allowing delayed option handling would require 
> another, possibly slower (maps?) option handling mechanism.
>
> That said, if there is a better/smarter way to handle these, we'd 
> definitely welcome it, as Jim mentioned.
>
> Thanks,
> -Sundar
>
> On 11/27/2015 10:12 PM, Jim Laskey (Oracle) wrote:
>> Anthony,
>>
>> Your argument is not unreasonable, but given the resources we have 
>> available to work on features (including full ES6 support), it is 
>> simpler for us to be binary on this.  If, on the other hand, you sign 
>> an OCA and submit the necessary changes, we might consider 
>> integrating after review.  You have made requests for features in the 
>> past and we would like to give you an opportunity to contribute to 
>> improving Nashorn.
>>
>> Cheers,
>>
>> — Jim
>>
>>
>>
>>
>>> On Nov 27, 2015, at 11:50 AM, Anthony Vanelverdinghe 
>>> <anthony.vanelverdinghe at gmail.com> wrote:
>>>
>>> Hi
>>>
>>> Recently I filed an issue [1] requesting to "Allow to run shebang 
>>> script with --no-syntax-extensions". This was closed as "Won't Fix" 
>>> with "Can't support because # is an extension." While this is true, 
>>> I meant to say "Allow to run shebang script without allowing any 
>>> other syntax extensions than shebang". Or more generally: allow 
>>> fine-grained control over syntax extensions.
>>>
>>> Currently, there are only 2 switches that influence syntax 
>>> extensions: --no-syntax-extensions and -scripting. However, the 
>>> nature of the various extensions is totally disparate: allowing "for 
>>> each" expressions has nothing to do with allowing a shebang. 
>>> Therefore, it should be possible to enable/disable them 
>>> individually. Additionally, there could be a few groups defined: 
>>> "-scripting" is already one such group, and "-mozilla" could be 
>>> another.
>>>
>>> So I would like to be able to specify options such as:
>>> --language=es5 --extensions=scripting,foreach
>>> --language=es6 --extensions=shebang,mozilla
>>> where the "--extensions" option implicitly disables all 
>>> non-specified extensions.
>>>
>>> These options should also be validated:
>>> * warnings should be emitted when the specified language provides an 
>>> alternative for some enabled extension. For example, in ES6, "for 
>>> of" is available as a standard alternative for "for each".
>>> * errors should be emitted when there's an ambiguity. For example, 
>>> in ES6, when "Back-quote exec expressions" are enabled, the statement:
>>> var cmd = `format C:`;
>>> is both ambiguous & potentially disastrous if the author meant it 
>>> as a template literal, while Nashorn interprets it as a command.
>>>
>>> The above also demonstrates why more fine-grained control is 
>>> required: certain extensions are obsolete and/or cause ambiguities, 
>>> while others are extremely useful. Even the "Shell script style hash 
>>> comments" extension is too coarse-grained: it should be split into a 
>>> "shebang" and a "hash comment" extension, since there's no good 
>>> reason to use hash comments other than to specify the shebang. In 
>>> my opinion, it should be possible to run a shebang script with 
>>> something like "jjs --extensions=shebang myscript.js"
>>>
>>> Besides more fine-grained control, there should also be a way to 
>>> specify options inside the script, no matter how it's invoked. In my 
>>> opinion, this is very useful because the script author always knows 
>>> best how a script must be interpreted and which extensions 
>>> were/weren't used.
>>>
>>> For example, suppose I wrote the following shell script named 
>>> "find.js":
>>> //jjs --language=es5 --extensions=execbackquote
>>> load("library.js")
>>> print(`grep abc file`);
>>>
>>> where library.js is a library written by someone else, in which 
>>> template literals are used (note that "--extensions=" has the same 
>>> effect as "--no-syntax-extensions"):
>>> //jjs --language=es6 --extensions=
>>> var cmd = `format C:`;
>>>
>>> then by being able to specify the necessary parser directives within 
>>> the script, Nashorn would be able to correctly run "find.js" (i.e. 
>>> without executing the `format C:` inside library.js). Personally, I 
>>> think any parser directives such as language version and extensions 
>>> should always be embedded within the script, and I don't really see 
>>> a need to specify them on the command-line. As for implementation, 
>>> this can be done either by:
>>> * always interpreting the shebang. However, this is not 
>>> ECMAScript-compliant.
>>> * providing a "pure ECMAScript" alternative. For example, if the 
>>> script starts with "//jjs ", then it's interpreted by Nashorn (as in 
>>> the example above). Personally, I prefer this approach.
>>>
>>> In summary, I'd like 2 things:
>>> * fine-grained control over extensions, and validation of the 
>>> combination "language + enabled extensions". This could be done by 
>>> introducing a new "--extensions" option, which takes a list of 
>>> extensions & disables all others.
>>> * a way to specify options in pure ECMAScript (i.e. by interpreting 
>>> a comment in a specific format at the beginning of the script) 
>>> inside the script itself, whereby these options are used to load the 
>>> remainder of the script, regardless of any "outside-specified" options.
>>>
>>> What's your opinion on this?
>>>
>>> Kind regards,
>>> Anthony
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8144139
>>>
>
>
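
For illustration, the "//jjs" directive idea from the original proposal 
could be prototyped roughly as follows. This is only a sketch in plain 
JavaScript, not how Nashorn handles options: parseDirective, validate, 
KNOWN_EXTENSIONS and PREFIX are hypothetical names, the extension names 
are simply the ones used in the examples above, and treating an omitted 
"--extensions" as "no extensions" is an assumption.

```javascript
// Hypothetical sketch; none of these names exist in Nashorn itself.
// Extension names are the ones used in the examples of the proposal.
var KNOWN_EXTENSIONS = ["scripting", "foreach", "shebang", "mozilla", "execbackquote"];
var PREFIX = "//jjs ";

// Parse a leading "//jjs " comment into { language, extensions }.
// Returns null when the script does not start with such a comment.
function parseDirective(source) {
  var firstLine = source.split(/\r?\n/, 1)[0];
  if (firstLine.indexOf(PREFIX) !== 0) {
    return null;
  }
  // Assumption: an omitted "--extensions" is treated as "no extensions".
  var directive = { language: null, extensions: [] };
  var tokens = firstLine.slice(PREFIX.length).split(/\s+/).filter(Boolean);
  tokens.forEach(function (token) {
    var match = /^--(language|extensions)=(.*)$/.exec(token);
    if (!match) {
      throw new Error("Unsupported directive option: " + token);
    }
    if (match[1] === "language") {
      directive.language = match[2];
    } else {
      // "--extensions=" (empty list) disables every extension.
      directive.extensions = match[2] === "" ? [] : match[2].split(",");
      directive.extensions.forEach(function (name) {
        if (KNOWN_EXTENSIONS.indexOf(name) < 0) {
          throw new Error("Unknown extension: " + name);
        }
      });
    }
  });
  return directive;
}

// Apply the two validation rules from the proposal to a parsed directive.
function validate(directive) {
  var report = { warnings: [], errors: [] };
  var enabled = function (name) {
    return directive.extensions.indexOf(name) >= 0;
  };
  if (directive.language === "es6" && enabled("foreach")) {
    report.warnings.push("\"for each\" is enabled, but ES6 already provides \"for of\"");
  }
  if (directive.language === "es6" && enabled("execbackquote")) {
    report.errors.push("back-quote exec expressions are ambiguous with ES6 template literals");
  }
  return report;
}
```

Applied to the two scripts from the example, parseDirective would report 
language "es5" with only "execbackquote" enabled for find.js, and 
language "es6" with no extensions for library.js, so each file could be 
parsed under its own settings.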