Nashorn parser directives

Fri Nov 27 15:50:12 UTC 2015

Hi

Recently I filed an issue [1] requesting to "Allow to run shebang script 
with --no-syntax-extensions". This was closed as "Won't Fix" with "Can't 
support because # is an extension." While this is true, I meant to say 
"Allow to run shebang script without allowing any other syntax 
extensions than shebang". Or more generally: allow fine-grained control 
over syntax extensions.

Currently, there are only 2 switches that influence syntax extensions: 
--no-syntax-extensions and -scripting. However, the nature of the 
various extensions is totally disparate: allowing "for each" expressions 
has nothing to do with allowing a shebang. Therefore, it should be 
possible to enable/disable them individually. Additionally, there could 
be a few groups defined: "-scripting" is already one such group, and 
"-mozilla" could be another.

So I would like to be able to specify options such as:
--language=es5 --extensions=scripting,foreach
--language=es6 --extensions=shebang,mozilla
where the "--extensions" option implicitly disables all non-specified 
extensions.
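
To make the intended semantics concrete, here is a minimal sketch of how such an option value could be resolved. Everything in it is an assumption for illustration: neither the "--extensions" option nor the extension names and group memberships below exist in Nashorn today.

```javascript
// Hypothetical extension names; Nashorn defines no such identifiers today.
var KNOWN_EXTENSIONS = ["shebang", "hashcomment", "execbackquote", "heredoc", "foreach"];

// Hypothetical groups; the membership shown here is only a guess.
var GROUPS = {
    scripting: ["shebang", "hashcomment", "execbackquote", "heredoc"],
    mozilla: ["foreach"]
};

// Turns the value of the proposed "--extensions=" option into a map of
// extension name -> enabled flag. Anything not listed stays disabled.
function resolveExtensions(value) {
    var enabled = {};
    KNOWN_EXTENSIONS.forEach(function (name) { enabled[name] = false; });

    value.split(",").forEach(function (token) {
        var name = token.trim();
        if (name === "") { return; } // "--extensions=" enables nothing
        var members = GROUPS.hasOwnProperty(name) ? GROUPS[name] : [name];
        members.forEach(function (member) {
            if (!enabled.hasOwnProperty(member)) {
                throw new Error("Unknown extension: " + member);
            }
            enabled[member] = true;
        });
    });
    return enabled;
}
```

With these assumed definitions, resolveExtensions("shebang,mozilla") enables only "shebang" and "foreach", and resolveExtensions("") enables nothing.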

These options should also be validated:
* warnings should be emitted when the specified language provides an 
alternative for some enabled extension. For example, in ES6, "for of" is 
available as a standard alternative for "for each".
* errors should be emitted when there's an ambiguity. For example, in 
ES6, when "Back-quote exec expressions" are enabled, the statement:
var cmd = `format C:`;
is both ambiguous and potentially disastrous if the author meant it as a 
template literal, while Nashorn interprets it as a command.
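
The two validation rules above could be sketched roughly as follows. This is only an illustration of the idea: the function, the language identifiers ("es5", "es6") and the extension names ("foreach", "execbackquote") are hypothetical and not part of Nashorn.

```javascript
// Checks a hypothetical "language + enabled extensions" combination.
// `extensions` is an array of (assumed) extension names.
function validateOptions(language, extensions) {
    var result = { warnings: [], errors: [] };
    if (language === "es6") {
        // The language already offers a standard alternative: warn only.
        if (extensions.indexOf("foreach") !== -1) {
            result.warnings.push(
                '"for each" is enabled, but ES6 provides "for of" as a standard alternative');
        }
        // Back-quotes would have two meanings: refuse the combination.
        if (extensions.indexOf("execbackquote") !== -1) {
            result.errors.push(
                "back-quote exec expressions are ambiguous with ES6 template literals");
        }
    }
    return result;
}
```

Under these assumptions, validateOptions("es6", ["execbackquote"]) reports one error, while the same extension combined with "es5" passes without complaint.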

The above also demonstrates why more fine-grained control is required: 
certain extensions are obsolete and/or cause ambiguities, while others 
are extremely useful. Even the "Shell script style hash comments" 
extension is too coarse-grained: it should be split into a "shebang" and 
a "hash comment" extension, since there's no good reason to use hash 
comments other than to specify the shebang. In my opinion, it should be 
possible to run a shebang script with something like "jjs 
--extensions=shebang myscript.js".
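
As a rough sketch of what a standalone "shebang" extension could mean: only a "#!" at the very start of the source is treated as a comment, and any other "#" is left for the parser to reject. The function name is hypothetical and this is not how Nashorn currently handles hash comments.

```javascript
// Removes a leading "#!" line, if present, and nothing else.
// The line terminator is kept so that line numbers in error messages
// still match the original file.
function stripShebang(source) {
    if (source.substring(0, 2) !== "#!") {
        return source; // no shebang at offset 0: leave the source untouched
    }
    var end = source.search(/\r?\n/);
    return end === -1 ? "" : source.substring(end);
}
```

A line such as "# just a comment" is deliberately left untouched here, so it would still be a syntax error unless the separate "hash comment" extension were enabled.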

Besides more fine-grained control, there should also be a way to specify 
options inside the script, no matter how it's invoked. In my opinion, 
this is very useful because the script author always knows best how a 
script must be interpreted and which extensions were/weren't used.

For example, suppose I wrote the following shell script named "find.js":
//jjs --language=es5 --extensions=execbackquote
load("library.js")
print(`grep abc file`);

where library.js is a library written by someone else, in which template 
literals are used (note that "--extensions=" has the same effect as 
"--no-syntax-extensions"):
//jjs --language=es6 --extensions=
var cmd = `format C:`;

then by being able to specify the necessary parser directives within the 
script, Nashorn would be able to correctly run "find.js" (i.e. without 
executing the `format C:` inside library.js). Personally, I think any 
parser directives such as language version and extensions should always 
be embedded within the script, and I don't really see a need to specify 
them on the command-line. As for implementation: this can be done either by:
* always interpreting the shebang. However, this is not 
ECMAScript-compliant.
* providing a "pure ECMAScript" alternative. For example, if the script 
starts with "//jjs ", then it's interpreted by Nashorn (as in the 
example above). Personally, I prefer this approach.
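
For the second approach, reading the directive could look something like the sketch below. The "//jjs " prefix and the "--name=value" option syntax are taken from the examples above; the function itself is hypothetical, and it assumes the directive must be the very first line of the script.

```javascript
// Extracts options from a leading "//jjs " comment.
// Returns null if the script does not start with such a directive.
function parseDirective(source) {
    var firstLine = source.split(/\r?\n/)[0];
    var prefix = "//jjs ";
    if (firstLine.indexOf(prefix) !== 0) {
        return null;
    }
    var options = {};
    firstLine.substring(prefix.length).split(/\s+/).forEach(function (token) {
        // Tokens that are not of the form --name=value are ignored in this sketch.
        var match = /^--([a-z]+)=(.*)$/.exec(token);
        if (match) {
            options[match[1]] = match[2];
        }
    });
    return options;
}
```

For the "library.js" example above, this would yield language "es6" and an empty extensions list, which the loader could then apply to the remainder of that file only, independently of the options in effect for "find.js".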

In summary, I'd like 2 things:
* fine-grained control over extensions, and validation of the 
combination "language + enabled extensions". This could be done by 
introducing a new "--extensions" option, which takes a list of 
extensions & disables all others.
* a way to specify options in pure ECMAScript (i.e. by interpreting a 
comment in a specific format at the beginning of the script) inside the 
script itself, whereby these options are used to load the remainder of 
the script, regardless of any "outside-specified" options.

What's your opinion on this?

Kind regards,
Anthony

[1] https://bugs.openjdk.java.net/browse/JDK-8144139