ARM syntax and new keywords

Thu Nov 26 06:31:21 PST 2009

A pretty long post I'm afraid, but I hope it is contributing something of
value.

I've been following the discussions about ARM and a question about the
syntax has been niggling at me: wouldn't it be better to use a new keyword?

Now as I understand it there is considerable resistance to the idea of new
keywords, and with reason, but I feel that the associated problems have a
reasonable solution and that it is a mistake to refuse to at least consider
using keywords. If the resulting feature is much simpler and clearer using a
keyword then why introduce some new syntax?

As a motivation, consider ARM if we allow new keywords.

Create, say, a new keyword 'autoclose'. This would be a new modifier on
local variable declarations with much the same semantics as the current
proposal; in particular that when the variable goes out of scope there will
be a suitably exception protected call to a close routine. In general the
keyword would take a 'parameter' giving the name of the close routine, thus:

autoclose(close) InputStreamReader in = new FileReader("filename");

The declared type must have a method of the given name that can be called
with no args. The declaration would be desugared in much the same way as
the following would be under the current proposal:

try (InputStreamReader in = new FileReader("filename")) {
statements_to_the_end_of_scope_of_in
}
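
To make the "suitably exception protected call" a bit more concrete, here is a rough hand-written sketch of the kind of code such a declaration might expand to. It is only my guess at a reasonable expansion, not the proposal's actual desugaring. Resource is a hypothetical stand-in for FileReader so that the example runs without a file, and scope() plays the part of "the rest of the scope of the variable". It relies on Throwable.addSuppressed, so it needs Java 7 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: approximates what
//     autoclose(close) Resource r = new Resource(...);
// might be expanded to. All names here are hypothetical.
class AutocloseDesugarSketch {

    // Hypothetical resource that records what happens to it.
    static class Resource {
        final List<String> log = new ArrayList<>();
        final boolean failOnClose;

        Resource(boolean failOnClose) { this.failOnClose = failOnClose; }

        void use() { log.add("use"); }

        void close() {
            log.add("close");
            if (failOnClose) throw new IllegalStateException("close failed");
        }
    }

    // Stands for the statements up to the end of the scope of r.
    // failInBody simulates an exception thrown by those statements.
    static void scope(Resource r, boolean failInBody) {
        Throwable primary = null;
        try {
            r.use();
            if (failInBody) throw new IllegalArgumentException("body failed");
        } catch (Throwable t) {
            primary = t;   // remember the exception from the body
            throw t;       // and let it propagate unchanged
        } finally {
            if (primary != null) {
                // Body failed: a failure in close() must not hide it.
                try {
                    r.close();
                } catch (Throwable onClose) {
                    primary.addSuppressed(onClose);
                }
            } else {
                // Body succeeded: a failure in close() propagates normally.
                r.close();
            }
        }
    }

    public static void main(String[] args) {
        Resource r = new Resource(false);
        scope(r, false);
        System.out.println(r.log);
    }
}
```

The point of the sketch is that the close call happens on every exit path, and that an exception from the body is never replaced by one from the close routine.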

There should also be a simplified form without the (close) for the common
cases. There are different ways that this could be defined: either using
marker interfaces as currently envisaged, or using a name based scheme. For
example, if the declared type has exactly one method with name "close",
"dispose", "free", "release" or "unlock" that can be called with no args,
use that, otherwise it's an error not to specify the name explicitly. This
is slightly nasty in that it puts those names into the language spec, but
then again only as defaults, and it has the advantages that it is simple to
understand, simple to use and will do what's expected. It would create the
restriction that it would be a breaking change to add a second method from
the set to an existing library class, but to me that doesn't seem much of a
burden. It wouldn't require any reworking of the library at all - no new
marker interfaces etc. So the above example would become:

autoclose InputStreamReader in = new FileReader("filename");
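
The name based rule is simple enough to sketch with reflection. The following is only an illustration of the lookup, under some assumptions of mine: a real compiler would work on the declared type at compile time rather than on a Class object, would take accessibility into account, and would have to decide whether varargs methods count as "callable with no args". The class and method names are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: picks the default close routine by name.
class DefaultCloseNameSketch {

    // The default names suggested in the text.
    static final List<String> DEFAULT_NAMES =
            Arrays.asList("close", "dispose", "free", "release", "unlock");

    // Returns the single matching name, or throws if the declared type has
    // none or more than one, in which case autoclose(name) would be required.
    // Simplification: only public methods with an empty parameter list count.
    static String resolve(Class<?> declaredType) {
        List<String> found = new ArrayList<>();
        for (String name : DEFAULT_NAMES) {
            try {
                Method m = declaredType.getMethod(name); // public, no parameters
                found.add(m.getName());
            } catch (NoSuchMethodException e) {
                // The type has no such method; try the next name.
            }
        }
        if (found.size() != 1) {
            throw new IllegalArgumentException(
                    "close routine must be named explicitly, candidates: " + found);
        }
        return found.get(0);
    }

    // Hypothetical type with two candidates, so the short form is an error.
    static class Ambiguous {
        public void close() { }
        public void release() { }
    }

    public static void main(String[] args) {
        System.out.println(resolve(java.io.InputStreamReader.class));
    }
}
```

Ambiguous shows the compatibility restriction mentioned above: once a type has two methods from the set, the short form no longer resolves.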

This feature could also be used simply for user defined types, for example:

autoclose(bin) UserBinnable ub = new UserBinnable();

To my eyes it is much clearer than the proposed syntax (not that I find that
unacceptable, but I think this is better). It's so simple to use that it
will become second nature to use it every time, even in throwaway code. It
even has a good chance of being correctly interpreted by someone who doesn't
know the feature - surely an indication of a good syntax.

It has the difference relative to the proposed syntax that the programmer
isn't forced to explicitly show the end of scope via {}, though there is
nothing to stop it (by putting the declaration at the start of a new block).
The presence of the word 'try' can also be considered an advantage of the
current approach (reminding you that there is some exception magic going
on), but personally I don't find this compelling.

It could be extended to the foreach case that has been recently discussed.
If you allow its use just after the : to indicate that the created iterator
needs closing:

for (Object o: autoclose somethingIterable)...

which would be expanded as detailed in the current JLS but with autoclose
added to the iterator declaration. This would require
somethingIterable.iterator() to return a subtype of Iterator that had a
close, dispose etc method, otherwise it's an error. The fact that it is
explicit and compiler checked strikes me as a big advantage over the
previously discussed idea of just quietly doing it if the passed type
permits it. Additionally, if the passed type is changed to no longer support
closing then the code no longer compiles, rather than quietly changing its
functionality.
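
For the foreach case, the expansion I have in mind looks roughly like the hand-written loop below. It is a sketch under assumptions: ClosingIterator and Lines are hypothetical types invented for the example, and the exception handling is reduced to a plain try/finally to keep it short. It needs Java 8 or later only because the anonymous iterator relies on the default Iterator.remove.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: approximates what
//     for (String s : autoclose lines) sb.append(s);
// might be expanded to.
class AutocloseForeachSketch {

    // Hypothetical subtype of Iterator that has a close method.
    interface ClosingIterator<T> extends Iterator<T> {
        void close();
    }

    // Hypothetical Iterable whose iterator() returns the closeable subtype.
    static class Lines implements Iterable<String> {
        final List<String> data = Arrays.asList("a", "b");
        int closeCount;

        @Override
        public ClosingIterator<String> iterator() {
            final Iterator<String> it = data.iterator();
            return new ClosingIterator<String>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public String next() { return it.next(); }
                @Override public void close() { closeCount++; }
            };
        }
    }

    static String concat(Lines lines) {
        StringBuilder sb = new StringBuilder();
        // The usual foreach expansion, with the iterator closed at the end.
        ClosingIterator<String> it = lines.iterator();
        try {
            while (it.hasNext()) {
                String s = it.next();
                sb.append(s);
            }
        } finally {
            it.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(new Lines()));
    }
}
```

If Lines.iterator() were changed to return a plain Iterator, the declaration of it would stop compiling, which is the compiler checked behaviour argued for above.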

Anyway, enough of the ARM case. On to the subject of new keywords.

Obviously, the above syntax is only worthwhile if the cost of introducing
new keywords isn't too high. At the moment it famously is, of course. So,
'source' revisited, but source light.

I totally agree that having the meaning of syntax change so that you can't
tell what code is doing without looking at the file header is only to be
tolerated in extreme cases. However, I don't think this applies with new
keywords. If you are looking at source code in an IDE then you get syntax
highlighting. This will tell you right away if a name is being treated as an
identifier or as a keyword. If you type the name into your code then you
will know that you haven't written an identifier but a keyword (or vice
versa).

So, the suggested solution is that a contextual keyword called 'keywords'
should be permitted as the first thing in a file, in which case it is
followed by a version string which must correspond to a version of the JLS
(or possibly java version - TBD). The meaning of this is that the list of
keywords should be that from the referenced version of the JLS. If a name
which is a keyword in the latest version of the language (eg 'autoclose')
appears in the source, but is not present in the version of the JLS given in
the keywords specification, then it will be treated as an ordinary
identifier rather than a keyword for this file. If no keywords specification
appears at the start of the source file then "keywords 3.0;" is assumed.
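
A lexer could implement this with little more than a table lookup. The sketch below is only an illustration: the table is deliberately tiny, the version "4.0" for 'autoclose' is made up, and treating 'try' and 'for' as belonging to "3.0" just means "already a keyword in the baseline". All class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: decides whether a name is a keyword for a given source file.
class KeywordVersionSketch {

    static final String DEFAULT_VERSION = "3.0";

    // Matches a specification such as "keywords 4.0;" at the very start of a file.
    static final Pattern SPEC =
            Pattern.compile("\\A\\s*keywords\\s+([0-9]+(?:\\.[0-9]+)*)\\s*;");

    // Keyword -> version in which it became a keyword (illustrative values).
    static final Map<String, String> INTRODUCED = new HashMap<>();
    static {
        INTRODUCED.put("try", "3.0");
        INTRODUCED.put("for", "3.0");
        INTRODUCED.put("autoclose", "4.0"); // hypothetical future version
    }

    // Version named at the start of the source, or the default if absent.
    static String fileVersion(String source) {
        Matcher m = SPEC.matcher(source);
        return m.find() ? m.group(1) : DEFAULT_VERSION;
    }

    // Compares dotted version strings numerically, component by component.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }

    // A name is a keyword only if it was introduced at or before the file's version.
    static boolean isKeyword(String name, String version) {
        String since = INTRODUCED.get(name);
        return since != null && compareVersions(since, version) <= 0;
    }

    public static void main(String[] args) {
        String oldFile = "class A { int autoclose; }";
        System.out.println(isKeyword("autoclose", fileVersion(oldFile)));
    }
}
```

Note that the version only selects which names are keywords. It says nothing about what a keyword means, which is the difference from 'source'.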

This provides many of the advantages of 'source' without most of the
downsides. The JLS doesn't need to maintain multiple versions. If you add
'autoclose' say, then you specify what it does and add it to the new list of
keywords. The functionality is still, in some senses, always present in the
language, it is simply that if a source file specifies a version preceding
the version it was added to then the name autoclose refers to an identifier
rather than the new feature. It doesn't present language designers with the
temptation to modify its behaviour later. The keywords version merely tells
you if it is in or out, it doesn't give you information about how to
interpret it.

IDEs should be able to refactor this relatively easily (just checking that
no new keywords are currently used as identifiers in the file when altering
the 'keywords' specifier). They should be able to do the syntax highlighting
pretty straightforwardly as well.

It does create one problem that I can see: if you are writing code using the
new keywords how do you access a member called, for example, autoclose, in
an existing class? I suppose one answer could be to require the creation of
a glue class to rename access to the member, but this isn't very elegant
(but then again if new keywords are chosen carefully there shouldn't be very
many of these cases). Another possibility would be to allow individual
blocks to be reverted to an earlier keyword version by tagging them with a
local 'keywords' specifier, but this seems a bit heavy handed. Probably the
best way would be if current proposals to create a syntax for non-standard
identifiers are implemented. Then this could be used to force the new
keyword to be treated as an identifier in the given context.

Independently of whether the ideas above concerning ARM get any support, I
feel that discussions about the language would benefit from the freedom to
at least consider new keywords. I hope this post will encourage this by
pointing out that the compatibility problems of introducing new keywords can
be easily eliminated.

Jonty