PROPOSAL: Language Escape Operator

Fri Mar 27 04:11:33 PDT 2009

The backtick, on a standard US 101-key keyboard, has to be typed by
awkwardly bending your left pinky, traditionally the least flexible
typing finger. At least, that holds for the ten-fingers-on-the-home-row
kind of typist. As Stefan pointed out, on international keyboards it's
akin to using the umlaut (without any character under it) as a symbol.

However, that's not really my issue with this proposal. This is:

Who ever types or sees that backtick?

1. The source file - but, no. It doesn't even end in .java, there is  
no way that source could possibly be mistaken for java code.

2. The programmer - but, no. I would never use a new language (even if
it's folding sugar on vanilla java) if it requires me to type a
gazillion backticks. I wouldn't do it even if it were a nice, easily
typed symbol like, say, the semicolon. Turning my code into perl cartoon
swearing surely cannot be the point. Even if I never see them (because
the folder folds them away!), I don't want to hit a special 'I want
sugar' modifier key; that breaks the flow just as much as a cmd-key
combo. Also, you hit shift routinely as you are 'just typing', so
what's the problem with hitting CMD+X or whatever key such a folding
IDE comes up with?

3. The parser - but, no. A code-folding view on the AST would never be  
in a position to parse such a thing.

Another problem:

Many non-java languages already need the backtick, such as scala,
which uses it to mean: the contents of the backticks are an
identifier, even if it looks like a keyword. This allows you to call
Thread.yield() in scala like so: Thread.`yield`(), because 'yield' in
scala is a keyword, and therefore Thread.yield() would not invoke the
method (it is instead a syntax error). Java has only 2 common keyboard
symbols left, the backtick and the hash, and taking one of them is a
big deal. I am not convinced it's worth it, mostly because of the above
problems.
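
As a rough sanity check of the 'only 2 symbols left' claim, here is a
minimal sketch that walks the printable ASCII range and drops every
character java source already gives a meaning to. The USED list is my
own hand-made transcription of java's separators, operators, quote
characters, the backslash used in escapes, and the identifier
characters $ and _; it is an assumption, not something generated from
the JLS. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class UnusedSymbols {
    // Assumption: hand-made list of punctuation that java source already uses
    // (separators, operators, quotes, backslash escapes, and $ / _ in identifiers).
    static final String USED = "(){}[];,.@:=><!~?+-*/&|^%\"'\\$_";

    // Returns the printable ASCII punctuation characters not found in USED.
    static List<Character> unusedSymbols() {
        List<Character> unused = new ArrayList<Character>();
        for (char c = 33; c < 127; c++) {
            if (Character.isLetterOrDigit(c)) {
                continue; // letters and digits are identifier/literal material
            }
            if (USED.indexOf(c) < 0) {
                unused.add(c);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        System.out.println(unusedSymbols());
    }
}
```

Under that assumption the only survivors are the hash and the
backtick, which is why spending either one is a big deal.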

The only benefit I can see is that, IF the code-folding AST view
editor does use the backtick as some kind of escaping key, future
java expansions cannot all of a sudden turn previous syntax
sugar into legal actual java. But this really doesn't sound very good
either, because:

1. The backtick really really needs to be in the source, so that if I  
copy and paste, it sticks around. A view on the AST could dynamically  
add it to the text buffer when you copy and paste, but this is  
introducing rather a lot of voodoo. There's also no way for different  
AST node view tools t

2. Odds are that a very useful and often used desugaring in a popular  
AST view editor (which we don't have yet, but hopefully someone will  
take the time to write one someday), will end up in java itself.  
Except it can't, by definition: That backtick is in the way.

3. If, other than the backtick, the syntax is the same as a real java
syntax, or at least so similar that key-by-key parsing is going to
have issues, then the AST view desugaring syntax *needs to go away*.
That's a point I made before but you haven't answered it yet: One of
the points of AST views is that you can just change everything without
breaking backwards compatibility. The only thing you're breaking is
your programmers' training, because they have to learn a different
syntax - which they have to do anyway, because they're learning to use
a new java version which shipped with at least 1 non-trivial language
change (after all, something that used to be illegal in java but legal
as an AST view syntax is now legal java!). In other words, I consider
the penalty for not having a language escape operator very low as is.

Take that last point, and combine it with the notion that I thoroughly  
do not look forward to hitting ` all the time, and you see why I don't  
like this proposal.

This is what I would type in an AST view language:

private     property int foo;

and this is what it should write to disk

@TypedDifferently(original="private     property int foo;")
private int foo;

@GeneratedBy("#foo") public Constructor(int foo) {
    this.foo = foo;
}

@GeneratedBy("#foo") public int getFoo() {
    return foo;
}

@GeneratedBy("#foo") public void setFoo(int foo) {
    this.foo = foo;
}

@GeneratedBy("#foo") public void addFooListener(.....

(you get the point)

but if I load this file, -with or without annotations in it-, I want  
to see:

private     property int foo;

-though- I'll accept that it might give me:

property private int foo;

if the @TypedDifferently annotation is no longer there.
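
To make that round trip a bit more concrete, here is a minimal sketch
using the annotation names from the example above. Everything in it is
hypothetical: neither annotation exists anywhere, the Widget,
PlainWidget and AstViewDemo names are made up, and a real AST view
editor would read the source file rather than use reflection. The
annotations get RUNTIME retention here only so that the sketch can be
run and checked.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical: records the text the programmer actually typed.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface TypedDifferently {
    String original();
}

// Hypothetical: marks a member as generated from another member, e.g. "#foo".
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.CONSTRUCTOR, ElementType.METHOD})
@interface GeneratedBy {
    String value();
}

// What would be written to disk, including the original text.
class Widget {
    @TypedDifferently(original = "private     property int foo;")
    private int foo;

    @GeneratedBy("#foo") public Widget(int foo) { this.foo = foo; }
    @GeneratedBy("#foo") public int getFoo() { return foo; }
    @GeneratedBy("#foo") public void setFoo(int foo) { this.foo = foo; }
}

// The same field after the annotation has been stripped.
class PlainWidget {
    private int foo;
}

class AstViewDemo {
    // Returns the text an AST view editor could show for the given field:
    // the original text if it was kept, otherwise a normalized reconstruction.
    static String sugaredView(Class<?> type, String fieldName) {
        Field field;
        try {
            field = type.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no field " + fieldName, e);
        }
        TypedDifferently typed = field.getAnnotation(TypedDifferently.class);
        if (typed != null) {
            return typed.original(); // layout and keyword order preserved
        }
        // Layout is gone; rebuild something equivalent from the declaration.
        return "property " + Modifier.toString(field.getModifiers()) + " "
                + field.getType().getSimpleName() + " " + field.getName() + ";";
    }

    public static void main(String[] args) {
        System.out.println(sugaredView(Widget.class, "foo"));
        System.out.println(sugaredView(PlainWidget.class, "foo"));
    }
}
```

With the annotation present the view gets back exactly what was typed,
whitespace included; without it, the best the view can do is the
normalized 'property private int foo;' form.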

I don't want to type backticks. I don't want to see backticks.

A proposal to make *THIS* work well would be great. I actually tried  
to write up the specs for an AST node editing language, and it's very  
very difficult. One of the issues you run into is that you cannot
save layout at all - you need to save the code to a java file, in a  
different structure, so it becomes almost impossible to preserve the  
whitespace, or things like order between keywords, or ordering between  
methods, when you read it back in. I thought of magic comments that  
keep the original syntax around, but that makes the java code ugly and  
not very accessible for people who are not using your particular AST  
desugaring engine.

Something standardized there that links code to the originally typed
stuff (as a comment or string literal) would allow vanilla java IDEs
to not render that comment at all (their own little AST node view :P)
and to notice that a user is editing a generated block, and e.g. show
a little warning. Alternatively they could offer a view of the
originally typed code. A vanilla java editor may not understand that
code, though the raw English text in the source would likely carry
some meaning of intent, probably more so than the generated code.
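
A minimal sketch of what a vanilla IDE would have to do with such a
magic comment, assuming a made-up marker '//@ast-original:' on its own
line above the generated code. No such standard exists; the marker,
the class name and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MagicComments {
    // Hypothetical marker; stands in for whatever format IDEs would agree on.
    static final String MARKER = "//@ast-original:";

    // Collects the originally typed snippets, in file order.
    static List<String> originals(String source) {
        List<String> found = new ArrayList<String>();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(MARKER)) {
                // Drop at most one separating space so that the whitespace
                // inside the typed text survives untouched.
                String rest = trimmed.substring(MARKER.length());
                found.add(rest.startsWith(" ") ? rest.substring(1) : rest);
            }
        }
        return found;
    }

    // What a vanilla IDE could render: the same source with marker lines hidden.
    static String withoutMarkers(String source) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n")) {
            if (!line.trim().startsWith(MARKER)) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source =
            "class Widget {\n" +
            "    //@ast-original: private     property int foo;\n" +
            "    private int foo;\n" +
            "    public int getFoo() { return foo; }\n" +
            "}\n";
        System.out.println(originals(source));
        System.out.print(withoutMarkers(source));
    }
}
```

The file stays legal vanilla java either way; an IDE that knows the
marker can hide it or show the original text, and one that doesn't
simply shows a comment.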

Such a proposal wouldn't necessarily fit in the JLS; it's just a
standardized format that is also legal vanilla java (by using magic
comments, probably), which all IDEs agree on as something they'll
eventually detect. If you want special syntax for this, that would be
fine too, but it's okay if it's verbose. You could introduce a keyword
such as 'ast-node-view', which is so rare that I really doubt anyone's
going to run into an issue with that no longer being a legal
identifier. There's so much you can do - the backtick seems like one
of the worst choices.

  --Reinier Zwitserloot

On Mar 27, 2009, at 04:35, Reinier Zwitserloot wrote:

> The problem then is: When does the special magic end?
>
> If there is no end, then this is no different from a 'language'  
> indicator at the top.
>
> I am very much in favour of editing a vanilla java AST that involves  
> lots of sugaring, but I don't think such an operator is necessary
> or even useful for it; all those backticks would become mighty ugly  
> fast. Also, because you're doing AST editing, those backticks aren't  
> anywhere on disk, so they become a keystroke to indicate: The next  
> few things I'm going to type are definitely intended as non-java  
> sugar that you should java-ize for me.
>
> Two things come to mind:
>
> 1) That's what cmd/ctrl/alt is for. Typing a backtick on most
> keyboards is very difficult; I'd rewire a cmd+something to generate
> it instead, easier on the fingers.
>
> 2) *any* IDE that is going to do this correctly also needs to know  
> all about java. Therefore, if an IDE can only handle up to java6  
> (+AST-based sugaring), and you want to type something java7, which  
> so happens to also be legal sugaring, then - that's only a minor  
> pain. You can't use java7 in this IDE, it doesn't know the syntax.  
> It's unfortunate that this AST sugar now needs to find another  
> syntax, but isn't that part of the point of AST sugaring? Ease of  
> switching around? I really doubt any kind of AST sugaring system is  
> going to make 'start with a backtick' a prerequisite.
>
>
> --Reinier Zwitserloot
>
>
>
> On Mar 27, 2009, at 04:13, brucechapman at paradise.net.nz wrote:
>
>> Quoting Reinier Zwitserloot <reinier at zwitserloot.com>:
>>
>>> Isn't it easier to do this with a keyword?
>>>
>>> Right at the top of your java file (after all, a non-java language
>>> would work at least on CompileUnits, if not entire Modules/Packages/
>>> directories), put something like:
>>>
>>> language groovy;
>>>
>>> some sort of guarantee by java, in the vein of your proposal, that
>>> java will never make:
>>>
>>> language (anything-but-"java", literally);
>>>
>>> legal. That's not to say that "language java;" at the top WOULD be
>>> legal, just that "language foo;" at the top would NEVER be legal.
>>>
>>> --Reinier Zwitserloot
>>>
>>
>> I think you misunderstood the intent (or I have misunderstood you -
>> either way - my fault).
>>
>> It is NOT about mixed language applications at the file level (file
>> extensions do a fine job of that), but about having mixed content
>> within a single source file, down to the granularity of statement and
>> expression level and possibly even finer.
>>
>> The point is that a tool can interpret the language escape operator
>> and do something different with the content. By the time it gets to
>> the compiler the language escape operator should have disappeared
>> because the java compiler doesn't handle mixed language files. That's
>> why I describe the behaviour as raising a compiler error.
>>
>> Many of the other coin proposals define themselves purely as a syntax
>> and desugaring of that to more verbose java code. Such desugarings can
>> be done as a view element within an IDE (and code folded back to the
>> sugared form in the view - the model is still bog standard Java). This
>> proposal offers a low impact indicator to the IDE that what follows is
>> something that is not Java, so please treat it specially (for example
>> by recognizing the syntax and desugaring). That is one use of a
>> language escape.
>>
>> Another slant:
>>
>> JSP embeds java code within XML so that various parts of a web page
>> can be coded with an appropriate syntax. That doesn't need a java
>> language escape character because the java is inside the XML and XML
>> itself and the JSP schema defines which bits are java. But if you
>> invert that model and want to put something that isn't Java inside
>> something that is primarily Java, then the escape operator is the
>> mechanism that the code author and tool can use to indicate and detect
>> the start of the non-java stuff.
>>
>>
>> For further background check out my blog
>> http://weblogs.java.net/blog/brucechapman/archive/2009/03/meta_coinage_ro_1.html
>>
>> and the slides from my JUG presentation
>> http://weblogs.java.net/blog/brucechapman/archive/JUG%20Aug%202008%20Mark%20II%20Pecha%20Kucha.ppt
>>
>> And if you have Netbeans, try out the proof of concept module
>> (reference at end of proposal) for using a language escape operator to
>> do properties. I really must do one of those sexy screencast thingies
>> showing that plugin in use so that people don't need to install it in
>> order to experience it. Geertjan, can you help?
>>
>> Apologies if I am not doing a good job of getting this concept across.
>>
>> Bruce
>>
>>
>