Update on String Templates (JEP 459)

Wed Mar 13 22:22:20 UTC 2024

On 13 Mar 2024, at 13:13, Guy Steele wrote:

> … Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write
>
> SQL.process($”CREATE TABLE foo;”);
> SQL.process($”ALTER TABLE foo ADD name varchar(40);”);
> SQL.process($”ALTER TABLE foo ADD title varchar(30);”);
> SQL.process($”INSERT INTO foo (name, title) VALUES (‘Guy’, ‘Hacker’);”);
> SQL.process($”INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});”);
>
> than
>
> SQL.process(ST.of(”CREATE TABLE foo;”));
> SQL.process(ST.of(”ALTER TABLE foo ADD name varchar(40);”));
> SQL.process(ST.of(”ALTER TABLE foo ADD title varchar(30);”));
> SQL.process(ST.of(”INSERT INTO foo (name, title) VALUES (‘Guy’, ‘Hacker’);”));
> SQL.process(”INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});”);

OK, yes.  I think a simpler example is needed to answer my question more fully.  In this example, the name “foo” is given as a literal.  But, even if only as a workaround, it would probably not hurt code like that to quote such a name as an argument.  So:

> var foo = “foo”;  // or static final String FOO = “foo”;
> SQL.process(”CREATE TABLE \{foo};”);
> SQL.process(”ALTER TABLE \{foo} ADD name varchar(40);”);
> SQL.process(”ALTER TABLE \{foo} ADD title varchar(30);”);
> SQL.process(”INSERT INTO \{foo} (name, title) VALUES (‘Guy’, ‘Hacker’);”);
> SQL.process(”INSERT INTO \{foo} (name, title) VALUES (\{other name}, \{other job});”);

And it’s not just a workaround here, it’s arguably better style (D.R.Y.) to factor out the name foo that links everything together.  Non-support of no-arg STs would possibly push users towards a more D.R.Y. style, possibly a good thing.

I think such multi-command examples, in many little languages, will tend to have some term like foo shared across phrases.  The very small example I’m looking for would ideally be non-factorable, just a little string with not much substructure.  Because if it’s factorable, then maybe the user should just factor it, and then it’s a ST with arguments.  And if it’s not factorable, then maybe it is some stand-alone thing that won’t be harmed by making it a canned constant, or making it via a factory method, or making it a true string which is introduced into the processor by other means.

Not all languages offend against D.R.Y. as much as SQL.  A more contextual stateful little language, like Forth or Turtle graphics or PostScript, might have lots of little fixed commands (like “left” for a turtle).  When we work with such little languages we sometimes have lots of static final strings to help us find the commands and spell them correctly.  (Like static final String LEFT = “left” in class TurtleGraphics.)  That would maybe morph into lots of static final STs?
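To make that last question concrete, here is a minimal sketch of what the morphing might look like.  Everything in it is hypothetical: ST is a stand-in record of my own (not the JEP 459 API), and ST.of, LEFT_ST, FORWARD_ST and run are names invented for illustration.

```java
import java.util.List;

// Hypothetical stand-in for a string template: N+1 literal fragments around N values.
record ST(List<String> fragments, List<Object> values) {
    // Factory for a canned template with no embedded values.
    static ST of(String text) {
        return new ST(List.of(text), List.of());
    }
}

final class TurtleGraphics {
    // Today's practice: string constants that help us spell commands correctly.
    static final String LEFT = "left";
    static final String FORWARD = "forward";

    // What they might morph into: no-arg templates, built once via the factory.
    static final ST LEFT_ST = ST.of(LEFT);
    static final ST FORWARD_ST = ST.of(FORWARD);

    // Hypothetical processor that accepts only templates, never raw strings.
    // It simply interleaves fragments and values back into command text.
    static String run(ST command) {
        StringBuilder out = new StringBuilder(command.fragments().get(0));
        for (int i = 0; i < command.values().size(); i++) {
            out.append(command.values().get(i));
            out.append(command.fragments().get(i + 1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(LEFT_ST));
        System.out.println(run(new ST(List.of("forward ", ""), List.of(10))));
    }
}
```

The constants cost one factory call each, which is the “canned constant” option mentioned above; whether that is tolerable for a language with dozens of fixed commands is exactly the open question.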

————————

OVERLOADS

Another possible answer, in the use case with SQL above, is that if the language processor expects lots of ad hoc non-factorable (or non-factored) strings, it should cater to that expectation by taking String as an overload option.  That places pressure on the API designer to perform the conversion (ST.of) on the fly.  And overloads can expand non-linearly when there are several arguments in play.  And there are sometimes ambiguity risks in some corner cases, as Maurizio has shown.  Still, sometimes it’s a good tradeoff to add an overload, if the problems are in truly minor corner cases.  Or maybe allowing strings instead of STs gives up some optimizations?  But often it’s better to let the chips fall with API design and do the work to optimize whichever API turns out to be most user-friendly.  I don’t see (maybe I missed it) a decisive objection to overloading across ST and String, at least for some processing APIs.
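A rough sketch of the overload option, under stated assumptions: ST is again a hypothetical stand-in record (not the JEP 459 API), and Sql, Sql.Statement and the “?” placeholder rendering are invented here just to have something to process.

```java
import java.util.List;

// Hypothetical stand-in for a string template: N+1 literal fragments around N values.
record ST(List<String> fragments, List<Object> values) {
    ST {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        // Defensive copies; note that List.copyOf rejects null elements in this sketch.
        fragments = List.copyOf(fragments);
        values = List.copyOf(values);
    }

    // Wraps a plain string as a template with no embedded values.
    static ST of(String text) {
        return new ST(List.of(text), List.of());
    }
}

// Hypothetical processor: turns each embedded value into a "?" bind parameter.
final class Sql {
    record Statement(String text, List<Object> params) {}

    // Primary entry point, taking a template.
    static Statement process(ST template) {
        String text = String.join("?", template.fragments());
        return new Statement(text, template.values());
    }

    // Convenience overload: the API performs the ST.of conversion on the fly.
    static Statement process(String text) {
        return process(ST.of(text));
    }
}

final class OverloadDemo {
    public static void main(String[] args) {
        // No-arg command goes through the String overload.
        System.out.println(Sql.process("CREATE TABLE foo;"));
        // Command with arguments goes through the ST overload.
        ST insert = new ST(
                List.of("INSERT INTO foo (name, title) VALUES (", ", ", ");"),
                List.of("Guy", "Hacker"));
        System.out.println(Sql.process(insert));
    }
}
```

One corner case is visible even at this size: a call such as Sql.process(null) no longer compiles, because both overloads apply and neither is more specific.  And a processor with several template-typed parameters would need an overload per String/ST combination, which is the non-linear expansion mentioned above.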

————————

ALGEBRA

These examples also lead me to a different source of questions, which is whether or how the existing practice of string constant expressions (like static final FOO above) can or should connect to STs as well.  It’s an interesting line of thought, so I’ll write something here, but (bottom line) I don’t think we want to act on it, at least at first.

String constants have a privileged role in the JLS, and also in programmer practice (as with FOO above).  Can/should STs leverage this somehow?  Should a “constant ST expression” be an alternative to a ST literal?  I’m thinking of a String or ST constant like MY_FORTH_PROLOGUE which I stick at the front of some ST that I’m building.

But that would seem to require some way to concatenate such a string to an ST, an expression like ST ‘+’ String -> ST, which seems disturbing to me, but might actually make sense.  Or would nesting be better, something like `(define tp (foo ,@sub-tp bar))?  A variation of \{x} like \@{subtp}?  (And would there be javac constant folding rules for it, as well as dynamic rules for evaluation?)  This is speculative brainstorming; I’m not seriously recommending it for now.

Still, continuing… If MY_FORTH_PROLOGUE should be a static final ST, then I want options for prepending it locally to ad hoc strings.  So the question about “what about constants” turns into a larger question, “what about ST algebra on ST expressions?”  If you allow constants to be defined non-locally, you need a way to combine them with “more stuff” locally.  This relates to the issue raised earlier of whether nested STs should be part of the ST API:  Whether you concatenate two STs or nest one inside another, it seems you are doing some kind of generic ST algebra, generic across all uses of ST, not just for some processors.

And, circling back, if there were a way to fold ST literals together (with some non-local parts) then that would lead to another alternative to a “sigil” to disambiguate a plain string from a no-arg ST.  You’d use the concatenation operation (whatever that is) to combine an empty ST into the string that needs markup.  Kind of like when we say “”+x to abbreviate String.valueOf(x).

Given nesting or concatenation syntax, no-arg ST literals could be disambiguated by a prefix like ST.EMPTY+”…” or like “\@{}…”, which either prepends or nests a degenerate ST.  That could serve a role like $”…” in your examples, Guy, although of course a single-char sigil looks nicer.
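For what it is worth, the concatenation flavor of this algebra can be sketched as a library-level operation, with no new syntax.  This is speculative like the rest of this section: ST is a hypothetical stand-in record (not the JEP 459 API), and EMPTY, concat and append are names I am making up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a string template: N+1 literal fragments around N values.
record ST(List<String> fragments, List<Object> values) {
    // A degenerate template: one empty fragment, no values.
    static final ST EMPTY = new ST(List.of(""), List.of());

    // ST + ST -> ST.  The last fragment of this template is glued to the
    // first fragment of the other, so the fragments/values invariant holds.
    ST concat(ST other) {
        List<String> f = new ArrayList<>(fragments);
        int last = f.size() - 1;
        f.set(last, f.get(last) + other.fragments().get(0));
        f.addAll(other.fragments().subList(1, other.fragments().size()));
        List<Object> v = new ArrayList<>(values);
        v.addAll(other.values());
        return new ST(f, v);
    }

    // ST + String -> ST.  The string is treated as a no-arg template.
    ST append(String text) {
        return concat(new ST(List.of(text), List.of()));
    }
}

final class AlgebraDemo {
    // A non-local constant, in the spirit of MY_FORTH_PROLOGUE.
    static final ST MY_FORTH_PROLOGUE = ST.EMPTY.append(": square dup * ; ");

    public static void main(String[] args) {
        // Prepend the constant to a local template that carries one value.
        ST local = new ST(List.of("", " square ."), List.of(7));
        System.out.println(MY_FORTH_PROLOGUE.concat(local));
    }
}
```

Here ST.EMPTY.append(…) plays the role of ST.EMPTY+”…”: it is just the “”+x trick run in the other direction, turning a plain string into a no-arg ST.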

Maybe we want some more algebra like that someday, but I am not enthusiastic enough to recommend it now.  I guess the most I’d recommend is to somehow leave room for building up nested or concatenated literals, as a future addition.  Allowing STs to start like “\{}…” would solve today’s disambiguation problem with a kludge comparable to $”…”, and would also give a hint of more “algebra” in the future.

> SQL.process(”\{}CREATE TABLE foo;”);
> SQL.process(”\{}ALTER TABLE foo ADD name varchar(40);”);
> SQL.process(”\{}ALTER TABLE foo ADD title varchar(30);”);
> SQL.process(”\{}INSERT INTO foo (name, title) VALUES (‘Guy’, ‘Hacker’);”);
> SQL.process(”\{}INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});”);

HTH