String Tapas Redux: Beyond mere string interpolation

Fri Sep 17 06:43:29 UTC 2021

Yay!

I agree with Brian’s response to Remi:  Nothing new here
regarding eval or ASTs.

My favorite wished-for-use-case for templated strings is
a grammar where the “holes” are grammar actions or
other configuration points for rules.  This paper made
me envious for the sake of Java, and also made me think,
“when we get string templates we can try our own
shenanigans”:

http://www.inf.puc-rio.br/~roberto/lpeg/#grammar
http://www.inf.puc-rio.br/~roberto/docs/peg.pdf

> We can meet our diverse goals by separating mechanism from
> policy. How we introduce parameters into string expressions is
> mechanism; how we combine the parameters and the string into a final
> result (e.g., concatenation) is policy.

One might also say we separate wiring from function.  How we introduce
a segmented string with holes, and also (typed) expressions to fill
those holes, is the wiring.  (No function yet: We just observe all
those values waiting for us to do something with them.)  How we
arrange the static structure of those values verges on function, but
it’s really just setting up for the moment when we have all the values
and can run our function.  The function comes in when we have all the
values (crucially, the dynamic and typed hole-filling values).  At
that point it’s really “only” a method call, but that method contains
all the function (the “policy”).  The intermediate step where we
derive a recipient for the function call, from the static parts
(strings, and also hole types), is a factory method or constructor
call, one which creates the receiver object (which is constant for
that expression, just like a no-capture lambda).

It’s important to separate wiring from function in part because
(a) wiring can be fully understood while (b) function is inherently
undecidable.  So it’s good (for extensibility, universality) when
the wiring is really simple and clear, and the transitions into
the mysterious function-boxes are also really clear.

Also, if we focus on wiring that is as universal as possible,
we can do fancy stuff like grammars with functional actions.
Otherwise, it’s harder.

Also, making the function part “only a method call” means
that whatever resources the language has to make method calls
be a good notation apply to templated strings.

Also, since the “wiring part” includes the round-up of the
static parts of the template expression, it follows that we can
do lots of interesting “compile time” work in the indy or condy
instruction that implements the expression as a whole.

I am gently insisting that the types of the “holes” are part
of the template setup code because, after all, that’s what the
indy needs anyway, and it seems a shame to make them all
be erased to Object and Object[].

One reason “just use Object” is a missed opportunity:  You
get lots of boxing.  Another, deeper one:  Without FI-based
target typing that would be provided by a general Java method
call, you can’t put poly-expressions (like lambdas) into the holes.

For example, a PEG template might take a varargs argument of
type (PEGRuleAction… actions) where PEGRuleAction is a FI.
(Mixing FIs and other data is a challenge I will delay for now,
but it’s under the rubric of “Varargs 2.0”, allowing methods
to capture variable-length yet type-heterogeneous arguments.
Think also of Map<K,V>.of(…), which wants to take pairs
of arguments of alternating types.  But that’s for later.)
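
A minimal sketch of why the hole types matter for poly-expressions. An ordinary varargs method stands in for the template expression, since the template syntax does not exist yet; the names PegRuleAction and Peg.rule are hypothetical and only illustrate the target-typing point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical functional interface for a grammar action (not a real API).
@FunctionalInterface
interface PegRuleAction {
    Object apply(String matched);
}

final class Peg {
    // Stand-in for a template whose holes are all typed as PegRuleAction.
    // Because the parameter type is a functional interface, each argument
    // is target-typed, so a lambda or method reference is accepted.
    static List<Object> rule(String input, PegRuleAction... actions) {
        List<Object> results = new ArrayList<>();
        for (PegRuleAction action : actions) {
            results.add(action.apply(input));
        }
        return results;
    }

    // With holes erased to Object there is no target type:
    //   static Object erased(Object... holes) { ... }
    //   erased(s -> s.length());  // rejected: Object is not a functional interface
}
```

A call such as Peg.rule("42", Integer::parseInt, s -> s.length()) compiles only because each hole has a functional-interface type to infer against.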

I’m fully aware that I will be asking for stuff that is beyond
a 1.0-level feature set, but in discussing stuff like this I’m
hoping to stake out a path for growth to a wider set of use
cases that inevitably comes if the meaning of the expression
is “call this method on this compute-once receiver”, as opposed
to something more constrained (even if the 1.0 level is constrained).

Or, to go back to the “mechanism vs. policy” formulation,
the (very interesting) policy is embodied in the template
expression receiver object itself that is built on first use
of the expression, and it is also in the (very interesting)
method that is called on the object with the typed hole
values.  The mechanism consists of the rules by which
the expression receiver object (TemplatePolicy) is created,
and what values are passed to its constructor or factory
(just once, lazy), and then (each time the expression is
evaluated) how the holes are passed to the object, and
via which method.

The wiring I’m suggesting starts with a one-time setup
operation, which needs to be statically defined (no dynamic
argument dependencies).  I’m suggesting that the policy
provides a factory method which is run once (probably
via a condy or indy) per expression.

I will use really, really dumb names for the two interfaces
that characterize the processing at the two phases, the
one-time setup and the each-time evaluation of the template.

interface TemplatePolicyPhase1<S, P2 extends TemplatePolicyPhase2<S>> {
    P2 setup(List<String> parts, MethodType type);
    // type takes all the hole arguments and returns S
}

Note that TemplatePolicyPhase1 is a FI.  It is a factory
run at expression-link time, and it makes a phase 2
thingy which knows how to execute the expression:

interface TemplatePolicyPhase2<S> {
   List<String> parts();
   //?MethodType type();
   S execute(Object... args);
}

or, maybe better with MHs (with a workaround to invokeWithArguments for arity limits):

interface TemplatePolicyPhase2<S> {
   List<String> parts();
   MethodHandle executor();
   default MethodType type() { return executor().type().dropParameterTypes(0,1); }
   @SuppressWarnings("unchecked")
   default S execute(Object... args) throws Throwable {
      return (S) executor().bindTo(this).invokeWithArguments(args); }
}

The names are deliberately dumb here.

The latter one (with the MH) is preferable, because
(a) it can reflectively report its exact type and (b) the
javac compiler can use the MH to pass the arguments
under exactly correct types (not just Object, not varargs).
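
A sketch of the MH-based variant in today's Java, assuming a policy that handles exactly one shape, (int, Object). The names Phase2, ConcatPolicy and run are hypothetical; only the java.lang.invoke calls are real API. It shows the phase 2 object reporting its exact hole types and still offering a generic Object... entry point.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical stand-in for the MH-based phase 2 interface.
interface Phase2<S> {
    List<String> parts();
    MethodHandle executor();   // leading parameter is the receiver itself

    // Hole types only: drop the leading receiver parameter.
    default MethodType type() {
        return executor().type().dropParameterTypes(0, 1);
    }

    // Generic fallback: boxes arguments and adapts them to the exact type.
    @SuppressWarnings("unchecked")
    default S execute(Object... args) {
        try {
            return (S) executor().bindTo(this).invokeWithArguments(args);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}

// Hypothetical policy: plain concatenation for the shape (int, Object).
record ConcatPolicy(List<String> parts) implements Phase2<String> {
    // The "function": receives the holes under their real types.
    String run(int n, Object w) {
        return parts.get(0) + n + parts.get(1) + w + parts.get(2);
    }

    @Override
    public MethodHandle executor() {
        try {
            // Resulting handle type: (ConcatPolicy,int,Object)String
            return MethodHandles.lookup().findVirtual(
                ConcatPolicy.class, "run",
                MethodType.methodType(String.class, int.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A compiler that knows type() can emit an exact-typed call through executor(); reflective clients can fall back to execute(Object...).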

The instance of TemplatePolicyPhase1 is really a witness
to a particular policy, as well as a way to execute it.

Once you have a TemplatePolicyPhase1 in hand it’s
clear how the wiring works and how to set up the
indy or condy.

public class String {
    public record SimpleTemplatePolicy(List<String> parts, MethodType type)
        implements TemplatePolicyPhase2<String> {
        MethodHandle maker() { …do something here… }
    }
    // witness to the policy:
    public static final TemplatePolicyPhase1<String,SimpleTemplatePolicy>
      SETUP_TEMPLATE_POLICY = SimpleTemplatePolicy::new;
}

To bootstrap things, there could be a rule whereby,
if a type has a static method or field or nested class
of appropriate name and type, the construction method
(or other witness value) is wired to the position of the
TemplatePolicyPhase1.  That method can be called from
a condy or indy, the first time the expression is executed.

That means the inference finds one TemplatePolicyPhase1
witness per type.  Eventually we can enhance this further
if we want to define a special kind of “static constant”
expression that can provide (as if through a lazy static
variable) a computed constant of some sort.  Maybe
String or SQLQuery can have several named witnesses
available:  SQLQuery.NORMAL."template …".

I think there is a place somewhere in the language for
lazy statics, which could provide named access points
for such “witnesses”, as well as do many other jobs.

If the phase 2 protocol uses method handles, then we
desugar to something like this:

   // int n; Object w;
   // var s = String."hello, \{n} times to you, \{w}";
   lazy @Condy static final MethodType
     MT42 = methodType(String.class, /*n*/ int.class, /*w*/ Object.class);
   lazy @Condy static final List<String>
     SL42 = Utils.splitForST("hello, \0 times to you, \0");
   lazy @Condy static final String.SimpleTemplatePolicy
     TP43 = String.SETUP_TEMPLATE_POLICY.setup(SL42, MT42);
   lazy @Condy static final MethodHandle
     MH44 = TP43.maker();
   var s = (String) MH44.invokeExact(TP43, n, w);

In this desugaring, perhaps in some cases javac can decide on
some M and call TP43.M(n,w) instead of materializing the MH44,
which might as well be (in most cases) just a wrapper around M.
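
The desugaring can be approximated in today's Java, as a rough sketch. A holder class stands in for the "lazy @Condy static final" constants, the parts list is written out by hand instead of calling a splitter, and the names DesugarSketch, Concat, Site42 and greet are all hypothetical. The SETUPS counter exists only to show that setup runs once.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class DesugarSketch {
    // Counts how often the one-time setup runs (for demonstration only).
    static final AtomicInteger SETUPS = new AtomicInteger();

    // Hypothetical phase 2 receiver for the shape (int, Object) -> String.
    record Concat(List<String> parts, MethodType type) {
        Concat { SETUPS.incrementAndGet(); }
        String run(int n, Object w) {
            return parts.get(0) + n + parts.get(1) + w + parts.get(2);
        }
    }

    // Holder class: initialized on first use, standing in for the
    // lazy constants of one template expression.
    private static final class Site42 {
        static final MethodType MT =
            MethodType.methodType(String.class, int.class, Object.class);
        static final List<String> SL = List.of("hello, ", " times to you, ", "");
        static final Concat TP = new Concat(SL, MT);
        static final MethodHandle MH = link();

        private static MethodHandle link() {
            try {
                // Handle type: (Concat,int,Object)String
                return MethodHandles.lookup().findVirtual(Concat.class, "run", MT);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }
    }

    // The each-time part: one exact-typed call, with n passed as an int.
    static String greet(int n, Object w) {
        try {
            return (String) Site42.MH.invokeExact(Site42.TP, n, w);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Calling greet repeatedly reuses the same receiver and method handle; only the hole values change from call to call.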

A universal “all wires exposed” policy-free policy might
look like this:

public record BasicTemplate
         (BasicTemplate.BasicPolicy policy,
          List<Object> arguments)  {

    public record BasicPolicy(List<String> parts, MethodType type)
        implements TemplatePolicyPhase2<BasicTemplate> {
        BasicTemplate makeTemplate(List<Object> arguments) {
            //assert(validateTypedArgs(type, arguments))
            return new BasicTemplate(this, arguments);
        }
        MethodHandle maker() {
             return (…MHs.lookup makeTemplate…).asType(type);
        }

       // later on, narrow to a more definitive phase1 policy
       public <S,P2 extends TemplatePolicyPhase2<S>>
       P2 setupAgain(TemplatePolicyPhase1<S,P2> newPolicy) {
           // reuse the captured static parts and hole types
           return newPolicy.setup(parts(), type());
       }
    }
    // witness to the policy:
    public static final TemplatePolicyPhase1<BasicTemplate,BasicPolicy>
      SETUP_TEMPLATE_POLICY = BasicPolicy::new;

   // later on, narrow to a more definitive policy and execute method:
   public <S,P2 extends TemplatePolicyPhase2<S>>
   S executeAgain(TemplatePolicyPhase1<S,P2> newPolicyP1) {
      return policy.setupAgain(newPolicyP1).execute(arguments.toArray());
   }
}

Then if we have this:

   BasicTemplate tem = BasicTemplate."hello, \{n} times to you, \{w}";

…it desugars to:

   lazy @Condy static final MethodType
     MT42 = methodType(BasicTemplate.class, /*n*/ int.class, /*w*/ Object.class);
   lazy @Condy static final List<String>
     SL42 = Utils.splitForST("hello, \0 times to you, \0");
   lazy @Condy static final BasicTemplate.BasicPolicy
     TP43 = BasicTemplate.SETUP_TEMPLATE_POLICY.setup(SL42, MT42);
   lazy @Condy static final MethodHandle
     MH44 = TP43.maker();
   BasicTemplate tem = (BasicTemplate) MH44.invokeExact(TP43, n, w);

Later on, if we want to execute the template as if it had originally
been defined as a simple string concatenation, we do:

   String s = tem.executeAgain(String.SimpleTemplatePolicy::new);

Or we might run it as an SQL query:

   SQLQuery q = tem.executeAgain(SQLQuery.TemplatePolicy::new);

The BasicTemplate is an existence proof that the wiring
and the function, the mechanism and the policy, are completely
separate, because you can “rewire” any and all functions/policies
by means of the method “executeAgain”.
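
The rewiring can be tried out in plain Java with a simplified sketch. It drops the MethodType and the method handles for brevity, and every name in it (Link, Exec, Captured, Concat, Query, QueryPolicy) is hypothetical. The same captured parts and arguments are run under two unrelated policies.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the two phases (hole types omitted for brevity).
interface Exec<S> {
    S execute(Object... args);
}

interface Link<S> {
    Exec<S> setup(List<String> parts);
}

// The "all wires exposed" value: static parts plus dynamic arguments.
record Captured(List<String> parts, List<Object> arguments) {
    // Rewire: link a different policy to the same parts, then run it.
    <S> S executeAgain(Link<S> newPolicy) {
        return newPolicy.setup(parts).execute(arguments.toArray());
    }
}

// Policy 1: plain concatenation.
record Concat(List<String> parts) implements Exec<String> {
    public String execute(Object... args) {
        StringBuilder sb = new StringBuilder(parts.get(0));
        for (int i = 0; i < args.length; i++) {
            sb.append(args[i]).append(parts.get(i + 1));
        }
        return sb.toString();
    }
}

// Policy 2: keep holes as bind parameters instead of splicing them in.
record Query(String sql, List<Object> binds) {}

record QueryPolicy(List<String> parts) implements Exec<Query> {
    public Query execute(Object... args) {
        return new Query(String.join("?", parts), Arrays.asList(args));
    }
}
```

Given a Captured value, executeAgain(Concat::new) yields a String and executeAgain(QueryPolicy::new) yields a Query, with no change to how the parts and arguments were gathered.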

(I think the names of the two interfaces should maybe be
something like “LinkTemplateExpression” and
“ExecuteTemplateExpression”.)

— John