Code (Pre-)Review for JEP 280: Indify String Concat

Fri Nov 27 01:03:10 UTC 2015

Thanks! The patch looks better, and having the code all in one place 
definitively helps. I think there is still something that you can do to 
improve the code - i.e. all strategies seem to do:

* collect arguments (on a JCAssignOp and on a JCBinary)
* build indy node

Currently you have switches on both collect() and xyzConcat methods; 
what I was trying to suggest is - instead of doing:

<common code>
switch (cond) {
    case A:
       <do strategy x>
    case B
       <do strategy y>
}

Can we organize the code using a class hierarchy - i.e.

abstract class StringConcatHelper {
     List<JCTree> collectAll(JCAssignOp) { ... } //implemented in terms 
of collect(JCTree, JCTree, as now)
     List<JCTree> collectAll(JCBinary) { ... } //implemented in terms of 
collect(JCTree, JCTree, as now)
     abstract void emit(List<JCTree> args, JCTree tree, Type type);

     private List<JCTree> collectAll(JCTree, JCTree) { //as current impl }
}

and then have a bunch of subclasses:

class InlineConcatHelper extends StringConcatHelper {
    //implements inline emit here
}

abstract class IndyConcatHelper extends StringConcatHelper {
   //common logic to all indy-based strategy goes here
}

class ConstantIndyConcatHelper extends IndyConcatHelper {
   //constant indy logic here
}

class PlainIndyConcatHelper extends IndyConcatHelper {
   //plain indy logic here
}

Then, all you have to do is to create the right helper given the command 
line options, and that object will have an API that can be used by Gen 
to perform string concact effectively, w/o switching and minimizing 
duplication.

What do you think?

Maurizio

On 27/11/15 00:20, Aleksey Shipilev wrote:
> Hi Maurizio,
>
> Thanks for the reviews!
>
> Updated webrevs:
>    http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/
>    http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/
>
>
> On 11/26/2015 02:40 PM, Maurizio Cimadamore wrote:
>> types.boxedClass(syms.voidType).type;
>>
>> Note that j.l.Void is always created (see call to
>> synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be)
>> corner cases where j.l.Void is not available on certain platforms (i.e.
>> mobile) so we can't rely on it being always there (as it will lead to
>> crashes). The pattern above is safer, as it relies on javac ability to
>> fake a box type if it's missing from the platform.
> Excellent, thanks! Didn't know that nuisance with j.l.Void availability.
> Removed in favor of the suggested pattern.
>
>
>> * The code for initializing options in Gen is what we would normally
>> model with an enum. I.e. you need something like:
>>
>> enum StringConcatMode {
>>     INLINE,
>>     INDY_PLAIN,
>>     INDY_CONSTANTS;
>> }
>>
>> And then put some logic inside the enum class to parse the option and to
>> return the right enum constant. The code will get better as you can
>> model two constants (allowIndyStringConcat and
>> indyStringConcatConstants) with a single enum value.
> Yes, indeed, done.
>
>
>> * From a design point of view, I see the logic for generating the string
>> concat indy calls as split into two components - one that collects
>> arguments, iterates through the peels and the merges them together; this
>> part is independent from the particular indy strategy used. There's also
>> a second part - the one that generates the list of static/dynamic args
>> (given the flattened args to the string concat) and then generates the
>> final indy call. This latter part is variable, i.e. you get different
>> logic for different indy strategies - which means the code ends up with
>> many if/else blocks. A possible suggestioin is to make the code more
>> object oriented - keep the first part as is, but have an helper class to
>> do the job of mapping an argument into a static arg/recipe and then
>> generate the final indy call. In other wordas, what you need is a
>> builder pattern - you need an object that you can append args to and,
>> when finished, you can call a method to generate the resulting indy
>> call. There will be a builder for each indy strategy. This is just a
>> suggestion on how I would organize the code - it's not a blocking
>> comment - but I see the current strategy not scaling very well if e.g.
>> you start adding more strategies.
> Yes, I stepped further and created a separate helper class that does all
> the String concat. Moved all the logic, including the current
> StringBuilder-based one there, this lets to un-clutter Gen itself.
>
> Please see the updated webrevs. I would need to polish langtools changes
> a bit more, but tell me if the general direction is what you like.
>
>
>> * I need some confirmation here
>>
>> // Concat the String representation of the constant, except
>> + // for the case it contains special tags, which requires us
>> + // to expose it as detached constant.
>> + String a = arg.type.stringValue();
>> + if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) {
>> + recipe.append(TAG_CONST);
>> + staticArgs.add(a);
>> + } else {
>> + recipe.append(a);
>> + }
>>
>>
>> My understanding is that your strategy builds a 'recipe' which is
>> basically the result of the concat. Since some of the arguments are
>> dynamic, you need to put special markers in the recipe, to tell the BSM
>> to fetch the dynamic argument and stick it in there. Correct? This in
>> turn creates a problem, as it's possible for a concat _constant_
>> argument to contain the special values themselves - if that's the case,
>> you use static arguments essentially as an escaping mechanism, right?
> Yes, that is a correct understanding. The credit for the idea actually
> goes to John Rose. This scheme has a nice implication that *most*
> recipes would have only the recipe string + dynamic arguments, and no
> other static arguments. Which saves space in constant pool. Also, some
> compilers may find it easier to avoid building a complicated recipe, and
> rather pass the String constants untouched.
>
> There is a guidance in StringConcatFactory for that:
>
>   @apiNote Code generators have three distinct ways to process a constant
>   string operand S in a string concatenation expression.  First, S can be
>   materialized as a reference (using ldc) and passed as an ordinary
>   argument (recipe '\1'). Or, S can be stored in the constant pool and
>   passed as a constant (recipe '\2') . Finally, if S contains neither of
>   the recipe tag characters ('\1', '\2') then S can be interpolated into
>   the recipe itself, causing its characters to be inserted into the
>   result.
>
>
>
>> * I would expect more of a combinatorial test where you check i.e.
>> string concat of different sizes and args - i.e. the dimension of the
>> combinatorial spaces are:
>>
>> - size - how many args to the string concat? This as non obvious
>> consequences in terms of peeling
>> - arg types: constants, dynamic, special constants (i.e. TAG_CONST, TAG_ARG)
>> - strategy used: vanilla, indy_constats, indy_plain
>> - codegen context: assignop vs. binary
>> - target option (-target 8 vs -target 9)
> In fact, we have these tests in jdk/test/java/lang/String concat, see
> the jdk webrev. The rationale for keeping these tests in jdk: you would
> want to test the JDK concat code along with those combinatorial tests,
> and verify the *actual* concatenation result is sane. Simply generating
> the bytecode and trying to verify it seems to be a half-way approach to
> the problem (also amenable to bugs in your own bytecode verifier!)
>
>
> Thanks,
> -Aleksey
>