PROPOSAL: Simplified StringBuffer/StringBuilder syntax

Derek Foster vapor1 at teleport.com
Thu Apr 2 23:33:14 PDT 2009


Replies inline.

-----Original Message-----
>From: Reinier Zwitserloot <reinier at zwitserloot.com>
>Sent: Mar 31, 2009 5:59 AM
>To: Mark Thornton <mthornton at optrak.co.uk>
>Cc: coin-dev at openjdk.java.net
>Subject: Re: PROPOSAL: Simplified StringBuffer/StringBuilder syntax
>
>+ being overloaded to also mean string concatenation was a mistake in  
>java 1.0*. Let's not enshrine it by making more of them.

Actually, I agree with this. I think they should have come up with a new "concatenation" operator instead of trying to reuse an existing one. Still, + is what we have, and that's not going to change. What makes it obnoxious is that it isn't really "complete" -- it takes care of one trivial use case in concatenating strings, but it doesn't handle the larger issue of what to do when the concatenation doesn't all occur in the same expression, and it isn't consistent in its coverage of the family of related types (String, StringBuffer, and StringBuilder) that are really used in creating strings.

My intent wasn't to create new operators -- it was to fix odd holes and (fairly inconvenient) missing features in the behavior of the existing ones. I just want +, =, and += to behave consistently and intuitively when used with the classes that are used to generate Strings (StringBuffer, StringBuilder, and String). For anybody who knows that + means concatenation of strings, it's fairly obvious what:

StringBuilder foo = "abc";
foo += "def";

is supposed to do.
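
For comparison, here is a rough sketch of what that snippet has to look like in Java as it stands today. The class and method names are invented for illustration, and the proposed syntax appears only in a comment because it does not currently compile.

```java
public class ProposedSyntaxToday {
    // Proposed syntax (not valid Java today):
    //     StringBuilder foo = "abc";
    //     foo += "def";
    //
    // What has to be written today to get the same result:
    static String build() {
        StringBuilder foo = new StringBuilder("abc");
        foo.append("def");
        return foo.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints abcdef
    }
}
```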

>Also, if '+' will call .append on any appendables, I guarantee you,  
>the first thing some clown will create is this:
>
>public class BigInteger2 extends Number implements Appendable {
>    //I'm a BigInteger that supports +! Oh - and I'm mutable too :/
>}

Two observations:

1) If you are programming alongside clowns, it's probably time to find a new job.

2) This is probably the least of the damage that a clown can do when armed with a Java compiler. See observation #1.


>Specific problems with the entire concept:
>
>1. + so far is strictly a 'create new object with result' kind of  
>operation. "x + y" is an expression that does not change either the  
>value of x, or the value of y. It just creates a new value, and that  
>is the result of the expression. The same thing happens with strings,  
>but if you apply this to appendables, all of a sudden you get "x + y"  
>takes y and appends it to x. That is just strange.

That's not what the proposal says. It says "x = x + y" takes y and appends it to x, if x is a StringBuilder and y is a String. 

The append is only done so that this happens efficiently (by eliminating the need to create an unnecessary temporary String object). It's an optimization for an expression in a very common, specific pattern involving multiple operators which is known to be evaluated inefficiently if not treated as a special case. Compilers do this sort of large-scale optimization frequently for a variety of reasons. From the user's point of view, there is no difference in behavior.

"x + y" by itself does what it has always done.
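
Roughly, the difference between the two evaluations looks like this in today's Java. This is only a sketch of one reading of the proposal, not its specification: the class and the method names naive and optimized are made up for illustration, and the proposed "x = x + y" form itself cannot be written yet.

```java
public class SelfConcatenationSketch {
    // Literal evaluation of "x = x + y": build a temporary String,
    // then copy its characters into a fresh StringBuilder.
    static StringBuilder naive(StringBuilder x, String y) {
        String temp = x.toString() + y;   // the unnecessary temporary
        return new StringBuilder(temp);
    }

    // Assumed special-case translation: append in place, no temporary String.
    static StringBuilder optimized(StringBuilder x, String y) {
        return x.append(y);
    }

    public static void main(String[] args) {
        System.out.println(naive(new StringBuilder("abc"), "def"));     // prints abcdef
        System.out.println(optimized(new StringBuilder("abc"), "def")); // prints abcdef
    }
}
```

Both methods leave x holding the same characters; the second just skips the intermediate String.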

>2. The whole point of not allowing operator overloading is to make  
>sure any given snippet of the form 'x + y' serves as an anchor of  
>sorts: You know nothing too weird is going on in those 3 characters.  
>If + can mean: Mathematical plus, -OR- string concatenation (utterly  
>unrelated), -OR- anything anybody may have cooked up by implementing  
>Appendable, then there's zero conservative anchoring left for the +  
>symbol. Ergo I assert that doing this is as bad as having full blown  
>operator overloading. Actually, it's worse - at least full blown  
>operator overloading has proper names for things ('+' is plus and not  
>append), and allows one to write proper libraries for it.

I personally am not particularly against operator overloading as a general principle. I've used C++ for many years and never really had a problem with someone abusing it. I've never really understood certain parts of the Java community's shock and horror at the concept of having it in a language -- lots of languages have it, and it doesn't really seem to be one of the major problems with any of them. (C++, for instance, has WAY bigger problems than that!) Whether it gets abused or not mostly depends on the culture and training of the average person who uses that particular language, not the features they have available to them. Also, libraries which aren't useful and reasonably intuitive don't tend to become widespread in their use. Operator overloading is only one of many possible ways that an API could be designed either well or poorly. As always, the market decides.

In any case, having limited use of operator overloading designed within the platform by the people who maintain it is presumably a lot less likely to lead to abuse than turning the public at large loose on it, so the general arguments against operator overloading ("Everybody and his dog will define operators to have unintuitive meanings at every opportunity!") don't really seem to apply in this case.

Also, having operator overloading based on system interfaces which have defined meanings (and which may well be designed in ways that make them awkward to use in cases where those meanings do not apply) seems far less prone to abuse than the C++ model of "any operator can be redefined by anybody at any time for any purpose, as long as you remember to include the appropriate header file."

With regard to "things people cook up using Appendable", presumably those will be things for which appending strings is a meaningful operation, right? Otherwise, they shouldn't be implementing the Appendable interface in the first place (and it would be awkward to do so, since "append(CharSequence)" isn't the only method on it, and the others are harder to implement). Since that's what the interface is really defined for, why is it a bad thing for them to be able to use the same syntax for appending strings to something with that mandate as it would be for appending them to other things, like Strings?

Note that I'm not really all that sure that I'll add the Appendable concept to the proposal, since I have other concerns about it (such as the fact that append(int) isn't allowed, so there would be a lot of calling of String.valueOf(?) on the arguments). I'll have to think a bit more on that issue.
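
To make that concern concrete, here is a small sketch against java.lang.Appendable as it exists today. The interface declares append(CharSequence), append(CharSequence, int, int) and append(char), but no append(int), so a number has to be converted by hand. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class AppendableValueOf {
    static String describe(Appendable out, int count) {
        try {
            // No append(int) on Appendable, so the int goes through String.valueOf.
            out.append("count=").append(String.valueOf(count));
        } catch (IOException e) {
            // Appendable's methods declare IOException; rethrow unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new StringBuilder(), 42)); // prints count=42
    }
}
```

A += on a general Appendable would presumably have to insert that String.valueOf call (and deal with the IOException) behind the scenes.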

>*) string concat and numeric plus are unrelated to each other. In  
>fact, numeric plus ought to be commutative, which string concat isn't  
>(commutative = swapping arguments doesn't change result). It weakens  
>the information that a raw + sign is capable of telling you. There are  
>pros and cons to having a strict interpretation of a + sign, but given  
>that java does not allow operator overloading, the onus appears to be  
>on those in favour of weakening it to prove why this is acceptable.

Operator + is already defined within the language to mean concatenation as well as addition. You may argue that this shouldn't be the case, and I may even agree with you, but the fact remains that it is so, and neither of us can change that fact at this point. 

Given that we do, in fact, have operator + meaning concatenation for strings, I think it is logical for us to make sure that the full set of operations with related semantics all work together in a fashion that makes sense. Otherwise, we have a language that's not just inconsistent in the one detail that + has multiple meanings -- we have a language that's arbitrary and capricious with regard to which apparently similar operations are allowed and which aren't.

That's harder for users to learn: "I'm supposed to use StringBuilder to build strings. Why can't I build them with nice syntax like I'm used to with expressions involving Strings? Why does making a small change (building a string in multiple statements versus a single statement) mean I have to use an entirely different syntax to get a similar level of efficiency?" It also leads to a lot of unnecessarily verbose or unnecessarily inefficient code being written.
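
The two styles being contrasted look roughly like this in today's Java. This is an illustrative sketch with invented class and method names: the first method keeps the familiar syntax but creates a new String on every +=, while the second is efficient but has to switch to explicit append calls.

```java
public class MultiStatementBuild {
    // Familiar syntax, but each += copies everything built so far into a new String.
    static String withString(String[] parts) {
        String s = "";
        for (String p : parts) {
            s += p;
        }
        return s;
    }

    // Efficient, but uses an entirely different syntax for the same job.
    static String withBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(withString(parts));  // prints abc
        System.out.println(withBuilder(parts)); // prints abc
    }
}
```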

Derek

>  --Reinier Zwitserloot
>
>
>
>On Mar 28, 2009, at 21:14, Mark Thornton wrote:
>
>> Derek Foster wrote:
>>>
>>> CONCATENATION: An expression of the form
>>>
>>>    SB += S
>>>
>>> where SB is an RValue expression of type StringBuilder, and S is an  
>>> expression of type String, shall be considered to have meaning as  
>>> defined below. (Previously, this was a syntax error)
>>>
>>> SELF-CONCATENATION:
>>>
>>> An expression of the form
>>>
>>>    SB = SB + S
>>>
>> Why not allow any Appendable in these cases?
>>
>> Mark Thornton
>>
>
>



