StringJoiner: detect or fix delimiter collision?
Philip Hodges
philip.hodges at bluewin.ch
Sat Aug 31 22:57:31 UTC 2013
I don't think my needs are in any way particular or unusual. This is the third attempt at a StringJoiner that has not stopped me having to roll my own. Arrays.toString and AbstractCollection.toString already meet logging needs and serve as simple useful examples for where people don't care about delimiter collision.
Your strings/stream/map/escape/collect/joining idiom is impressive and might well be adequate and flexible. It is also quite beyond incomprehensible to many. Even I don't immediately see what the stream and collect methods contribute to the processing, I will look it up later.
A lot of those will simply not think about delimiter collision until it bites them in production, because your joiner does not by default insist on either a waiver or an escaper or a checker for separators in the collection where you wrote map(escape). Maybe they will join some numbers first, then copy that code to join some identifiers, and then copy that code to join some names, and it will "work fine". Until one of the names contains an apostrophe or comma. Or they will quickly join a few lines for a web page, separated by a <br> tag, and forget to convert the lines to HTML, or verify that each line is a valid and safe HTML subset fragment.
This feature as it stands adds precious little convenience value, while making it easier for people to forget to provide any escaping at all. You are missing a golden opportunity to ensure that the code only compiles if it is complete with a waiver, detector or escaper. I've seen the consequences of escaping being left out all too often.
My own rolled code consists of a looper, an appender, and a replacer. I found it best if the appender method that chooses and injects the separator also inserts the quotes and chooses and calls the escaper (replacer). The code is clear apart from the replacer implementation, which has to use the charAt idiom instead of foreach to iterate the string without the defensive copy, and has inline code to escape the escape character itself as well as the quote character. If I was doing it over again, I would try to estimate and pass an initial capacity for the StringBuilder, and see if I really have to break the append chain in order to pass that same StringBuilder to the replacer instead of letting it create its own. I would also change the "TODO use guava joiner" comment to just say *see* guava joiner and see StringJoiner since java 1.8 :( It's more than persuasive enough to drive me to keep trying to pass on my doubts in this forum, but I would rather spare you the javadocs and implementation.
Phil
To quote Joshua Bloch on good API design: 'When in doubt leave it out' (or 'You can always add, but you can never remove').
On 31 Aug 2013, at 23:13, Mike Duigou <mike.duigou at oracle.com> wrote:
On Aug 30 2013, at 23:40 , Philip Hodges wrote:
> ...
>
> Why is there such a bandwagon rolling for this "convenience" feature?
Perhaps others just don't agree with you. The choice of functionality offered in the JDK 8 StringJoiner was not arbitrary or haphazardly chosen. If it doesn't meet your particular needs, I am sorry to hear that. Our belief as the proposers is that it offers a reasonable mix of functionality and simplicity to cover useful number of requirements. Our intent was certainly not to create the kitchen magician [1] of string joiners that attempted to incorporate every conceivable option. In particular your concerns about escaping are out of scope for the joiner in part because
Collection<String> strings;
String concat = strings.stream().map(MyUtils::escape).collect(Collections.joining(","));
seems like an adequate and flexible solution to most people.
> We already have joiners in apache commons and guava.
This is the reason that StringJoiner was considered for JDK--that many others were "rolling their own".
> At first I thought they were cool. Then I tried to use them in anger.
> And was forced to roll my own.
I would encourage you to share your implementation or the javadocs as grist for discussion. An actual alternative is the *only* thing that is likely to be persuasive.
> That can't be right.
Sometimes it is. The JDK doesn't make any attempt to satisfy everyone (or anyone). We all end up writing lots of non-business logic utility functions to bridge gaps in what JDK offers. This is normal. And, if it turns out to be something lots of people need then it's entirely possible that it could be added to the JDK.
Mike
[1] https://www.youtube.com/watch?v=cGVG9xLHb84
>
> A more elaborate offically blessed feature that
> only does half the job is worse than useless.
> Without the extra complex ability to detect or avoid collisions
> it is neither "nice", nor a "Good Thing".
>
> Phil
>
> http://www.techempower.com/blog/2013/03/26/everything-about-java-8/
>
> "StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written or have found utilities for joining strings, but it is nice for the JDK to finally provide this itself. Everyone has encountered situations where joining strings is required, and it is a Good Thing™ that we can now express that through a standard API that every Java developer (eventually) will know. "
>
>
> On 2013-07-23 22:09, Mike Duigou wrote:
>>
>> On Jul 23 2013, at 12:43 , ph wrote:
>>
>>> didn't see delimiter collision mentioned anywhere - are you really offering
>>> an interface that only joins a list of entries without escaping or quoting
>>> or even checking for separators (or escape or quote sequences) that might
>>> occur in the entries?
>>
>> Correct. StringJoiner makes no effort to address escaping/quoting.
>> Doing so would add a lot of complexity that would not
>> useful interesting to most users and probably still wouldn't satisfy everyone.
>
>> If you wish some form of escaping or quoting you can pre-processed entries however you like before joining them.
>>
>> Mike
>>
More information about the core-libs-dev
mailing list