String reboot - (1a) incidental whitespace

Alex Buckley alex.buckley at oracle.com
Mon Apr 22 19:04:13 UTC 2019


Nope, I don't think multi-line string literals are an attractive 
nuisance in any way. We should NOT deem it incorrect to refactor a 
sequence of concatenations into a single multi-line string literal. 
Developers are chomping at the bit to do it, and if we cast doubt on the 
ability then we're wasting everyone's time. We should deem it correct, 
and 99% of the time no-one will care that newline characters exist in 
the string. The rare library that subtly misbehaves or (and this is the 
better option) actually blows up when seeing newlines will feel great 
pressure to become more liberal in what it accepts, and that is a good 
thing.

Alex

On 4/19/2019 7:42 PM, Guy Steele wrote:
> So is your point that multiline string literals may be an “attractive nuisance” in that they may make it too convenient for inattentive programmers to perform _incorrect_ refactoring?
>
>
>> On Apr 19, 2019, at 8:16 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>>
>>> On 4/10/2019 8:22 AM, Jim Laskey wrote:
>>> Line terminators:  When strings span lines, they do so using the line
>>> terminators present in the source file, which may vary depending on what
>>> operating system the file was authored on.  Should this be an aspect of
>>> multi-line-ness, or should we normalize these to a standard line
>>> terminator?  It seems a little weird to treat string literals quite so
>>> literally; the choice of line terminator is surely an incidental one.  I
>>> think we're all comfortable saying "these should be normalized", but it's
>>> worth bringing this up because it is merely one way in which incidental
>>> artifacts of how the string is embedded in the source program force us
>>> to interpret what the user meant.
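
For illustration only, the normalization discussed above might be sketched as follows. The helper name `normalizeLineTerminators` and the choice of \n as the standard terminator are assumptions made for this sketch, not something the thread has settled; the point is just that the same literal authored with different line endings should denote the same string.

```java
public class LineTerminators {
    // Hypothetical helper: map CRLF and lone CR to LF so that the
    // resulting string does not depend on how the source file was saved.
    static String normalizeLineTerminators(String s) {
        // Replace CRLF first so it becomes one LF rather than two.
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // The same two-line content, as it would appear in files
        // authored with Windows and Unix line endings respectively.
        String windows = "SELECT *\r\nFROM my_table";
        String unix = "SELECT *\nFROM my_table";

        System.out.println(windows.equals(unix));
        System.out.println(normalizeLineTerminators(windows).equals(unix));
    }
}
```

Without normalization the two strings differ only because of the authoring platform, which is the incidental artifact in question.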
>>
>> No-one has commented on this, but it's important because some libraries are going to be surprised by the presence of line terminators, of any kind, in strings denoted by multi-line string literals.
>>
>> To be clear, I agree with normalizing line terminators. And, I understand that any string could have contained line terminators thanks to escape sequences in traditional string literals. But, it was not common to see a \n except where multi-line-ness was expected or harmless. Going forward, who can guarantee that refactoring the argument of `prepareStatement` from a sequence of concatenations:
>>
>>   try (PreparedStatement s = connection.prepareStatement(
>>       "SELECT * "
>>     + "FROM my_table "
>>     + "WHERE a = b "
>>   )) {
>>       ...
>>   }
>>
>> to a multi-line string literal:
>>
>>   try (PreparedStatement s = connection.prepareStatement(
>>       """SELECT *
>>          FROM my_table
>>          WHERE a = b"""
>>   )) {
>>       ...
>>   }
>>
>> is behaviorally compatible for `prepareStatement`? It had no reason to expect \n in its string argument before.
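
To make the difference concrete, here is a minimal sketch that compares the two argument strings without touching a database. The multi-line form is written with explicit \n escapes, assuming incidental indentation is stripped and line terminators are normalized to \n. The helper `collapseWhitespace` is hypothetical: it shows one way a library could be liberal about newlines, and it is not what any particular JDBC driver does.

```java
public class SqlNewlines {
    // Hypothetical helper: trim the string and collapse every run of
    // whitespace (spaces, tabs, line terminators) to a single space.
    // Caveat: this would also alter whitespace inside quoted SQL literals.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // The string built by the sequence of concatenations.
        String concatenated = "SELECT * " + "FROM my_table " + "WHERE a = b ";

        // The string the multi-line literal is assumed to denote.
        String multiLine = "SELECT *\nFROM my_table\nWHERE a = b";

        // The two strings are not equal: one separates clauses with
        // spaces, the other with newline characters.
        System.out.println(concatenated.equals(multiLine));

        // After collapsing whitespace they compare equal.
        System.out.println(
            collapseWhitespace(concatenated).equals(collapseWhitespace(multiLine)));
    }
}
```

The first comparison shows why the refactoring is not character-for-character neutral; the second shows that the difference is confined to whitespace.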
>>
>> (Hat tip: https://blog.jooq.org/2015/12/29/please-java-do-finally-support-multiline-strings/)
>>
>> Maybe `prepareStatement` will work fine. But someone somewhere is going to take a program with a sequence of 2000 concatenations and turn them into a huge multi-line string literal, and the inserted line terminators are going to cause memory pressure, and GC is going to take a little longer, and eventually this bug will be filed: "My system runs 5% slower because the source code changed a teeny tiny bit."
>>
>> In reality, a few libraries will need fixing, and that will happen quickly because developers are very keen to use multi-line string literals. But it's fair to point out that while everyone is worrying about whitespace on the left of the literal, the line terminators to the right are a novel artifact too.
>>
>> Alex
>

