RFR(2): 7174936: several String methods claim to always create new String

Stuart Marks stuart.marks at oracle.com
Thu Nov 14 23:56:14 UTC 2013



On 11/14/13 2:04 AM, David Holmes wrote:
> Sorry for the delay (been enroute). Only issue I have remains the subSequence
> change - you can't weaken the post-condition of CharSequence.subSequence without
> breaking subtype substitutability.

Hi David,

Yes, in general, what you say about weakening post-conditions is true. In this 
case, though, I can't see how it can cause any practical harm.

I think the CharSequence spec for subSequence() is written with the assumption 
that the CharSequence is mutable. If I'm not mistaken, all the CharSequence 
implementations except for String are mutable. Thus, in the statement

     Returns a new CharSequence that is a subsequence of this sequence.

the "new" really means that if the value of this sequence is later mutated, it 
will not affect the value of the returned sequence. Well, strictly speaking, 
it's ambiguous; the "new" could refer to either the sequences' values, the 
reference identity of this vs the returned sequence, or both. I'd say that for 
the mutable CharSequences, keeping the values independent is the important part, 
and having independent values pretty much implies that the returned sequence is 
a distinct object.

Since Strings are immutable, there is no concern about mutation of the value of 
the original String affecting the value of the returned String. That leaves 
reference identity. Obviously identity of mutable objects is really important. 
For immutable objects, identity is much less important, and so I'm trying to 
relax the spec to allow more discretion to the implementation about when to 
create new objects.

I guess our choices are as follows:

0) Current webrev: remove "new" from String's spec for subSequence(). Allows 
'this' to be returned in certain cases, but weakens the post-condition of its 
definining interface, which is usually a no-no.

1) Leave String.subSequence() as it stands, requiring a new CharSequence. This 
fixes the relationship with the CS interface, but the current implementation 
violates this spec, as it returns 'this' in some cases (where the subSequence 
represents the entire String).

2) Leave String.subSequence() as it stands, requiring a new CharSequence, and 
change the implementation so that it always creates a new String, even if the 
subSequence represents the entire original String.

3) Change the CharSequence.subSequence() spec to be more precise about what it 
requires (mutation of one doesn't affect the other), allowing 
String.subSequence() to return 'this' when it can.

Anything else?

I don't like (1) as it leaves our implementation in violation of a close reading 
of the spec. I don't like (2) as it de-optimizes the current implementation. I 
could live with (3) but it's more work, as new wording will have to be 
generated, and its impact on the other CS implementations will need to be 
assessed. It would probably result in the "best" (as in, most precise and 
self-consistent) specification, but I have to question whether it's worth the 
effort.

I still prefer (0), though I acknowledge it does have this issue of weakening 
the interface's post-condition. I note that there are other places where an 
implementation doesn't strictly conform to its interface, such as 
IdentityHashMap and Map. Another possibility, I guess, would be to do something 
like adding an @apiNote to String.subSequence() explaining why it's different 
from CS.subSequence().

s'marks









> Thanks,
> David
>
> On 12/11/2013 8:43 AM, Stuart Marks wrote:
>> Hi all,
>>
>> Here's an updated version of the String spec change. Changes from the
>> previous version address comments made by Brent Christian and David
>> Holmes. Specifically:
>>
>>   - Specify copyValueOf() overloads as "equivalent to" corresponding
>> valueOf() overloads.
>>   - Remove extranous changes to subSequence() method
>>   - Clarify wording of concat() method doc.
>>
>> Bug report:
>>
>>      https://bugs.openjdk.java.net/browse/JDK-7174936
>>
>> Webrev:
>>
>>      http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/
>>
>> Specdiff:
>>
>>
>> http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html
>>
>>
>> Thanks!
>>
>> s'marks
>>
>> On 11/7/13 2:31 AM, Stuart Marks wrote:
>>> Hi all,
>>>
>>> Please review the following spec changes to java.lang.String.
>>>
>>> In several places the specs mention returning "new" strings. This is
>>> overspecified; it's sufficient to return "a" string that satisfies the
>>> required
>>> properties. In some cases the current implementation doesn't create a
>>> new string
>>> -- for example, substring(0) returns 'this' -- which is strictly a
>>> violation of
>>> the spec. The solution is to relax the spec requirement to create new
>>> strings.
>>>
>>> Also, this change cleans up the specs for the copyValueOf() overloads
>>> to make
>>> them identical to the corresponding valueOf() overloads. Previously,
>>> they were
>>> gratuitously different. I think that some time in the dim, distant past,
>>> probably before JDK 1.0, they had different semantics, but now they're
>>> identical. The specs should say they're identical.
>>>
>>> This change is spec only, no code changes.
>>>
>>> Bug report:
>>>
>>>      https://bugs.openjdk.java.net/browse/jdk-7174936
>>>
>>> Webrev:
>>>
>>>      http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.0/
>>>
>>> Specdiff:
>>>
>>>
>>> http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.0/overview-summary.html
>>>
>>>
>>>
>>> Thanks!
>>>
>>> s'marks



More information about the core-libs-dev mailing list