String class api redesign
Red IO
redio.development at gmail.com
Fri Jun 2 18:00:54 UTC 2023
Thanks for the feedback.
And yes, the first approach you suggested is possible. I tried it to solve
the problem, and it worked... for my library. My library expected a
subinterface of CharSequence that mirrored String and provided many
convenience methods to create it from Strings, CharSequences, etc.
But the problem I'm trying to address is that library authors use String,
making String effectively the only "interface" for text.
I could create a library containing only the "stringLike" interface and
push for it to be adopted. But I don't think that would go anywhere, since
it would be yet another non-standard solution.
I think we have different assumptions about why library authors choose
String over CharSequence even though CharSequence is more convenient for
users.
My theory is that CharSequence doesn't provide enough functionality. Take a
parsing library, for example: it would need to implement things like
toLowerCase internally or convert the CharSequence to a String.
Homebrewing every method that is "missing" from CharSequence is cumbersome
and not worth it for a library.
It would be far better to shift that burden to those who develop their own
string implementation.
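
To make that trade-off concrete, here is a minimal sketch of the two options
a library has today. The class and method names (CharSequenceText,
toLowerCaseViaString, toLowerCaseHomebrew) are made up for illustration and
are not part of any existing library.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class CharSequenceText {
    private CharSequenceText() {
    }

    // Option A: copy into a String and reuse String's implementation.
    // Costs an extra copy for any CharSequence that is not already a String.
    static String toLowerCaseViaString(CharSequence csq) {
        return csq.toString().toLowerCase(Locale.ROOT);
    }

    // Option B: homebrew the method on top of length() and charAt().
    // This per-char version is a simplification: it ignores supplementary
    // code points and the special-casing rules String.toLowerCase handles.
    static CharSequence toLowerCaseHomebrew(CharSequence csq) {
        StringBuilder out = new StringBuilder(csq.length());
        for (int i = 0; i < csq.length(); i++) {
            out.append(Character.toLowerCase(csq.charAt(i)));
        }
        return out;
    }
}
```

Either way the library author pays: with a copy in option A, or with code
they have to write and maintain themselves in option B.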
I predict that a recommended interface to replace String would eventually
catch on, just like, for example, the switch from Vector to the List
interface (which made the implementation irrelevant in most cases).
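
As a reminder of what that switch looks like in practice, here is a small
sketch. The class ListMigrationExample and its method totalLength are
hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical example class, for illustration only.
final class ListMigrationExample {
    private ListMigrationExample() {
    }

    // Declared against the List interface, so callers may pass a Vector,
    // an ArrayList, or any other List implementation.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> legacy = new Vector<>(Arrays.asList("ab", "c"));
        List<String> modern = new ArrayList<>(Arrays.asList("ab", "c"));
        System.out.println(totalLength(legacy)); // same result for both
        System.out.println(totalLength(modern));
    }
}
```

A method taking a string-like interface instead of String would give text
implementations the same freedom.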
Also, implementing the String methods is not really too difficult. Someone
who needs a different string implementation for a special reason most
likely won't have a problem with implementing them.
Pluggable string implementations for special purposes, shipped as
libraries, would then be possible as well.
The issue with equals wouldn't be a huge problem. It's just an
implementation detail that needs to be discussed.
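
For context, this is how String behaves today when compared with a
CharSequence that is not a String. The sketch uses only existing JDK
methods; the wrapper class EqualsExample is a hypothetical name.

```java
// Hypothetical example class, for illustration only.
final class EqualsExample {
    private EqualsExample() {
    }

    // String.equals only returns true for another String with the same content.
    static boolean viaEquals(CharSequence other) {
        return "abc".equals(other);
    }

    // String.contentEquals compares the characters of any CharSequence.
    static boolean viaContentEquals(CharSequence other) {
        return "abc".contentEquals(other);
    }

    public static void main(String[] args) {
        CharSequence sb = new StringBuilder("abc");
        System.out.println(viaEquals(sb));        // false
        System.out.println(viaContentEquals(sb)); // true
    }
}
```

Any string-like interface would have to decide which of these two
behaviours equals should follow, and that is the part I think needs
discussion.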
I don't think creating and maintaining an interface that has the same
methods (and contracts) as String would be such a high burden on the JDK
team.
Questions such as how to handle new String methods could be solved in
various ways; again, that is something to discuss if this is seriously
considered.
The idea of changing String itself was, like I said, radical and most
likely impractical. I just wanted to put it on the table as an over-the-top
solution that came to mind.
Great regards
RedIODev
On Fri, Jun 2, 2023, 17:25 Ethan McCue <ethan at mccue.dev> wrote:
> Hi,
>
> First as to your problem statement - if a method could be implemented with
> a default method on CharSequence then there is nothing stopping library
> owners from either
>
> 1. Extending the CharSequence interface
>
> interface RichCharSequence extends CharSequence {
>     default byte[] getBytes() {
>         byte[] bytes = new byte[this.length()];
>         // ...
>         return bytes;
>     }
>
>     static RichCharSequence from(CharSequence csq) {
>         // ...
>     }
> }
>
> (see
> https://javadoc.io/doc/io.vavr/vavr/0.9.0/io/vavr/collection/CharSeq.html)
>
> 2. Using static methods defined in their projects or in dependencies
>
> class CharSequenceUtils {
>     private CharSequenceUtils() {
>     }
>
>     public static byte[] getBytes(CharSequence csq) {
>         // ...
>     }
> }
>
> So your problem comes down to the fact that there are libraries out there
> which take Strings as arguments that you wish took CharSequences. That
> problem is not mechanical, it is social. Your proposed solutions are ways
> to "force" those library owners to support your use case by making it so
> String means something different than it does or to add more
> functionality to CharSequence such that it might be more socially
> convenient to make the change.
>
> > One solution could be to add many of strings useful methods to
> CharSequence and implement them with default methods based on the existing
> abstract methods (or throw exception). These defaults could than be
> overridden by implementers that have more context to provide a more
> efficient implementation.
>
> I think it's reasonable to propose methods that you feel CharSequence is
> "missing," but "make it equal to String" is not. CharSequence does
> represent a slightly different concept and methods like toLowerCase might
> not make that much sense.
>
> > Another approach could be to add a new interface that mirrors strings
> methods and is implemented by string decoupling the method api of string
> from the fixed string class.
>
> Say there was a StringLike interface. I'm not convinced it solves your
> problem. Say the contract of that interface said "must be an immutable
> character sequence, randomly indexable, yada yada". To be actually
> interchangeable with String then you would have to be sure that every
> expression involving a string would work the same or in a similar way
>
> // How is String going to account for a StringLike argument to equals?
> // Tricky conceptually, but also likely a performance regression
> "abc".equals(stringLike)
>
> > An even more radical and most likely most difficult/breaking approach
> would be to make string itself an interface. This solution would have the
> upside that all existing libraries would suddenly accept the interface
> instead of the locked down class without migrating each library seperatly.
>
> It is far too late to do this. The bytecode for invoking an interface
> method is different than for invoking a class method. There is no way,
> absent significant and costly VM heroics, to make that a binary compatible
> change. All existing JVM code in the world would break.
>
> We also cannot make String non-final. Various parts of the JVM (and user
> code) rely on Strings being truly immutable. A String subclass would
> wreak havoc on everyone's invariants.
>
> > They all achieve the same goal of allowing libraries to ask for
> something string like without specifying the explicit implementation
> (similar to the collections framework) this would allow seamless use of
> other implementations in libraries without them having to reimplement or
> work around existing methods from the string class.
>
> The benefits of achieving this goal (by using the levers available only to
> the JDK) seem way out of line with the costs.
>
> On Fri, Jun 2, 2023 at 10:14 AM Red IO <redio.development at gmail.com>
> wrote:
>
>> Since String is a locked down Class part of the standard library it
>> cannot be modified nor subclassed(final).
>> Also string directly has a lot of useful methods on hand that the more
>> general CharSequence does not.
>> This and the fact that CharSequence was only added in later versions
>> (compared to string) causes many libraries to use the type string as
>> arguments.
>> This makes working with different implementations of text awkward and
>> inefficient (since you often have to convert to string)
>> There are multiple possible approaches to resolve this issue.
>>
>> One solution could be to add many of strings useful methods to
>> CharSequence and implement them with default methods based on the existing
>> abstract methods (or throw exception). These defaults could than be
>> overridden by implementers that have more context to provide a more
>> efficient implementation.
>>
>> Another approach could be to add a new interface that mirrors strings
>> methods and is implemented by string decoupling the method api of string
>> from the fixed string class.
>>
>> An even more radical and most likely most difficult/breaking approach
>> would be to make string itself an interface. This solution would have the
>> upside that all existing libraries would suddenly accept the interface
>> instead of the locked down class without migrating each library seperatly.
>> The downsides would be that previous to this change all constructors would
>> need to be made private and some code that either used the constructors or
>> was dependant on the class file structure of string might break. Also
>> string literals would need to be rerouted to create an instance of the then
>> internal implementation of string instead.
>>
>> There sure are other possibilities but those 3 where those I came up with.
>>
>> They all achieve the same goal of allowing libraries to ask for something
>> string like without specifying the explicit implementation (similar to the
>> collections framework) this would allow seamless use of other
>> implementations in libraries without them having to reimplement or work
>> around existing methods from the string class.
>>
>> Great regards
>> RedIODev
>>
>