String class api redesign
Ethan McCue
ethan at mccue.dev
Fri Jun 2 15:24:54 UTC 2023
Hi,
First as to your problem statement - if a method could be implemented with
a default method on CharSequence then there is nothing stopping library
owners from either
1. Extending the CharSequence interface
    interface RichCharSequence extends CharSequence {
        default byte[] getBytes() {
            byte[] bytes = new byte[this.length()];
            // ...
            return bytes;
        }

        static RichCharSequence from(CharSequence csq) {
            // ...
        }
    }
(see
https://javadoc.io/doc/io.vavr/vavr/0.9.0/io/vavr/collection/CharSeq.html)
2. Using static methods defined in their projects or in dependencies
    class CharSequenceUtils {
        private CharSequenceUtils() {
        }

        public static byte[] getBytes(CharSequence csq) {
            // ...
        }
    }
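To make option 2 concrete, here is a minimal sketch of what such a static helper could look like. The class name CharSequenceBytes and the explicit Charset parameter are my own assumptions for illustration and are not JDK API; the only JDK calls used are CharBuffer.wrap(CharSequence) and Charset.encode(CharBuffer).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical utility class; neither the name nor the method is JDK API.
final class CharSequenceBytes {
    private CharSequenceBytes() {
    }

    // Encodes any CharSequence without first calling toString() on it.
    static byte[] getBytes(CharSequence csq, Charset charset) {
        // CharBuffer.wrap(CharSequence) creates a read-only view, not a copy.
        ByteBuffer encoded = charset.encode(CharBuffer.wrap(csq));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        return bytes;
    }
}
```

A caller would write something like CharSequenceBytes.getBytes(someStringBuilder, StandardCharsets.UTF_8). Nothing here needs a change to the JDK.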
So your problem comes down to the fact that there are libraries out there
which take Strings as arguments that you wish took CharSequences. That
problem is not mechanical, it is social. Your proposed solutions are ways
to "force" those library owners to support your use case, either by making
String mean something different than it does today or by adding enough
functionality to CharSequence that making the change becomes more socially
convenient.
> One solution could be to add many of strings useful methods to
> CharSequence and implement them with default methods based on the existing
> abstract methods (or throw exception). These defaults could than be
> overridden by implementers that have more context to provide a more
> efficient implementation.
I think it's reasonable to propose methods that you feel CharSequence is
"missing," but "make it equal to String" is not. CharSequence does
represent a slightly different concept and methods like toLowerCase might
not make that much sense.
> Another approach could be to add a new interface that mirrors strings
> methods and is implemented by string decoupling the method api of string
> from the fixed string class.
Say there was a StringLike interface. I'm not convinced it solves your
problem. Say the contract of that interface said "must be an immutable
character sequence, randomly indexable, yada yada". To be actually
interchangeable with String, you would have to be sure that every
expression involving a String would work the same or in a similar way:
    // How is String going to account for a StringLike argument to equals?
    // Tricky conceptually, but also likely a performance regression
    "abc".equals(stringLike)
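For reference, this is how today's String already treats a non-String CharSequence with the same characters. The demo class and its method names are hypothetical; String.equals and String.contentEquals are the existing JDK methods.

```java
// Hypothetical demo class; only String and CharSequence are JDK types here.
final class StringEqualsDemo {
    private StringEqualsDemo() {
    }

    // String.equals is only true when the argument is itself a String.
    static boolean viaEquals(String s, CharSequence other) {
        return s.equals(other);
    }

    // String.contentEquals compares the characters against any CharSequence.
    static boolean viaContentEquals(String s, CharSequence other) {
        return s.contentEquals(other);
    }
}
```

With a StringBuilder holding "abc", viaEquals("abc", sb) is false while viaContentEquals("abc", sb) is true. A StringLike that wanted to be interchangeable with String would need equals to behave like the second method, in both directions, and hashCode to agree.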
> An even more radical and most likely most difficult/breaking approach
> would be to make string itself an interface. This solution would have the
> upside that all existing libraries would suddenly accept the interface
> instead of the locked down class without migrating each library seperatly.
It is far too late to do this. The bytecode for invoking an interface
method (invokeinterface) is different than for invoking a class method
(invokevirtual). There is no way, absent significant and costly VM heroics,
to make that a binary compatible change. Every already-compiled class that
calls a method on String would break.
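A small sketch to make the bytecode point visible. The class name is made up; the comments describe what javac emits for each call, which you can confirm by compiling and running javap -c on the class.

```java
// Hypothetical demo class, meant to be inspected with `javap -c InvokeKinds`.
final class InvokeKinds {
    private InvokeKinds() {
    }

    // Receiver type is a class: javac emits invokevirtual for String.length().
    static int viaClass(String s) {
        return s.length();
    }

    // Receiver type is an interface: javac emits invokeinterface for CharSequence.length().
    static int viaInterface(CharSequence s) {
        return s.length();
    }
}
```

Both methods return the same value at runtime, but the call sites are baked into the class file with different instructions. Already-compiled code contains the first form, and as far as I know method resolution for invokevirtual fails with IncompatibleClassChangeError when the named type turns out to be an interface.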
We also cannot make String non-final. Various parts of the JVM (and user
code) rely on Strings being truly immutable. A String subclass would wreak
havoc on everyone's invariants.
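As an illustration of the kind of invariant I mean, here is a sketch of a check-then-use problem. Everything in it (class name, method name, the ".." rule) is hypothetical; it uses a StringBuilder to stand in for what a mutable String subclass would allow.

```java
// Hypothetical validator; the class and method names are illustrative only.
final class CheckThenUse {
    private CheckThenUse() {
    }

    // Returns true when the name contains no ".." sequence.
    static boolean looksSafe(CharSequence name) {
        return !name.toString().contains("..");
    }

    public static void main(String[] args) {
        StringBuilder name = new StringBuilder("report.txt");
        boolean checked = looksSafe(name); // the check passes at this point
        name.insert(0, "../");             // same object, mutated after the check
        System.out.println(checked + " -> " + looksSafe(name));
    }
}
```

Code that takes a String can validate it once and trust the result forever. Code that takes a mutable character sequence cannot, and a non-final String would put every existing String parameter in that second category.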
> They all achieve the same goal of allowing libraries to ask for something
> string like without specifying the explicit implementation (similar to the
> collections framework) this would allow seamless use of other
> implementations in libraries without them having to reimplement or work
> around existing methods from the string class.
The benefits of achieving this goal (by using the levers available only to
the JDK) seem way out of line with the costs.
On Fri, Jun 2, 2023 at 10:14 AM Red IO <redio.development at gmail.com> wrote:
> Since String is a locked down Class part of the standard library it cannot
> be modified nor subclassed(final).
> Also string directly has a lot of useful methods on hand that the more
> general CharSequence does not.
> This and the fact that CharSequence was only added in later versions
> (compared to string) causes many libraries to use the type string as
> arguments.
> This makes working with different implementations of text awkward and
> inefficient (since you often have to convert to string)
> There are multiple possible approaches to resolve this issue.
>
> One solution could be to add many of strings useful methods to
> CharSequence and implement them with default methods based on the existing
> abstract methods (or throw exception). These defaults could than be
> overridden by implementers that have more context to provide a more
> efficient implementation.
>
> Another approach could be to add a new interface that mirrors strings
> methods and is implemented by string decoupling the method api of string
> from the fixed string class.
>
> An even more radical and most likely most difficult/breaking approach
> would be to make string itself an interface. This solution would have the
> upside that all existing libraries would suddenly accept the interface
> instead of the locked down class without migrating each library seperatly.
> The downsides would be that previous to this change all constructors would
> need to be made private and some code that either used the constructors or
> was dependant on the class file structure of string might break. Also
> string literals would need to be rerouted to create an instance of the then
> internal implementation of string instead.
>
> There sure are other possibilities but those 3 where those I came up with.
>
> They all achieve the same goal of allowing libraries to ask for something
> string like without specifying the explicit implementation (similar to the
> collections framework) this would allow seamless use of other
> implementations in libraries without them having to reimplement or work
> around existing methods from the string class.
>
> Great regards
> RedIODev
>