Request for Enhancement: java.io.Writer.of(Appendable) as an efficient alternative to java.io.StringWriter

Chen Liang liangchenblue at gmail.com
Tue Dec 31 05:42:15 UTC 2024


Hi Markus,
Thanks for your analysis that a Writer can be seen as a composition as an
Appendable, a Flushable, and a Closeable.
Given this view, I think we should add a Writer.of(Appenable, Flushable,
Closeable) to specify the 3 component behaviors of the returned writer.
Each of the 3 arguments can be null, so that component will be no-op
(Writer's Appendable methods only need to trivially return the Writer
itself; all other methods return void).
We will always require all 3 arguments to be passed; a null component means
the caller knowingly demands no-op behavior for that component.
I believe this approach would be safer, and avoids the accidental
delegation of unwanted features from a given input Appendable when it
happens to duck type.

Regards,
Chen Liang

On Sat, Dec 28, 2024 at 10:41 PM Markus KARG <markus at headcrashing.eu> wrote:

> Chen,
>
> thank you for your comments! My ideas to address them are:
>
> * flush(): If the Appendable implements Flushable, then perform
> Flushable.flush() on it. Otherwise, Writer.flush() will be a no-op (besides
> checking if Writer is open).
>
> * close(): If the Appendable implements Closeable, then perform
> Closeable.close() on it. Otherwise, Writer.close() will be a no-op (besides
> calling this.flush() if open, and internally marking itself as closed).
>
> * Writer.of(Writer): The original sense of the new API is to create a
> Writer wrapping non-Writers like StringBuilder, CharBuffer etc., but not to
> reduce a Writer to an Appendable (that would rather be
> Appendable.narrow(Writer) or so). IMHO there is neither any need nor
> benefit to return a limited Writer instead of the actual writer. So
> actually I would plea for directly returning the given writer itself, so
> Writer.of(Writer) is a no-op. I do not see why someone would intentionally
> pass in a Writer in the hope to get back a more limited, non-flushing /
> non-closing variant of it, and I have a bad feeling about returning a
> Writer which is deliberately cutting away the ability to flush and close
> without any technical need. Maybe you could elaborate on your idea if you
> have strong feelings about that use case?
>
> * StringWriter: Writer.of() is -by intention- not a "fire and forget"
> drop-in replacement, but a "real" Writer. It comes with a price, but in do
> not see a big problem here. If one is such happy with StringWriter that
> dealing with IOException would be a no-go, then simply keep the app as-is.
> But if one really wants the benefits provided by Writer.of(), then dealing
> with IOExcpetion should be worth it. This is a (IMHO very) low price the
> programmer has to pay for the benefit of gaining non-sync, non-copy
> behavior. In most code using StringWriter I have seen so far, IOException
> was dealt with anyways, as the code was mostly IO-bound already (it expects
> "some" Writer, not a StringWriter, as it wants to perform I/O, but the
> target is "by incident" a String).
>
> To sum up: IMHO still it sounds feasible and the benefits outweigh the
> costs. :-)
>
> -Markus
>
>
> Am 28.12.2024 um 01:51 schrieb Chen Liang:
>
> Hi Markus,
> I think the idea makes sense, but it comes with more difficulties than in
> the case of Reader.of. An Appendable is a higher abstraction modeling only
> the character writing aspects, without concerns with resource control (such
> as flush or close).
>
> One detail of note is that Writer itself implements Appendable, but I
> don't think the new method should return a Writer as-is; I think it should
> return another writer whose close will not close the underlying writer as
> we are only modelling the appendable behavior without exporting the
> resource control methods. Not sure about flush.
>
> One use case you have mentioned is StringWriter. StringWriter is distinct
> from StringReader: its write and append methods do not throw IOE while the
> base Writer does. So Writer.of cannot adequately replace StringWriter
> without use-site ugliness, until we have generic types that represent the
> bottom type.
>
> Regards,
>
> Chen Liang
>
> On Fri, Dec 20, 2024, 11:12 PM Markus KARG <markus at headcrashing.eu> wrote:
>
>> Dear Sirs,
>>
>> JDK 24 comes with Reader.of(CharSequence), now let's provide the
>> symmetrical counterpart Writer.of(Appendable) in JDK 25! :-)
>>
>> For performance reasons, hereby I like to propose the new public factory
>> method Writer.of(Appendable). This will provide the same benefits for
>> writing, that Reader.of(CharSequence) provides for reading since JDK 24
>> (see JDK-8341566). Before sharing a pull request, I'd kindly like to
>> request for comments.
>>
>> Since Java 1.1 we have the StringWriter class. Since Java 1.5 we have
>> the Appendable interface. StringBuilder, StringBuffer and CharBuffer are
>> first-class implementations of it in the JDK, and there might exist
>> third-party implementations of non-String text sinks. Until today,
>> however, we do not have a Writer for Appendables, but need to go costly
>> detours.
>>
>> Text sinks in Java are expected to implement the Writer interface.
>> Libraries and frameworks expect application code to provide Writers to
>> consume text produced by the library or framework, for example.
>> Application code often wants to modify the received text, e. g. embed
>> received SVG text into in a larger HTML text document, or simply forward
>> the text as-is to I/O, so StringBuilder or CharBuffer is what the
>> application code actually uses, but not Strings! In such cases, taking
>> the StringWriter.toString() detour is common but inefficient: It implies
>> duplicating the COMPLETE text for the sole sake of creating a temporary
>> String, while the subsequent processing will copy the data anyways or
>> just uses a small piece of it. This eats up time and memory uselessly,
>> and increases GC pressure. Also, StringWriter is synchronized (not
>> explicitly, but de-facto, as it uses StringBuffer), which implies
>> another needless slowdown. In many cases, the synchronization has no use
>> at all, as in real-world applications least Writers are actually
>> accessed concurrently. As a result, today the major benefit of
>> StringBuilder over StringBuffer (being non-synchronized) vanishes as
>> soon as a StringWriter is used to provide its content. This means,
>> "stringBuilder.append(stringWriter.toString())" imposes slower
>> performance than essentially needed, in two ways: toString(),
>> synchronized.
>>
>> In an attempt to improve performance of this rather typical use case, I
>> like to contribute a pull request providing the new public factory
>> method java.io.Writer.of(Appendable). This is symmetrical to the
>> solution we implemented in JDK-8341566 for the reversed case:
>> java.io.Reader.of(CharSequence).
>>
>> My idea is to mostly copy the existing code of StringWriter, but wrap a
>> caller-provided Appendable instead of an internally created
>> StringBuilder; this strips synchronization; then add optimized use for
>> the StringBuffer, StringBuilder and CharBuffer implementations (in the
>> sense of write(char[],start,end) to prevent a char-by-char loop in these
>> cases).
>>
>> Alternatives:
>>
>> - Applications could use Apache Commons IO's StringBuilderWriter, which
>> is limited to StringBuilder, so is not usable for the CharBuffer or
>> custom Appendable case. As it is an open-source third-party dependency,
>> some authors might not be allowed to use it, or may not want to carry
>> this additional burden just for the sake of this single performance
>> improvement. In addition, this library is not actively modernized; its
>> Java baseline still is Java 8. There is no commercial support.
>>
>> - Applications could write their own Writer implementation. Given the
>> assumption that this is a rather common use case, this imposes
>> unjustified additional work for the authors of thousands of
>> applications. It is hard to justify why there is a StringWriter but not
>> a Writer for other Appendables.
>>
>> - Instead of writing a new Writer factory method, we could slightly
>> modify StringWriter, so it uses StringBuilder (instead of StringBuffer).
>> This (still) results in unnecessary duplication of the full text at
>> toString() and (now also) at getBuffer(), and it will break existing
>> applications due the missing synchronization.
>>
>> - Instead of writing a new Writer factory method, we could write a new
>> AppendableWriter class. This piles up the amount of public classes,
>> which was the main reason in JDK-8341566 to go with the
>> "Reader.of(CharSequence)" factory method instead of the
>> "CharSequenceReader" class. Also it would be confusing to have
>> Reader.of(...) but not Writer.of(...) in the API.
>>
>> - We could go with a specific Appendable class (like StringBuilder)
>> instead of supporting all Appendable implementations. This would reduce
>> the number of applicable use cases daramatically (in particular as
>> CharBuffer is not supported any more) without providing any considerable
>> benefit (other than making the OpenJDK-internal source code a bit
>> shorter). In particular it makes it impossible to opt-in for the below
>> option:
>>
>> Option:
>>
>> - Once we have Writer.of(Appendable), we could replace the full
>> implementation of StringWriter by synchronized calls to the new Writer.
>> This would reduce duplicate code.
>>
>> Kindly requesting comments.
>>
>> -Markus Karg
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20241231/c01712b6/attachment-0001.htm>


More information about the core-libs-dev mailing list