Request for Enhancement: java.io.Writer.of(Appendable) as an efficient alternative to java.io.StringWriter
Markus KARG
markus at headcrashing.eu
Sun Jan 26 16:37:22 UTC 2025
As there have not been any more comments so far in the past weeks, I
assume there is common agreement with my current proposal.
If not, please chime in ASAP. If there are no further comments, I will
continue with the existing PR next.
-Markus
Am 31.12.2024 um 14:43 schrieb Markus KARG:
>
> Hi Chen,
>
> thank you for your ideas!
>
> Actually I cannot see what is "safer" in your proposal, but maybe I am
> missing to see a hidden risk in instanceof. Can you please outline the
> potential risk you actually see in "if (appendable implements
> Flushable f) f.flush();"?
>
> I mean, Flushable and Closable are simply *mix-ins* existing for
> exactly the purpose of "flushing-if-flusing-is-supported" and
> "closing-if-closing-is-supported", which is what we do need right
> here. Nobody wants to pass in a standalone "flusher" or standalone
> "closer" in addition to the actual object to flush and close, i. e.,
> the Appendable. In particular, nobody actually reported the need to
> build a Writer from three distrinct implementation objects (or I
> missed this need). Explicitly passing "null" feels rather unintuitive
> and IMHO is doubtful. Why should someone want to do that? Again,
> apparently you see that use case, so if you really have strong
> feelings, then please make me understand who needs that and for what
> actual purpose. :-)
>
> To be all on the same side, again, please always share the core idea
> that this API more or less solely is the combination of
> "Writer.of(StringBuilder)" with "Writer.of(StringBuffer)" and
> "Writer.of(CharSet)".
>
> Note that the sole target still is to pass in a StringBuilder,
> StringWriter, or CharBuffer, as wrapping *them* is *the driver* for
> the new API. While someone *can* do that, it is *not the target* of
> this API to pass in any Writer or any arbitrary Appendable. Therefore,
> we just need to be able *to deal with that case* once it happens --
> which is why it is IMHO absolutely fine to directly return Writers
> *non-wrapped*. The API so far just says, "passing a Writer in turn
> returns a Writer", but it does *not* propose to enhance or limit that
> Writer in any way, and that is why it is (IMHO absolutely) safe to
> check all other Appendables for *their* actual ability to get flushed
> or to get closed. Remember, the target of *this* API proposal is *not*
> to be able to write into any Flushable-and-Closable-Appendable
> *without* flushing or closing it. Having that said, *I do not veto*
> adding an *additional* method like Writer.of(Appendable, boolean
> preventFlush, preventClose) *later* **if needed**, but IMHO that
> should rather be *separate* wrappers like
> Writer.withoutFlushing(Writer) and Writer.withoutClosing(Writer)
> (either you have the need to not-flush/not-close, or you don't have
> it, so it is not a special case of *this* API), or something like
> that, which both are, again, *non-targets* of my current proposal. In
> fact I still do not see *any* benefit of passing in a Writer into
> Writer.of(), neither as a single reference, nor split up into three
> interfaces (and BTW, I did *not* say a Writer is a combination of
> Appendable, Flushable and Closable). Neither do I see *any benefit* of
> being able to pass in in three different implementation objects. But
> what I do see in your proposal actually is:
>
> * It would make up a can of worms due to the possibility of providing
> three different implementation objects for that three parameters.
> Someone could do Writer.of(new StringBuilder(),
> Files.newBufferedWriter(), new CharBuffer()) and the outcome would be
> rather dubious (and mostly useless but confusing).
>
> * As the sole target is to allow wrapping StringBuilder, StringWriter,
> and CharBuffer, and as we solely came to Flushable and Closable due to
> the question about "How to call flush and close ON THE PASSED
> REFERENCE, IFF the Appendable implements them?" it would be a real
> pain for alle users to be FORCED to repeat the same object three times.
>
> Having said that, my proposal is (as this is it what is IMHO mostly
> intuitive and most wanted):
>
> * Let's have solely Writer.of(Appendable) without any other parameters
> *in the first PR*; discuss the use case of more parameters *in
> subsequent PRs* IFF NEEDED as these should be *additional* method
> signatures to not torture the 90% standard case users with parameters
> they never need.
>
> * Let's return Writer non-wrapped, and clearly document that in the
> JavaDocs. Have separate discussions about
> Writer.withoutFlushing(Writer) and Writer.withourClosing(Writer) *in
> subsequent threads* IFF NEEDED.
>
> * Let's use "if (appendable instanceof Flushable f) f.flush()" and "if
> (appendable instanceof Closebale c) c.close()", and clearly document
> that in the JavaDocs. In case users do really want non-flushed,
> non-closed appendables wrapped as Writer, they do not lose something,
> but have to wait for the outcome of *subsequent* discussions about
> *additional* wrappers.
>
> I think that could be a clean, safe and straightforward way towards
> the replacement of StringWriter.
>
> Regards and a happy new year! :-)
>
> -Markus
>
>
> Am 31.12.2024 um 06:42 schrieb Chen Liang:
>> Hi Markus,
>> Thanks for your analysis that a Writer can be seen as a composition
>> as an Appendable, a Flushable, and a Closeable.
>> Given this view, I think we should add a Writer.of(Appenable,
>> Flushable, Closeable) to specify the 3 component behaviors of the
>> returned writer.
>> Each of the 3 arguments can be null, so that component will be no-op
>> (Writer's Appendable methods only need to trivially return the Writer
>> itself; all other methods return void).
>> We will always require all 3 arguments to be passed; a null component
>> means the caller knowingly demands no-op behavior for that component.
>> I believe this approach would be safer, and avoids the accidental
>> delegation of unwanted features from a given input Appendable when it
>> happens to duck type.
>>
>> Regards,
>> Chen Liang
>>
>> On Sat, Dec 28, 2024 at 10:41 PM Markus KARG <markus at headcrashing.eu>
>> wrote:
>>
>> Chen,
>>
>> thank you for your comments! My ideas to address them are:
>>
>> * flush(): If the Appendable implements Flushable, then perform
>> Flushable.flush() on it. Otherwise, Writer.flush() will be a
>> no-op (besides checking if Writer is open).
>>
>> * close(): If the Appendable implements Closeable, then perform
>> Closeable.close() on it. Otherwise, Writer.close() will be a
>> no-op (besides calling this.flush() if open, and internally
>> marking itself as closed).
>>
>> * Writer.of(Writer): The original sense of the new API is to
>> create a Writer wrapping non-Writers like StringBuilder,
>> CharBuffer etc., but not to reduce a Writer to an Appendable
>> (that would rather be Appendable.narrow(Writer) or so). IMHO
>> there is neither any need nor benefit to return a limited Writer
>> instead of the actual writer. So actually I would plea for
>> directly returning the given writer itself, so Writer.of(Writer)
>> is a no-op. I do not see why someone would intentionally pass in
>> a Writer in the hope to get back a more limited, non-flushing /
>> non-closing variant of it, and I have a bad feeling about
>> returning a Writer which is deliberately cutting away the ability
>> to flush and close without any technical need. Maybe you could
>> elaborate on your idea if you have strong feelings about that use
>> case?
>>
>> * StringWriter: Writer.of() is -by intention- not a "fire and
>> forget" drop-in replacement, but a "real" Writer. It comes with a
>> price, but in do not see a big problem here. If one is such happy
>> with StringWriter that dealing with IOException would be a no-go,
>> then simply keep the app as-is. But if one really wants the
>> benefits provided by Writer.of(), then dealing with IOExcpetion
>> should be worth it. This is a (IMHO very) low price the
>> programmer has to pay for the benefit of gaining non-sync,
>> non-copy behavior. In most code using StringWriter I have seen so
>> far, IOException was dealt with anyways, as the code was mostly
>> IO-bound already (it expects "some" Writer, not a StringWriter,
>> as it wants to perform I/O, but the target is "by incident" a
>> String).
>>
>> To sum up: IMHO still it sounds feasible and the benefits
>> outweigh the costs. :-)
>>
>> -Markus
>>
>>
>> Am 28.12.2024 um 01:51 schrieb Chen Liang:
>>>
>>> Hi Markus,
>>> I think the idea makes sense, but it comes with more
>>> difficulties than in the case of Reader.of. An Appendable is a
>>> higher abstraction modeling only the character writing aspects,
>>> without concerns with resource control (such as flush or close).
>>>
>>> One detail of note is that Writer itself implements Appendable,
>>> but I don't think the new method should return a Writer as-is; I
>>> think it should return another writer whose close will not close
>>> the underlying writer as we are only modelling the appendable
>>> behavior without exporting the resource control methods. Not
>>> sure about flush.
>>>
>>> One use case you have mentioned is StringWriter. StringWriter is
>>> distinct from StringReader: its write and append methods do not
>>> throw IOE while the base Writer does. So Writer.of cannot
>>> adequately replace StringWriter without use-site ugliness, until
>>> we have generic types that represent the bottom type.
>>>
>>> Regards,
>>>
>>> Chen Liang
>>>
>>>
>>> On Fri, Dec 20, 2024, 11:12 PM Markus KARG
>>> <markus at headcrashing.eu> wrote:
>>>
>>> Dear Sirs,
>>>
>>> JDK 24 comes with Reader.of(CharSequence), now let's provide
>>> the
>>> symmetrical counterpart Writer.of(Appendable) in JDK 25! :-)
>>>
>>> For performance reasons, hereby I like to propose the new
>>> public factory
>>> method Writer.of(Appendable). This will provide the same
>>> benefits for
>>> writing, that Reader.of(CharSequence) provides for reading
>>> since JDK 24
>>> (see JDK-8341566). Before sharing a pull request, I'd kindly
>>> like to
>>> request for comments.
>>>
>>> Since Java 1.1 we have the StringWriter class. Since Java
>>> 1.5 we have
>>> the Appendable interface. StringBuilder, StringBuffer and
>>> CharBuffer are
>>> first-class implementations of it in the JDK, and there
>>> might exist
>>> third-party implementations of non-String text sinks. Until
>>> today,
>>> however, we do not have a Writer for Appendables, but need
>>> to go costly
>>> detours.
>>>
>>> Text sinks in Java are expected to implement the Writer
>>> interface.
>>> Libraries and frameworks expect application code to provide
>>> Writers to
>>> consume text produced by the library or framework, for example.
>>> Application code often wants to modify the received text, e.
>>> g. embed
>>> received SVG text into in a larger HTML text document, or
>>> simply forward
>>> the text as-is to I/O, so StringBuilder or CharBuffer is
>>> what the
>>> application code actually uses, but not Strings! In such
>>> cases, taking
>>> the StringWriter.toString() detour is common but
>>> inefficient: It implies
>>> duplicating the COMPLETE text for the sole sake of creating
>>> a temporary
>>> String, while the subsequent processing will copy the data
>>> anyways or
>>> just uses a small piece of it. This eats up time and memory
>>> uselessly,
>>> and increases GC pressure. Also, StringWriter is
>>> synchronized (not
>>> explicitly, but de-facto, as it uses StringBuffer), which
>>> implies
>>> another needless slowdown. In many cases, the
>>> synchronization has no use
>>> at all, as in real-world applications least Writers are
>>> actually
>>> accessed concurrently. As a result, today the major benefit of
>>> StringBuilder over StringBuffer (being non-synchronized)
>>> vanishes as
>>> soon as a StringWriter is used to provide its content. This
>>> means,
>>> "stringBuilder.append(stringWriter.toString())" imposes slower
>>> performance than essentially needed, in two ways:
>>> toString(), synchronized.
>>>
>>> In an attempt to improve performance of this rather typical
>>> use case, I
>>> like to contribute a pull request providing the new public
>>> factory
>>> method java.io.Writer.of(Appendable). This is symmetrical to
>>> the
>>> solution we implemented in JDK-8341566 for the reversed case:
>>> java.io.Reader.of(CharSequence).
>>>
>>> My idea is to mostly copy the existing code of StringWriter,
>>> but wrap a
>>> caller-provided Appendable instead of an internally created
>>> StringBuilder; this strips synchronization; then add
>>> optimized use for
>>> the StringBuffer, StringBuilder and CharBuffer
>>> implementations (in the
>>> sense of write(char[],start,end) to prevent a char-by-char
>>> loop in these
>>> cases).
>>>
>>> Alternatives:
>>>
>>> - Applications could use Apache Commons IO's
>>> StringBuilderWriter, which
>>> is limited to StringBuilder, so is not usable for the
>>> CharBuffer or
>>> custom Appendable case. As it is an open-source third-party
>>> dependency,
>>> some authors might not be allowed to use it, or may not want
>>> to carry
>>> this additional burden just for the sake of this single
>>> performance
>>> improvement. In addition, this library is not actively
>>> modernized; its
>>> Java baseline still is Java 8. There is no commercial support.
>>>
>>> - Applications could write their own Writer implementation.
>>> Given the
>>> assumption that this is a rather common use case, this imposes
>>> unjustified additional work for the authors of thousands of
>>> applications. It is hard to justify why there is a
>>> StringWriter but not
>>> a Writer for other Appendables.
>>>
>>> - Instead of writing a new Writer factory method, we could
>>> slightly
>>> modify StringWriter, so it uses StringBuilder (instead of
>>> StringBuffer).
>>> This (still) results in unnecessary duplication of the full
>>> text at
>>> toString() and (now also) at getBuffer(), and it will break
>>> existing
>>> applications due the missing synchronization.
>>>
>>> - Instead of writing a new Writer factory method, we could
>>> write a new
>>> AppendableWriter class. This piles up the amount of public
>>> classes,
>>> which was the main reason in JDK-8341566 to go with the
>>> "Reader.of(CharSequence)" factory method instead of the
>>> "CharSequenceReader" class. Also it would be confusing to have
>>> Reader.of(...) but not Writer.of(...) in the API.
>>>
>>> - We could go with a specific Appendable class (like
>>> StringBuilder)
>>> instead of supporting all Appendable implementations. This
>>> would reduce
>>> the number of applicable use cases daramatically (in
>>> particular as
>>> CharBuffer is not supported any more) without providing any
>>> considerable
>>> benefit (other than making the OpenJDK-internal source code
>>> a bit
>>> shorter). In particular it makes it impossible to opt-in for
>>> the below
>>> option:
>>>
>>> Option:
>>>
>>> - Once we have Writer.of(Appendable), we could replace the full
>>> implementation of StringWriter by synchronized calls to the
>>> new Writer.
>>> This would reduce duplicate code.
>>>
>>> Kindly requesting comments.
>>>
>>> -Markus Karg
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20250126/03e28a80/attachment-0001.htm>
More information about the core-libs-dev
mailing list