Request for Enhancement: java.io.Writer.of(Appendable) as an efficient alternative to java.io.StringWriter
Markus KARG
markus at headcrashing.eu
Tue Dec 31 13:43:20 UTC 2024
Hi Chen,
thank you for your ideas!
Actually I cannot see what is "safer" in your proposal, but maybe I am
overlooking a hidden risk in instanceof. Can you please outline the
potential risk you actually see in "if (appendable instanceof Flushable
f) f.flush();"?
I mean, Flushable and Closeable are simply *mix-ins* existing for exactly
the purpose of "flushing-if-flushing-is-supported" and
"closing-if-closing-is-supported", which is what we need right here.
Nobody wants to pass in a standalone "flusher" or standalone "closer" in
addition to the actual object to flush and close, i. e., the Appendable.
In particular, nobody actually reported the need to build a Writer from
three distinct implementation objects (or I missed this need).
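A minimal sketch of the mix-in check described above, written as
standalone helpers. The class MixInCheck, its two helper methods, and the
CountingSink type are hypothetical names used only for illustration; they
are neither JDK API nor part of the proposal. Pattern matching for
instanceof assumes Java 16 or later.

```java
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;

// Hypothetical illustration only; none of these names exist in the JDK.
final class MixInCheck {

    // Flushes the target only if it also happens to implement Flushable.
    static void flushIfSupported(Appendable target) throws IOException {
        if (target instanceof Flushable f) f.flush();
    }

    // Closes the target only if it also happens to implement Closeable.
    static void closeIfSupported(Appendable target) throws IOException {
        if (target instanceof Closeable c) c.close();
    }

    // A made-up Appendable that counts flush() and close() calls.
    static final class CountingSink implements Appendable, Flushable, Closeable {
        final StringBuilder text = new StringBuilder();
        int flushes;
        int closes;

        @Override public Appendable append(CharSequence csq) { text.append(csq); return this; }
        @Override public Appendable append(CharSequence csq, int start, int end) {
            text.append(csq, start, end);
            return this;
        }
        @Override public Appendable append(char c) { text.append(c); return this; }
        @Override public void flush() { flushes++; }
        @Override public void close() { closes++; }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder plain = new StringBuilder("text");
        flushIfSupported(plain); // StringBuilder is not Flushable: nothing happens
        closeIfSupported(plain); // StringBuilder is not Closeable: nothing happens

        CountingSink sink = new CountingSink();
        flushIfSupported(sink);  // delegates to CountingSink.flush()
        closeIfSupported(sink);  // delegates to CountingSink.close()
        System.out.println(sink.flushes + " flush, " + sink.closes + " close");
    }
}
```

The caller passes exactly one reference; whether flush and close do
anything depends only on what that one object implements.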
Explicitly passing "null" feels rather unintuitive and IMHO is doubtful.
Why should someone want to do that? Again, apparently you see that use
case, so if you really have strong feelings, then please make me
understand who needs that and for what actual purpose. :-)
To be sure we are all on the same page, again, please always keep in mind
the core idea that this API more or less solely is the combination of
"Writer.of(StringBuilder)" with "Writer.of(StringBuffer)" and
"Writer.of(CharBuffer)".
Note that the sole target still is to pass in a StringBuilder,
StringBuffer, or CharBuffer, as wrapping *them* is *the driver* for the
new API. While someone *can* pass in any Writer or any arbitrary
Appendable, doing so is *not the target* of this API. Therefore, we just
need to be able *to deal with that case* once it happens -- which is why
it is IMHO absolutely fine to directly return Writers *non-wrapped*. The
API so far just says, "passing a Writer in turn returns a Writer", but
it does *not* propose to enhance or limit that Writer in any way, and
that is why it is (IMHO absolutely) safe to check all other Appendables
for *their* actual ability to get flushed or to get closed. Remember,
the target of *this* API proposal is *not* to be able to write into any
Flushable-and-Closable-Appendable *without* flushing or closing it.
Having said that, *I do not veto* adding an *additional* method like
Writer.of(Appendable, boolean preventFlush, boolean preventClose) *later* **if
needed**, but IMHO that should rather be *separate* wrappers like
Writer.withoutFlushing(Writer) and Writer.withoutClosing(Writer) (either
you have the need to not-flush/not-close, or you don't have it, so it is
not a special case of *this* API), or something like that, which both
are, again, *non-targets* of my current proposal. In fact I still do not
see *any* benefit of passing in a Writer into Writer.of(), neither as a
single reference, nor split up into three interfaces (and BTW, I did
*not* say a Writer is a combination of Appendable, Flushable and
Closeable). Neither do I see *any benefit* of being able to pass in
three different implementation objects. But what I do see in your
proposal actually is:
* It would open a can of worms due to the possibility of providing
three different implementation objects for those three parameters.
Someone could do Writer.of(new StringBuilder(),
Files.newBufferedWriter(), new CharBuffer()) and the outcome would be
rather dubious (and mostly useless but confusing).
* As the sole target is to allow wrapping StringBuilder, StringBuffer,
and CharBuffer, and as we solely came to Flushable and Closeable due to
the question about "How to call flush and close ON THE PASSED REFERENCE,
IFF the Appendable implements them?" it would be a real pain for all
users to be FORCED to repeat the same object three times.
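To illustrate the separate-wrapper alternative mentioned above, here is a
rough sketch of what a non-closing view could look like. WriterWrappers
and withoutClosing are hypothetical names taken from the idea in this
mail; no such method exists in the JDK, and the choice to flush on close
is my assumption, not a settled design.

```java
import java.io.BufferedWriter;
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical illustration only; not JDK API.
final class WriterWrappers {
    private WriterWrappers() {}

    // Returns a view of 'out' whose close() flushes but never closes 'out'.
    static Writer withoutClosing(Writer out) {
        return new FilterWriter(out) {
            @Override
            public void close() throws IOException {
                flush(); // push pending data down, but leave 'out' open
            }
        };
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        BufferedWriter shared = new BufferedWriter(sink);

        // Code that insists on closing "its" Writer gets the view only.
        try (Writer view = withoutClosing(shared)) {
            view.write("first");
        }

        shared.write(" second"); // still usable, as 'shared' was never closed
        shared.flush();
        System.out.println(sink);
    }
}
```

A wrapper like this applies to any Writer, which is why it does not have
to be a special case of Writer.of(Appendable).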
Having said that, my proposal is (as this is what is IMHO most
intuitive and most wanted):
* Let's have solely Writer.of(Appendable) without any other parameters
*in the first PR*; discuss the use case of more parameters *in
subsequent PRs* IFF NEEDED as these should be *additional* method
signatures to not torture the 90% standard case users with parameters
they never need.
* Let's return Writer non-wrapped, and clearly document that in the
JavaDocs. Have separate discussions about Writer.withoutFlushing(Writer)
and Writer.withoutClosing(Writer) *in subsequent threads* IFF NEEDED.
* Let's use "if (appendable instanceof Flushable f) f.flush()" and "if
(appendable instanceof Closeable c) c.close()", and clearly document
that in the JavaDocs. In case users do really want non-flushed,
non-closed appendables wrapped as Writer, they do not lose anything,
but have to wait for the outcome of *subsequent* discussions about
*additional* wrappers.
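The three points above can be sketched roughly as follows. This is only
an illustration under assumptions, not the code of the planned PR:
AppendableWriters.of is a hypothetical stand-in for the proposed
Writer.of(Appendable), and the optimized paths for StringBuilder,
StringBuffer and CharBuffer are left out. It assumes Java 16 or later.

```java
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;
import java.io.Writer;
import java.nio.CharBuffer;
import java.util.Objects;

// Hypothetical stand-in for the proposed Writer.of(Appendable); not JDK API.
final class AppendableWriters {
    private AppendableWriters() {}

    static Writer of(Appendable target) {
        Objects.requireNonNull(target, "target");

        // A Writer is returned as-is, non-wrapped.
        if (target instanceof Writer w) return w;

        return new Writer() {
            private boolean closed;

            private void ensureOpen() throws IOException {
                if (closed) throw new IOException("Writer is closed");
            }

            @Override
            public void write(char[] cbuf, int off, int len) throws IOException {
                ensureOpen();
                // CharBuffer.wrap() is a CharSequence view of the array slice,
                // so the chars are handed over without a char-by-char loop here.
                target.append(CharBuffer.wrap(cbuf, off, len));
            }

            @Override
            public void flush() throws IOException {
                ensureOpen();
                if (target instanceof Flushable f) f.flush();
            }

            @Override
            public void close() throws IOException {
                if (closed) return;
                flush();
                closed = true;
                if (target instanceof Closeable c) c.close();
            }
        };
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Writer w = of(sb)) {
            w.write("<svg/>");
        }
        sb.insert(0, "<html>").append("</html>"); // no toString() copy in between
        System.out.println(sb);
    }
}
```

For a StringBuilder, flush and close only touch the wrapper's own state;
for an Appendable that is also Flushable or Closeable, they are passed on
to the very same object.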
I think that could be a clean, safe and straightforward way towards the
replacement of StringWriter.
Regards and a happy new year! :-)
-Markus
On 31.12.2024 at 06:42, Chen Liang wrote:
> Hi Markus,
> Thanks for your analysis that a Writer can be seen as a composition of
> an Appendable, a Flushable, and a Closeable.
> Given this view, I think we should add a Writer.of(Appendable,
> Flushable, Closeable) to specify the 3 component behaviors of the
> returned writer.
> Each of the 3 arguments can be null, so that component will be no-op
> (Writer's Appendable methods only need to trivially return the Writer
> itself; all other methods return void).
> We will always require all 3 arguments to be passed; a null component
> means the caller knowingly demands no-op behavior for that component.
> I believe this approach would be safer, and avoids the accidental
> delegation of unwanted features from a given input Appendable when it
> happens to duck type.
>
> Regards,
> Chen Liang
>
> On Sat, Dec 28, 2024 at 10:41 PM Markus KARG <markus at headcrashing.eu>
> wrote:
>
> Chen,
>
> thank you for your comments! My ideas to address them are:
>
> * flush(): If the Appendable implements Flushable, then perform
> Flushable.flush() on it. Otherwise, Writer.flush() will be a no-op
> (besides checking if Writer is open).
>
> * close(): If the Appendable implements Closeable, then perform
> Closeable.close() on it. Otherwise, Writer.close() will be a no-op
> (besides calling this.flush() if open, and internally marking
> itself as closed).
>
> * Writer.of(Writer): The original sense of the new API is to
> create a Writer wrapping non-Writers like StringBuilder,
> CharBuffer etc., but not to reduce a Writer to an Appendable (that
> would rather be Appendable.narrow(Writer) or so). IMHO there is
> neither any need nor benefit to return a limited Writer instead of
> the actual writer. So actually I would plead for directly returning
> the given writer itself, so Writer.of(Writer) is a no-op. I do not
> see why someone would intentionally pass in a Writer in the hope
> to get back a more limited, non-flushing / non-closing variant of
> it, and I have a bad feeling about returning a Writer which is
> deliberately cutting away the ability to flush and close without
> any technical need. Maybe you could elaborate on your idea if you
> have strong feelings about that use case?
>
> * StringWriter: Writer.of() is -by intention- not a "fire and
> forget" drop-in replacement, but a "real" Writer. It comes with a
> price, but I do not see a big problem here. If one is so happy
> with StringWriter that dealing with IOException would be a no-go,
> then simply keep the app as-is. But if one really wants the
> benefits provided by Writer.of(), then dealing with IOException
> should be worth it. This is a (IMHO very) low price the programmer
> has to pay for the benefit of gaining non-sync, non-copy behavior.
> In most code using StringWriter I have seen so far, IOException
> was dealt with anyway, as the code was mostly IO-bound already
> (it expects "some" Writer, not a StringWriter, as it wants to
> perform I/O, but the target is "by coincidence" a String).
>
> To sum up: IMHO still it sounds feasible and the benefits outweigh
> the costs. :-)
>
> -Markus
>
>
> On 28.12.2024 at 01:51, Chen Liang wrote:
>>
>> Hi Markus,
>> I think the idea makes sense, but it comes with more difficulties
>> than in the case of Reader.of. An Appendable is a higher
>> abstraction modeling only the character writing aspects, without
>> concerns with resource control (such as flush or close).
>>
>> One detail of note is that Writer itself implements Appendable,
>> but I don't think the new method should return a Writer as-is; I
>> think it should return another writer whose close will not close
>> the underlying writer as we are only modelling the appendable
>> behavior without exporting the resource control methods. Not sure
>> about flush.
>>
>> One use case you have mentioned is StringWriter. StringWriter is
>> distinct from StringReader: its write and append methods do not
>> throw IOE while the base Writer does. So Writer.of cannot
>> adequately replace StringWriter without use-site ugliness, until
>> we have generic types that represent the bottom type.
>>
>> Regards,
>>
>> Chen Liang
>>
>>
>> On Fri, Dec 20, 2024, 11:12 PM Markus KARG
>> <markus at headcrashing.eu> wrote:
>>
>> Dear Sirs,
>>
>> JDK 24 comes with Reader.of(CharSequence), now let's provide the
>> symmetrical counterpart Writer.of(Appendable) in JDK 25! :-)
>>
>> For performance reasons, I would like to propose the new public factory
>> method Writer.of(Appendable). This will provide the same benefits for
>> writing that Reader.of(CharSequence) provides for reading since JDK 24
>> (see JDK-8341566). Before sharing a pull request, I'd kindly like to
>> request comments.
>>
>> Since Java 1.1 we have the StringWriter class. Since Java 1.5 we have
>> the Appendable interface. StringBuilder, StringBuffer and CharBuffer are
>> first-class implementations of it in the JDK, and there might exist
>> third-party implementations of non-String text sinks. Until today,
>> however, we do not have a Writer for Appendables, but need to take costly
>> detours.
>>
>> Text sinks in Java are expected to extend the Writer class.
>> Libraries and frameworks expect application code to provide Writers to
>> consume text produced by the library or framework, for example.
>> Application code often wants to modify the received text, e. g. embed
>> received SVG text into a larger HTML text document, or simply forward
>> the text as-is to I/O, so StringBuilder or CharBuffer is what the
>> application code actually uses, but not Strings! In such cases, taking
>> the StringWriter.toString() detour is common but inefficient: It implies
>> duplicating the COMPLETE text for the sole sake of creating a temporary
>> String, while the subsequent processing will copy the data anyway or
>> just use a small piece of it. This eats up time and memory uselessly,
>> and increases GC pressure. Also, StringWriter is synchronized (not
>> explicitly, but de facto, as it uses StringBuffer), which implies
>> another needless slowdown. In many cases, the synchronization has no use
>> at all, as in real-world applications very few Writers are actually
>> accessed concurrently. As a result, today the major benefit of
>> StringBuilder over StringBuffer (being non-synchronized) vanishes as
>> soon as a StringWriter is used to provide its content. This means,
>> "stringBuilder.append(stringWriter.toString())" imposes slower
>> performance than essentially needed, in two ways: toString(),
>> synchronized.
>>
>> In an attempt to improve performance of this rather typical use case, I
>> would like to contribute a pull request providing the new public factory
>> method java.io.Writer.of(Appendable). This is symmetrical to the
>> solution we implemented in JDK-8341566 for the reversed case:
>> java.io.Reader.of(CharSequence).
>>
>> My idea is to mostly copy the existing code of StringWriter, but wrap a
>> caller-provided Appendable instead of an internally created
>> StringBuffer; this strips synchronization; then add optimized use for
>> the StringBuffer, StringBuilder and CharBuffer implementations (in the
>> sense of write(char[],start,end) to prevent a char-by-char loop in these
>> cases).
>>
>> Alternatives:
>>
>> - Applications could use Apache Commons IO's StringBuilderWriter, which
>> is limited to StringBuilder, so is not usable for the CharBuffer or
>> custom Appendable case. As it is an open-source third-party dependency,
>> some authors might not be allowed to use it, or may not want to carry
>> this additional burden just for the sake of this single performance
>> improvement. In addition, this library is not actively modernized; its
>> Java baseline still is Java 8. There is no commercial support.
>>
>> - Applications could write their own Writer implementation. Given the
>> assumption that this is a rather common use case, this imposes
>> unjustified additional work for the authors of thousands of
>> applications. It is hard to justify why there is a StringWriter but not
>> a Writer for other Appendables.
>>
>> - Instead of writing a new Writer factory method, we could slightly
>> modify StringWriter, so it uses StringBuilder (instead of StringBuffer).
>> This (still) results in unnecessary duplication of the full text at
>> toString() and (now also) at getBuffer(), and it will break existing
>> applications due to the missing synchronization.
>>
>> - Instead of writing a new Writer factory method, we could write a new
>> AppendableWriter class. This increases the number of public classes,
>> which was the main reason in JDK-8341566 to go with the
>> "Reader.of(CharSequence)" factory method instead of the
>> "CharSequenceReader" class. Also it would be confusing to have
>> Reader.of(...) but not Writer.of(...) in the API.
>>
>> - We could go with a specific Appendable class (like StringBuilder)
>> instead of supporting all Appendable implementations. This would reduce
>> the number of applicable use cases dramatically (in particular as
>> CharBuffer would not be supported any more) without providing any
>> considerable benefit (other than making the OpenJDK-internal source code
>> a bit shorter). In particular it makes it impossible to opt in to the
>> option below:
>>
>> Option:
>>
>> - Once we have Writer.of(Appendable), we could replace the full
>> implementation of StringWriter by synchronized calls to the new Writer.
>> This would reduce duplicate code.
>>
>> Kindly requesting comments.
>>
>> -Markus Karg
>>