OutputStreamWriter (not) flushing stateful Charsetencoder
Jason Mehrens
jason_mehrens at hotmail.com
Wed Nov 10 19:16:03 UTC 2021
Here are the old details I could dig up from when we ran into this in JavaMail:
https://bugs.openjdk.java.net/browse/JDK-6995537
https://github.com/javaee/javamail/commit/145d18c1738d3cf33b52bc005835980ff78ce4af
I recall digging through the code years ago, but I don't recall whether adding
w.write("");
will trigger the encoder flush. Not that it is a great workaround either.
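Another way to sidestep the question, as a rough sketch only: if each message can be encoded as an independent unit, encode it with String.getBytes(Charset) and write the bytes straight to the underlying stream. As far as I know, getBytes() encodes with end-of-input and flushes the encoder, so each chunk should end back in the initial shift state. The class name CompleteChunkWriter and the method writeChunk are made up for illustration; this is not necessarily what the JavaMail change does.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

// Hypothetical helper: writes each string as a self-contained encoded chunk.
class CompleteChunkWriter {
    private final OutputStream out;
    private final Charset charset;

    CompleteChunkWriter(OutputStream out, String charsetName) {
        this.out = out;
        this.charset = Charset.forName(charsetName);
    }

    // Encodes the whole string in one go (assumption: getBytes() includes any
    // trailing bytes the encoder emits on flush, such as a shift-in byte),
    // then flushes the underlying stream.
    void writeChunk(String s) throws IOException {
        out.write(s.getBytes(charset));
        out.flush();
    }
}
```

The trade-off is that every chunk carries its own shift-out/shift-in pair, so the output can be larger than what a single long-lived encoder would produce.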
Jason
________________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.java.net> on behalf of Bernd Eckenfels <ecki at zusammenkunft.net>
Sent: Wednesday, November 10, 2021 8:12 AM
To: core-libs-dev
Subject: OutputStreamWriter (not) flushing stateful Charsetencoder
(I thought this was discussed a while back on an OpenJDK mailing list, but I can't find it. So apologies if this is a duplicate, but I might have seen it on Apache Commons IO instead, which fixed a similar issue on the reader side.)
The problem: I have code using an OutputStreamWriter with a customer-defined charset name. This writer is flushed, and the code expects all pending bytes to be written. However, when a stateful charset like cp930 is used, this is not the case. The final unshift (shift-in) byte, for example, is only written when the writer is closed. This is probably because flush() does not signal end of input to the encoder and flush it.
The class documentation does not clearly state what the correct behavior is; however, flush() is worded in a way that one could expect it to produce a complete stream.
So, is this a bug in the implementation? If not, should it be added to the documentation?
Here is a small JShell reproducer; you see the extra unshift byte (decimal 15) only after the close:
var b = new byte[] { 0x31, (byte)0xef, (byte)0xbc, (byte)0x91 };
var s = new String(b, "UTF-8"); // "1１": ASCII '1' followed by fullwidth '１' (U+FF11)
var bos = new ByteArrayOutputStream();
var w = new OutputStreamWriter(bos, "cp930"); // stateful EBCDIC with shift-out/shift-in bytes
w.write(s);
w.flush();
bos.toByteArray()
$8 ==> byte[4] { -15, 14, 66, -15 }
w.close();
bos.toByteArray()
$10 ==> byte[5] { -15, 14, 66, -15, 15 }
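For comparison, here is a minimal sketch of what "end of data" plus flush looks like when driving a CharsetEncoder by hand, following the encode(..., true) then flush() sequence described in the CharsetEncoder documentation. The class name EncoderFlushSketch and the helpers encodeComplete and drain are made up for illustration; this is not the code OutputStreamWriter uses internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.Arrays;

class EncoderFlushSketch {
    // Encodes s completely, including any bytes the encoder emits on flush().
    static byte[] encodeComplete(String s, String charsetName) throws Exception {
        CharsetEncoder enc = Charset.forName(charsetName).newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer out = ByteBuffer.allocate(16);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();

        // Step 1: encode with endOfInput = true so the encoder knows no more chars follow.
        CoderResult cr;
        do {
            cr = enc.encode(in, out, true);
            if (cr.isError()) {
                cr.throwException();
            }
            drain(out, bos);
        } while (cr.isOverflow());

        // Step 2: flush() lets a stateful encoder write its closing bytes
        // (for example a shift-in byte that returns to the initial state).
        do {
            cr = enc.flush(out);
            drain(out, bos);
        } while (cr.isOverflow());

        return bos.toByteArray();
    }

    // Copies whatever the encoder produced into bos and resets the buffer.
    private static void drain(ByteBuffer out, ByteArrayOutputStream bos) {
        out.flip();
        bos.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(encodeComplete("1\uFF11", "Cp930")));
    }
}
```

If the writer did the equivalent of step 2 in flush(), it would also have to decide how to continue after further write() calls, which may be why it only happens on close().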
--
http://bernd.eckenfels.net