JDK-8352891 Performance improvements to ByteArrayOutputStream

Engebretson, John jengebr at amazon.com
Wed Apr 2 18:04:01 UTC 2025


  Apologies, human error – here’s the message I intended:

  Thank you!  I’ve updated the PR accordingly and summarized the benchmarks in the description.  Here’s the short version:

  *   For small payloads, unsynchronized and optimized versions are 2-4x faster than base
  *   For large payloads, optimized version is 3x faster than base or unsynchronized

  I discovered a capacity-related incompatibility between ByteArrayOutputStream and MemoryOutputStream: the size() method returns int, but MemoryOutputStream can hold more bytes than an int can represent.  I added range checking to size() and a new sizeAsLong() method… but it really makes me wonder whether MemoryOutputStream belongs as a subclass of ByteArrayOutputStream.  It now has two significant incompatibilities: it ignores the protected fields, and it is not bound by the same size restrictions.
     John
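
  To make the size() incompatibility concrete, here is a minimal sketch of a range-checked size() next to a sizeAsLong() accessor. It is not the code from the PR: the class name LongCountingStream, the constructor that takes a simulated byte count, and the choice of IllegalStateException are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not the MemoryOutputStream from the PR.
class LongCountingStream extends ByteArrayOutputStream {
    // Assumption: the real byte count lives in a long, separate from the
    // inherited protected int field 'count', which is left unused here.
    private final long longCount;

    // The count is passed in directly so the overflow case can be shown
    // without actually writing more than 2 GiB of data.
    LongCountingStream(long simulatedCount) {
        this.longCount = simulatedCount;
    }

    // Always safe: reports the full count.
    public long sizeAsLong() {
        return longCount;
    }

    // Range-checked: fails loudly instead of returning a truncated value.
    // The exception type is an assumption, not taken from the PR.
    @Override
    public int size() {
        if (longCount > Integer.MAX_VALUE) {
            throw new IllegalStateException(
                "size " + longCount + " does not fit in an int; use sizeAsLong()");
        }
        return (int) longCount;
    }
}
```

  Callers written against ByteArrayOutputStream only know size(), so in this sketch they would see an exception once the stream grows past Integer.MAX_VALUE, which is the kind of behavioral gap that makes the subclass relationship questionable.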


From: Engebretson, John
Sent: Wednesday, April 2, 2025 12:58 PM
To: 'Alan Bateman' <alan.bateman at oracle.com>; Markus KARG <markus at headcrashing.eu>; core-libs-dev at openjdk.org
Subject: RE: [EXTERNAL] JDK-8352891 Performance improvements to ByteArrayOutputStream

  Thank you!  I’ve updated the PR accordingly and summarized the benchmarks in the description.  Here’s the short version:


From: Alan Bateman <alan.bateman at oracle.com>
Sent: Wednesday, April 2, 2025 5:52 AM
To: Engebretson, John <jengebr at amazon.com>; Markus KARG <markus at headcrashing.eu>; core-libs-dev at openjdk.org
Subject: RE: [EXTERNAL] JDK-8352891 Performance improvements to ByteArrayOutputStream



On 31/03/2025 16:51, Engebretson, John wrote:
  Alan – is this what you have in mind:

ByteArrayOutputStream.getInstance() // returns existing class
ByteArrayOutputStream.getUnsynchronizedInstance() // returns subclass of BAOS that overrides the synchronization
ByteArrayOutputStream.get<Scalable|Memory|Fast|Segmented>Instance() // returns the new class


BAOS has been synchronized since JDK 1.0. While undocumented, it's possible that existing code depends on this 30-year-old behavior, so I think we are stuck with it.

The removal of biased locking has spurred a few complaints that the class is needlessly synchronized. A static factory to return an unsynchronized BAOS would help, but only if it isn't used with code that assumes all operations are synchronized. So I think we will have to look at the API docs for this.
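
For readers following the thread, the unsynchronized variant discussed above could look roughly like the sketch below. It is an assumption-laden illustration, not the code in the PR: the class name UnsynchronizedByteArrayOutputStream, the set of overridden methods, and the doubling growth policy are all hypothetical. It relies only on the protected buf and count fields that ByteArrayOutputStream documents.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical sketch: overrides the synchronized methods of
// ByteArrayOutputStream with unsynchronized equivalents.
// Methods not overridden here (writeTo, toString) stay synchronized.
class UnsynchronizedByteArrayOutputStream extends ByteArrayOutputStream {

    @Override
    public void write(int b) {
        ensureCapacity(count + 1);
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        Objects.checkFromIndexSize(off, len, b.length);
        ensureCapacity(count + len);
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

    @Override
    public byte[] toByteArray() {
        return Arrays.copyOf(buf, count);
    }

    @Override
    public int size() {
        return count;
    }

    @Override
    public void reset() {
        count = 0;
    }

    // Assumed growth policy: double the buffer, or grow to the requested
    // minimum if doubling is not enough (or overflows).
    private void ensureCapacity(int minCapacity) {
        if (minCapacity < 0) {
            throw new OutOfMemoryError("required capacity exceeds int range");
        }
        if (minCapacity > buf.length) {
            int newCapacity = Math.max(buf.length * 2, minCapacity);
            buf = Arrays.copyOf(buf, newCapacity);
        }
    }
}
```

An instance of this class is still a ByteArrayOutputStream as far as the type system is concerned, so it can be handed to code that assumes every operation is synchronized. That is the compatibility risk a factory method's API documentation would have to spell out.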

It's not clear that we need to have several implementations with different performance tradeoffs. So I think part of the exploration will be to see which usages perform better or worse, and whether having a parameter to specify the initial size or some hint of the max size would help the discussion.

-Alan