<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.gmailsignatureprefix
{mso-style-name:gmail_signature_prefix;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"> Thank you Archie and Chen!<o:p></o:p></p>
<p class="MsoNormal"> Chen - I’m prototyping the generic allocator you describe, and it’s extremely effective for Objects – but Java generics can’t be parameterized over the primitive byte, which leaves me hamstrung. I’m not aware of a way to work around that, and changing the array from
byte[] to Byte[] would be a terrible idea because every element would be boxed, so I think we’re looking at two different allocators. The template approach suggested by Archie may help implement them, but ultimately it’ll be multiple classes.<o:p></o:p></p>
<p class="MsoNormal"> Archie – your suggestion generally matches the implementation in the PR, except that the implementation is flexible about the segment size: each instance “self-tunes” based on its inputs. There are a few hard-coded scaling constants that
we could consider tweaking, but my perf tests so far show they’re reasonable in the general case. Self-managing eliminates guesswork about N and, most importantly, eliminates duplicative copying/allocation after bytes have been recorded. The benchmark
tests a handful of hard-coded sizes and can easily be expanded to cover more, at the expense of longer runtimes.
<o:p></o:p></p>
<p class="MsoNormal"> I’ll update the PR later today with these new suggestions alongside the current implementation, so we can clearly evaluate the pros and cons.<o:p></o:p></p>
<p class="MsoNormal"> Thanks!<o:p></o:p></p>
<p class="MsoNormal"> John<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in">At the risk of repeating my <a href="https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141871.html">
previous comment</a>, I agree with Chen.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">That is to say, there is a separate, more fundamental unsolved problem lurking underneath this discussion, and the two problem "layers" are perhaps better addressed separately.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Once the lower layer problem is properly framed and resolved, it becomes reusable, and wrapping it to solve various higher-layer problems is easy.<o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">An internal class would be a reasonable and conservative way to start. There could even be a suite of such classes, built from templates a la X-Buffer.java.template.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">These could be used all over the place (e.g., to refactor StringBuilder). For example, I wonder how much the performance of ArrayList could be improved in scenarios where you are building (or removing elements
from) large lists?<o:p></o:p></p>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:.5in">Just thinking out loud (apologies)... Define a "segmented array allocator" as an in-memory byte[] array builder that "chunks" the data into individual segments of size at most N.<o:p></o:p></p>
</div>
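<p class="MsoNormal" style="margin-left:.5in">A minimal sketch of the "segmented array allocator" defined above, for illustration only. The class name SegmentedByteBuilder and its methods are hypothetical and are not taken from the PR or the JDK. To keep it short, the sketch allocates every segment at exactly N bytes; it is not tuned for small outputs.</p>

```java
import java.util.Arrays;

/** Hypothetical sketch, not the PR's code: a byte[] builder made of segments of N bytes. */
final class SegmentedByteBuilder {
    private final int maxSegmentSize;       // N in the discussion above
    private byte[][] full = new byte[4][];  // table of filled segments (references only)
    private int fullCount;                  // number of filled segments
    private byte[] current;                 // segment currently being filled
    private int used;                       // bytes written into current

    SegmentedByteBuilder(int maxSegmentSize) {
        if (1 > maxSegmentSize) {
            throw new IllegalArgumentException("segment size must be positive");
        }
        this.maxSegmentSize = maxSegmentSize;
        this.current = new byte[maxSegmentSize];
    }

    /** Appends len bytes from src, spilling into new segments as needed. */
    void write(byte[] src, int off, int len) {
        while (len > 0) {
            if (used == current.length) {
                sealCurrent();
            }
            int n = Math.min(len, current.length - used);
            System.arraycopy(src, off, current, used, n);
            used += n;
            off += n;
            len -= n;
        }
    }

    /** Stores the filled segment and starts a fresh one; recorded bytes are not copied. */
    private void sealCurrent() {
        if (fullCount == full.length) {
            full = Arrays.copyOf(full, fullCount * 2); // copies references, not data
        }
        full[fullCount++] = current;
        current = new byte[maxSegmentSize];
        used = 0;
    }

    long size() {
        return (long) fullCount * maxSegmentSize + used;
    }

    /** Assembles the result with one copy per segment. */
    byte[] toByteArray() {
        byte[] out = new byte[Math.toIntExact(size())];
        int pos = 0;
        for (int i = 0; i != fullCount; i++) {
            System.arraycopy(full[i], 0, out, pos, maxSegmentSize);
            pos += maxSegmentSize;
        }
        System.arraycopy(current, 0, out, pos, used);
        return out;
    }
}
```

<p class="MsoNormal" style="margin-left:.5in">In this sketch, bytes are copied once into a segment and once more in toByteArray(); growing the builder only ever copies segment references.</p>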
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">We can think of the current ByteArrayOutputStream as such a thing with N = 2³²; that is, there's only ever one "chunk".<o:p></o:p></p>
</div>
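<p class="MsoNormal" style="margin-left:.5in">To make the cost of the single-chunk case concrete, here is a rough model rather than a measurement. It assumes one buffer that doubles whenever it is full and bytes that arrive one at a time; the class and method names are hypothetical.</p>

```java
/** Hypothetical model: counts bytes copied by a single buffer that doubles when full. */
final class DoublingCopyCost {
    /** Bytes copied by growth steps while appending total bytes one at a time. */
    static long bytesCopiedWhileGrowing(long total, long initialCapacity) {
        if (1 > initialCapacity) {
            throw new IllegalArgumentException("initial capacity must be positive");
        }
        long capacity = initialCapacity;
        long copied = 0;
        while (total > capacity) {
            copied += capacity; // the full buffer is copied into its replacement
            capacity *= 2;
        }
        return copied;
    }
}
```

<p class="MsoNormal" style="margin-left:.5in">Under these assumptions, writing 1,048,576 bytes starting from a 32-byte buffer copies 1,048,544 bytes during growth, and a final copy of the result into a right-sized array would move another 1,048,576.</p>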
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">The assertion is that N = 2³² is not the most efficient value. And obviously neither is N = 1.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">So somewhere in the middle there is an optimal value for N, which presumably could be discovered via experimentation. It may be different for different architectures.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Another parameter would be: What is the size M ≤ N of a new chunk? E.g. you could start with M = 16 and then the chunk grows exponentially until it reaches N, at which point you start a new chunk. The optimal value
for M could also be performance tested (it may already have been).<o:p></o:p></p>
</div>
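<p class="MsoNormal" style="margin-left:.5in">The M and N policy described above can be sketched as a capacity schedule. This is one possible reading of it, with hypothetical names: the active chunk starts at M bytes, doubles (capped at N) each time it fills, and once a chunk of N bytes is full a new chunk starts again at M.</p>

```java
import java.util.Arrays;

/** Hypothetical sketch of the chunk sizing policy: start at M, double up to N, then restart. */
final class ChunkGrowthSchedule {
    /** Capacities the active chunk passes through while total bytes are appended. */
    static int[] capacities(int total, int m, int n) {
        if (0 > total || 1 > m || m > n) {
            throw new IllegalArgumentException("need total >= 0 and 1 <= M <= N");
        }
        int[] out = new int[8];
        int count = 0;
        int capacity = m;
        int remaining = total; // bytes not yet stored in a completed chunk
        while (true) {
            if (count == out.length) {
                out = Arrays.copyOf(out, count * 2);
            }
            out[count++] = capacity;
            if (capacity >= remaining) {
                break;                    // the active chunk holds everything that is left
            }
            if (capacity == n) {
                remaining -= n;           // chunk is complete; start a new one at M
                capacity = m;
            } else {
                capacity = (int) Math.min(2L * capacity, n); // grow, capped at N
            }
        }
        return Arrays.copyOf(out, count);
    }
}
```

<p class="MsoNormal" style="margin-left:.5in">For example, capacities(100, 16, 64) yields 16, 32, 64, 16, 32, 64: the first chunk grows to N and is completed, and the second chunk repeats the same growth for the remaining 36 bytes.</p>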
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Of course, for performance optimization we'd need some distribution of array sizes that models "typical" use, etc.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">-Archie<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in">On Wed, Apr 9, 2025 at 6:19 PM Chen Liang &lt;<a href="mailto:liangchenblue@gmail.com">liangchenblue@gmail.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<p class="MsoNormal" style="margin-left:.5in">Hi John Engebretson, <o:p></o:p></p>
<div>
<p class="MsoNormal" style="margin-left:.5in">I still wonder if we can make the byte array allocator a utility in the JDK, at least an internal one. Besides replacing BAOS uses, it could also optimize users such as InputStream.readNBytes and BufWriterImpl
in the classfile API, and maybe many more. Such an internal addition may be accepted into the JDK immediately because it has no compatibility impact and does not need to undergo CSR review.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Chen Liang<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</blockquote>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
</body>
</html>