Proposal: Optimizing Efficiency using Read-only Arrays

Nathan Reynolds numeralnathan at gmail.com
Thu Dec 29 17:04:11 UTC 2022


In one project, we changed the code to use zero-copy I/O.  This dramatically
improved the performance of I/O-intensive applications.  So, getting to
zero-copy I/O is a tremendous win.

How would the Java language prevent a rogue custom method from altering the
array?  Someone could write a method that receives a read-only array and
then change the bytecode in the .class to say it receives a writable
array.  The enforcement would have to be at the JVM level.  Furthermore,
reflection would have to deal with enforcing the read-only attribute.
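
To make the reflection concern concrete, here is a minimal sketch that uses
only the existing java.lang.reflect.Array API.  The class ReflectiveWrite and
the method overwriteFirst are hypothetical names chosen for illustration.  The
point is only that the write happens without any array-store syntax in the
source, so a read-only marking checked solely by the compiler would not cover
this path.

```java
import java.lang.reflect.Array;

public class ReflectiveWrite {
    // Hypothetical "rogue" method: it receives the array as a plain Object,
    // so no compile-time check on a read-only array type could apply here.
    static void overwriteFirst(Object array, byte value) {
        Array.setByte(array, 0, value);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        overwriteFirst(data, (byte) 42);
        System.out.println(data[0]); // prints 42: the caller's array was modified
    }
}
```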



On Thu, Dec 29, 2022 at 7:46 AM Markus Karg <markus at headcrashing.eu> wrote:

> Proposal: Optimizing Efficiency using Read-only Arrays
>
>
>
> TL;DR: Read-only Arrays will improve speed, reduce memory and power
> consumption, provide security by default, and make programming and reviews
> easier and quicker.
>
>
>
> Looking at the profile of any average real-world application, it is
> apparent that a lot of memory activity stems from allocating byte arrays.
>
> Byte arrays are a core building block of several APIs in OpenJDK.
>
> Just to name two of them: First and foremost Strings, as they are
> ubiquitous, but also I/O, as byte arrays are the buckets which carry all
> data through any InputStream/OutputStream.
>
> While I was authoring several java.io optimizations in the past months,
> the latter became the driver for me to write down this proposal.
>
> Nevertheless, the proposal is focusing on a general solution, applicable
> to all Java APIs, beyond I/O.
>
>
>
> To perform any I/O in Java, all data MUST pass through one or more byte
> arrays, each and every day.
>
> It is easy to imagine that this adds up to multiple gigabytes per day
> for an average server product.
>
> Once this array reference is passed to a custom method, it leaves the safe
> harbor of the JDK and enters the possibly evil outside world - it becomes
> compromised.
>
> The called custom method ("Mr Evil") could either read private data
> sitting in the array beyond the passed lower and upper read limits, or
> could write poisoned data into the passed array, which is picked up
> afterwards by the JDK code (and hence treated as "safe" data).
>
> To mitigate these risks, typically byte arrays are duplicated (at least
> within limits) before being forwarded to the outer world, so the "evil"
> receiver will only see a temporary / trimmed copy of the array.
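
The defensive copy described above typically looks something like the
following minimal sketch.  The class DefensiveCopy, the method passTrimmedCopy,
and the Consumer<byte[]> callback standing in for the custom method are
hypothetical; Arrays.copyOfRange is the existing JDK call.

```java
import java.util.Arrays;
import java.util.function.Consumer;

public class DefensiveCopy {
    // Hands only the range [from, to) to an untrusted receiver, as a fresh copy.
    static void passTrimmedCopy(byte[] internal, int from, int to,
                                Consumer<byte[]> receiver) {
        // This is the extra allocation and copy the proposal wants to avoid.
        byte[] copy = Arrays.copyOfRange(internal, from, to);
        receiver.accept(copy);
    }

    public static void main(String[] args) {
        byte[] internal = {10, 20, 30, 40};
        // The receiver tries to poison the data; the write only reaches the copy.
        passTrimmedCopy(internal, 1, 3, received -> Arrays.fill(received, (byte) 0));
        System.out.println(Arrays.toString(internal)); // prints [10, 20, 30, 40]
    }
}
```

The receiver sees two bytes only, and its writes never reach the internal
array, at the price of one allocation and one copy per call.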
>
> Due to that single safety measure alone, each day tens of thousands of
> Java servers are squandering precious memory and power, producing
> considerable amounts of carbon dioxide in turn.
>
> While copying buffers is effective, it is also inefficient.
>
> "Inefficiency" is definitely not a term we want Java to be associated
> with in the age of climate change.
>
> N.B.: As soon as we omit explicit creation of an array copy, either due to
> a human programming fault, or due to an unexpected technical failure,
> security is ineffective! Hence relying on explicit copies is also a
> suboptimal ("flaky") safety measure. Due to that risk, reviews of I/O code
> often become complex, lengthy and exhausting, making them rather expensive.
>
> This is just one single example. You could easily find lots more in the
> JDK.
>
>
>
> If the Java language had a means to mark arrays as "read-only" to the
> Compiler / JVM (just like it already has for final variables), then there
> would be no more need for an explicit copy.
>
> Several benefits would arise from the fact that no copy of the array is
> created (and removed) in turn:
>
> * Speed is improved. While System.arraycopy() is quick, not calling it at
> all is quicker.
>
> * GC pressure is reduced. While it might be low already, not creating a
> copy of an array makes it zero.
>
> * Security by default. As the JVM cannot write to "read-only" arrays, there
> is no harm when an explicit copy is omitted.
>
> * Reduced memory consumption. No copy at all means literally zero
> additional memory.
>
> * Reduced power consumption. No power is invested in squandered CPU cycles.
>
> * Easier programming. No need to remember explicit creation of copies.
>
> * Simpler code. No copies means no code to create them, making the
> remainder simpler to understand.
>
> * Quicker reviews. Reviewers do not have to check for compromised
> buffers, a check which is easily forgotten.
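
For buffers rather than arrays, the JDK already has a loose analogue:
ByteBuffer.asReadOnlyBuffer() returns a view that shares the same contents but
rejects writes at run time.  The sketch below is not the proposed language
feature, only an illustration of the "no copy, no write" behaviour listed
above; the class ReadOnlyView and the method readOnlyView are hypothetical
names.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

public class ReadOnlyView {
    // Returns a read-only view over the given range; no bytes are copied.
    // slice() pins the view to [offset, offset + length), so the receiver
    // cannot move position or limit outside that range.
    static ByteBuffer readOnlyView(byte[] internal, int offset, int length) {
        return ByteBuffer.wrap(internal, offset, length).slice().asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        byte[] internal = {10, 20, 30, 40};
        ByteBuffer view = readOnlyView(internal, 1, 2);
        System.out.println(view.get(0)); // prints 20: reads through to internal[1]
        try {
            view.put(0, (byte) 0); // any write through the view is refused
        } catch (ReadOnlyBufferException e) {
            System.out.println("write rejected");
        }
    }
}
```

A method that accepts only byte[] cannot take such a view, which is part of
why the proposal asks for support at the array level.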
>
>
>
> While each single effect might be small, keep in mind that all these
> effects occur together, and are massively applied each and every day, as
> arrays are building blocks of the JDK.
>
>
>
> To sum up, I'd like to propose adding a means to the Java language that
> turns arrays into "read-only" arrays.
>


More information about the amber-dev mailing list