Proposal: Optimizing Efficiency using Read-only Arrays

Markus Karg markus at headcrashing.eu
Thu Dec 29 15:46:41 UTC 2022



 

TL;DR: Read-only Arrays will improve speed, reduce memory and power
consumption, provide security by default, and make programming and reviews
easier and quicker.

 

Looking at the profile of any average real-world application, it is apparent
that a lot of memory activity stems from allocating byte arrays.

Byte arrays are a core building block of several APIs in OpenJDK.

Just to name two of them: First and foremost Strings, as they are
ubiquitous, but also I/O, as byte arrays are the buckets which carry all
data through any InputStream/OutputStream.

While I was authoring several java.io optimizations in the past months, the
latter became the driver for me to write down this proposal.

Nevertheless, the proposal is focusing on a general solution, applicable to
all Java APIs, beyond I/O.

 

To perform any I/O in Java, all data MUST pass through one or more byte
arrays, each and every day.

It is easy to imagine that this adds up to multiple gigabytes per day for an
average server product.

Once a reference to such an array is passed to a custom method, it leaves the
safe harbor of the JDK and enters a possibly evil outside world - it becomes
compromised.

The called custom method ("Mr Evil") could either read private data sitting
in the array beyond the passed lower and upper read limits, or write
poisoned data into the passed array, which JDK code picks up afterwards and
treats as "safe" data.
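
To make the two risks concrete, here is a minimal sketch. All names in it
(EvilStreamDemo, EvilOutputStream, transfer) are hypothetical and only stand
in for the pattern described above; this is not actual JDK code.

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in the JDK.
final class EvilStreamDemo {

    /** A custom stream that ignores the offset/length contract it was given. */
    static final class EvilOutputStream extends OutputStream {
        String leaked = "";

        @Override
        public void write(int b) {
            // Single-byte writes are not relevant for this demo.
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            // Risk 1: reads the whole array, not just buf[off .. off+len).
            leaked = new String(buf, StandardCharsets.US_ASCII);
            // Risk 2: overwrites data the caller may still rely on.
            buf[off] = (byte) 'X';
        }
    }

    /** Stands in for library code handing its internal buffer to a custom stream. */
    static byte[] transfer(EvilOutputStream out) {
        byte[] internal = "public|secret".getBytes(StandardCharsets.US_ASCII);
        out.write(internal, 0, 6); // only "public" is meant to be visible
        return internal;
    }

    public static void main(String[] args) {
        EvilOutputStream evil = new EvilOutputStream();
        byte[] after = transfer(evil);
        System.out.println("leaked:   " + evil.leaked);
        System.out.println("poisoned: " + new String(after, StandardCharsets.US_ASCII));
    }
}
```

The receiver sees the bytes after the six it was offered, and the caller's
buffer no longer starts with the byte it originally held.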

To mitigate these risks, byte arrays are typically duplicated (at least
within the passed limits) before being forwarded to the outside world, so the
"evil" receiver only sees a temporary / trimmed copy of the array.
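
The mitigation can be sketched as follows. The Sink interface and the
forward method are hypothetical names chosen for illustration;
Arrays.copyOfRange is the only JDK call involved.

```java
import java.util.Arrays;

// Hypothetical illustration only; Sink and forward are not JDK APIs.
final class DefensiveCopyDemo {

    /** Hypothetical receiver standing in for untrusted custom code. */
    interface Sink {
        void accept(byte[] buf, int off, int len);
    }

    /** Forwards only a trimmed, temporary copy of internal[off .. off+len). */
    static void forward(byte[] internal, int off, int len, Sink sink) {
        byte[] copy = Arrays.copyOfRange(internal, off, off + len);
        sink.accept(copy, 0, copy.length);
    }

    public static void main(String[] args) {
        byte[] internal = {1, 2, 3, 4, 5};
        // The receiver tries to poison the buffer, but only hits the copy.
        forward(internal, 1, 3, (buf, off, len) -> buf[0] = 99);
        System.out.println(Arrays.toString(internal));
    }
}
```

The original array stays untouched, but every call pays for one allocation
and one copy of len bytes.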

Due to that single safety measure alone, tens of thousands of Java servers
squander precious memory and power each day, producing considerable amounts
of carbon dioxide in turn.

While copying buffers is effective, it is also inefficient.

"Inefficiency" is definitely not something we want Java to be known for in
the age of climate change.

N.B.: As soon as the explicit creation of an array copy is omitted, whether
through a programming mistake or an unexpected technical failure, the
protection is gone! Hence relying on explicit copies is also a suboptimal
("flaky") safety measure. Due to that risk, reviews of I/O code often become
complex, lengthy and exhausting, making them rather expensive.

This is just one example. You could easily find many more in the JDK.

 

If the Java language had a means to mark arrays as "read-only" to the
compiler / JVM (just like it already has for final variables), there would
be no more need for an explicit copy.

Several benefits would arise from the fact that no copy of the array is
created (and removed) in turn:

* Speed is improved. While System.arraycopy() is quick, not calling it at
all is quicker.

* GC pressure is reduced. While it might be low already, not creating a copy
of an array makes it zero.

* Security by default. As the JVM would not permit writes to "read-only"
arrays, there is no harm when an explicit copy is omitted.

* Reduced memory consumption. No copy at all means literally zero additional
memory.

* Reduced power consumption. No power is spent on squandered CPU cycles.

* Easier programming. No need to remember to create explicit copies.

* Simpler code. No copies means no code to create them, making the remainder
simpler to understand.

* Quicker reviews. Reviewers do not have to check for compromised buffers, a
check which is easily forgotten.
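
As a rough analogy for the intended effect, and NOT as the proposed language
feature itself, the existing java.nio API already offers copy-free read-only
views via ByteBuffer.asReadOnlyBuffer(). The sketch below uses hypothetical
helper names (ReadOnlyViewDemo, readOnlyWindow, rejectsWrite); a read-only
array would give similar guarantees without changing method signatures from
byte[] to ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

// Hypothetical illustration only; an analogy using existing NIO buffers.
final class ReadOnlyViewDemo {

    /** Returns a read-only view of internal[off .. off+len) without copying. */
    static ByteBuffer readOnlyWindow(byte[] internal, int off, int len) {
        return ByteBuffer.wrap(internal, off, len).slice().asReadOnlyBuffer();
    }

    /** Returns true if the view rejected an attempted write. */
    static boolean rejectsWrite(ByteBuffer view) {
        try {
            view.put(0, (byte) 99);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] internal = {1, 2, 3, 4, 5};
        ByteBuffer view = readOnlyWindow(internal, 1, 3);
        System.out.println("visible bytes:   " + view.remaining());
        System.out.println("write rejected:  " + rejectsWrite(view));
        System.out.println("array reachable: " + view.hasArray());
    }
}
```

The receiver can neither write through the view nor obtain the backing array
from it, and no bytes were copied to achieve that.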

 

While each single effect might be small, keep in mind that all these effects
occur together, and apply massively each and every day, as arrays are
building blocks of the JDK.

 

To sum up, I'd like to propose adding a means to the Java language which
turns arrays into "read-only" arrays.



More information about the amber-dev mailing list