Towards better serialization
Stuart Marks
stuart.marks at oracle.com
Sat Jun 15 16:52:11 UTC 2019
On 6/13/19 9:24 AM, Kłeczek, Michał wrote:
> The whole premise of the proposal we are discussing is that convenience is the
> root of all evil.
Hi Michał,
This is an inaccurate characterization of the proposal.
Also, your earlier statements,
> What's more - it does not really address security concerns! ...
> The issue here is that we try to fix security problems in the wrong place. Almost all security issues with serialization are not really caused by serialization itself but by:
> - huge classpath with all libraries accessible to each other (ie. deserialization gadgets availability in classpath)
> - running applications with no SecurityManager (starting a JVM with no SecurityManager by default was the single biggest mistake Java designers made in the past IMHO)
Those are indeed security concerns, but you've overlooked an important and
fundamental class of security issues that are directly attributable to the way
serialization was designed and implemented in Java.
The line of reasoning about convenience in the proposal is not that convenience
itself is evil, but that in pursuit of convenience, the original design adopted
extralinguistic mechanisms to achieve it. This weakens some of the fundamentals
of the Java platform, and it has led directly to several bugs and security
holes, several of which I've fixed personally. Let me illustrate this with a
couple of examples.
First, consider the bug JDK-6896297 [1] which I fixed several years ago.
Briefly, the problem is that a test failed intermittently, throwing
ConcurrentModificationException. The class in question is thread-safe, and
locking is applied within all method calls. How could the CME occur?
The exception occurred when another thread took a snapshot of this object
periodically; the snapshot was performed using serialization. This object didn't
provide a writeObject() method, so the serialization mechanism "magically"
provided one that serialized the object using direct field access. This direct
access bypassed the locking protocol established by the rest of the class,
causing a race condition.
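
As a rough sketch of this kind of problem (this is not the actual JDK code; the Registry class and its members are hypothetical), consider a class whose public methods are all synchronized. Without an explicit writeObject(), default serialization reads the `names` field without holding the monitor, so a concurrent add() can race with the snapshot. Declaring a synchronized writeObject() is one way to bring serialization back under the class's locking protocol:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SnapshotDemo {
    /** Hypothetical thread-safe class: every public method holds the monitor. */
    static class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        private final List<String> names = new ArrayList<>();

        public synchronized void add(String name) { names.add(name); }
        public synchronized int size() { return names.size(); }

        // Without this method, default serialization would read 'names'
        // directly, without taking the lock that add() and size() use.
        private synchronized void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
        }
    }

    /** Takes a snapshot by serializing the object and reading it back. */
    static Registry snapshot(Registry r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Registry) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry r = new Registry();
        r.add("a");
        r.add("b");
        System.out.println(snapshot(r).size()); // prints 2
    }
}
```

The point of the sketch is that nothing in the source of add() or size() hints that a second, unlocked access path to the field exists unless writeObject() is written out explicitly.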
The code was in place for ten years before I fixed it. During that time
applications were potentially exposed to corrupted snapshots. In a sense, we
were lucky that a CME was thrown. If it hadn't been thrown, we might never have
noticed the problem.
The second issue concerns a whole class of security vulnerabilities that arise
because serialization bypasses some fundamental mechanisms of the language. I
won't describe the vulnerabilities in detail, but I'll illustrate the idea by
analogy to an old and well-known Java security bug.
As you know, String is immutable, and its methods have well-defined behavior.
Therefore, it's possible to write secure code that relies on these
characteristics, e.g. a String reference can be stored in a data structure
without making a defensive copy, because Strings are immutable.
It turns out that in early versions of Java [2] it was possible to load a
"spoof" version of java.lang.String and hand instances of the spoofed String to
sensitive code. It's likely that this code is relying on well-known, safe
characteristics of the "real" java.lang.String. However, the spoofed String
class could supply different behavior for its methods or mutate itself.
This is impossible to see by inspecting the secure code. The security bug
existed because the fundamental assumptions the secure code was making about the
type-safety of the platform were violated by the spoof class.
What does this have to do with serialization? Brian's proposal states that
serialization bypasses the constructors of serializable classes. Big deal, just
use readObject(), right?
No. If you look carefully at the Java specifications, you'll see that
constructors have a bunch of special characteristics. The sequence of steps
that occurs when an object is created is precise and well-defined. [3] There
are other characteristics of constructors as well (which one can find by digging
through the JLS) such as: an object isn't finalizable until after the Object()
constructor returns; writes to final fields in constructors happen-before reads
that occur outside the constructor; the compiler ensures that all final fields
of an object are definitely assigned through all paths through constructors and
initializers; field and instance initializers are executed in a well-defined
order; and so forth.
Deserializing an object bypasses the constructors, thus none of this applies to
objects created via deserialization.
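
A small sketch can make this concrete. The Account class below is hypothetical. Its constructor counts its own invocations, and a transient field has an initializer. For a serializable class whose superclass is Object, deserialization runs only the Object() constructor, so after a round trip the counter has not moved and the initializer has not run:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ConstructorBypassDemo {
    /** Hypothetical class used only to observe what deserialization skips. */
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0;

        final String owner;
        // Field initializer: executed as part of normal construction only.
        transient List<String> auditLog = new ArrayList<>();

        Account(String owner) {
            constructorCalls++;
            this.owner = owner;
        }
    }

    /** Serializes the object to a byte array and deserializes it again. */
    static Account roundTrip(Account a) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(a);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice"));
        System.out.println(Account.constructorCalls); // prints 1: the copy never ran the constructor
        System.out.println(copy.auditLog == null);    // prints true: the initializer never ran
        System.out.println(copy.owner);               // prints alice: the final field was set anyway
    }
}
```

Note that the final field `owner` ends up assigned in the copy even though no constructor of Account assigned it, which is exactly the kind of thing the language rules would otherwise forbid.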
What are the consequences of this? Briefly, it means that it's possible to
create objects that appear impossible to create. At least, they appear
impossible, if you're trying to assess the security of the code by inspecting
it. Such objects might have unknown and unexpected behaviors. Since the system's
security and correctness depend on well-defined behaviors, it means that all
bets are off. No matter how carefully you inspect code to try to ensure that
it's secure, if it's handling objects that can violate Java's fundamentals, you
can't guarantee anything.
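
Here is a minimal sketch of such an "impossible" object, using a hypothetical Range class whose only constructor rejects lo > hi. The forge() method serializes a valid instance and then patches one byte of the stream. It relies on the default serialized form writing primitive fields in name order ("hi", then "lo") as the last bytes of the stream, so the final four bytes are the big-endian value of `lo`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ImpossibleObjectDemo {
    /** Hypothetical class whose constructor enforces lo <= hi. */
    static class Range implements Serializable {
        private static final long serialVersionUID = 1L;
        final int hi;
        final int lo;

        Range(int lo, int hi) {
            if (lo > hi) throw new IllegalArgumentException("lo > hi");
            this.lo = lo;
            this.hi = hi;
        }
    }

    static byte[] serialize(Range r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Range deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Range) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a Range that the constructor would never allow. */
    static Range forge() {
        byte[] data = serialize(new Range(1, 5));
        // The last four bytes hold 'lo' (00 00 00 01); change it to 9.
        data[data.length - 1] = 9;
        return deserialize(data);
    }

    public static void main(String[] args) {
        Range r = forge();
        System.out.println(r.lo + " > " + r.hi); // prints 9 > 5
    }
}
```

Anyone reviewing code that receives a Range would reasonably conclude from the source that lo <= hi always holds. The forged instance shows that the conclusion is only valid for objects that actually went through the constructor.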
THIS is the point of the proposal. Bringing serialization into the realm of
well-defined language constructs, instead of using extralinguistic "magic"
mechanisms, is a huge step forward in improving quality and security of Java
programs.
s'marks
[1] https://bugs.openjdk.java.net/browse/JDK-6896297
[2] Vijay Saraswat. Java is not type-safe. 1997. Copy available at
https://www.cis.upenn.edu/~bcpierce/courses/629/papers/Saraswat-javabug.html
[3] https://docs.oracle.com/javase/specs/jls/se12/html/jls-12.html#jls-12.5