Improve Java serialization with APPCDS (as a fast path)

Ioi Lam ioi.lam at oracle.com
Thu May 24 04:17:00 UTC 2018


Hi Mingyu,

I think this is a very interesting idea. Are you thinking about 
serializing/deserializing with the same JVM on the same host, or with 
different JVMs across the network?

Currently different JVM installations can have different CDS archives. 
You can even use different archives for the same JVM by running with

$JAVA_HOME/bin/java -XX:SharedArchiveFile=/tmp/a.jsa
$JAVA_HOME/bin/java -XX:SharedArchiveFile=/tmp/b.jsa

So there's no guarantee that the same class will have the same ID in 
these two JVM processes.
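
To make the concern concrete, here is a minimal sketch. It assumes IDs are 
assigned by position in the class list, as in the proposal quoted below. 
The two class lists are hypothetical stand-ins for whatever was dumped 
into /tmp/a.jsa and /tmp/b.jsa, and assignIds is an illustrative helper, 
not an existing JVM API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ClassListIds {
    // Assign each class an ID equal to its position in the class list.
    // Duplicate entries keep the ID of their first occurrence.
    static Map<String, Integer> assignIds(List<String> classList) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (String name : classList) {
            ids.putIfAbsent(name, ids.size());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical class lists behind two different archives.
        List<String> listA = List.of(
                "java/lang/Object", "java/lang/String", "java/util/HashMap");
        List<String> listB = List.of(
                "java/lang/Object", "java/util/HashMap", "java/lang/String");

        Map<String, Integer> a = assignIds(listA);
        Map<String, Integer> b = assignIds(listB);

        // The same class gets a different ID under each archive.
        System.out.println("a.jsa: java/lang/String -> " + a.get("java/lang/String"));
        System.out.println("b.jsa: java/lang/String -> " + b.get("java/lang/String"));
    }
}
```

A writer using the first list and a reader using the second would 
disagree on which class a given ID refers to, so both sides would need 
some way to confirm they are running with the same archive.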

Thanks
- Ioi


On 5/23/18 7:55 PM, Mingyu Wu wrote:
> Hi all,
>
> APPCDS is a very interesting optimization aiming at reducing memory
> footprint and class loading overhead for multiple Java processes. In my
> opinion, it can also be used in other scopes such as serialization.
>
> Currently, the Java serializer is slow and produces a large footprint
> (compared to application-level serializers). A major problem is that the
> serializer must write class descriptions into the serialized bytes, which
> increases the total memory consumption.
>
> On the other hand, application-level (or 3rd-party) serializers like Kryo
> can reduce the memory footprint by requiring users to assign IDs to certain
> classes manually.
> This assignment step must be done very carefully to avoid
> inconsistencies among different JVMs, so application-level
> serializers are not that easy to use.
>
> However, we can actually borrow the idea from application-level serializers
> with APPCDS (or even CDS) enabled. Suppose we have already dumped the class
> list below:
> java/lang/Object
> java/lang/String
> ......
>
> We can assign IDs directly to those classes according to the order in the
> class list:
> java/lang/Object 0
> java/lang/String 1
> ......
>
> Since multiple JVMs will share the same APPCDS archive corresponding to
> the class list, those JVMs can directly use IDs to serialize/deserialize
> the classes stored in the archive. This avoids writing class descriptions
> into serialization bytes and simplifies the serialization/deserialization
> phase. Furthermore, it also saves users from manually assigning IDs to
> classes.
>
> Note that APPCDS only provides a fast path for ser/deser. If a class is not
> on the class list (and the archive), the serializer falls back to class
> description. However, the fast path can become more efficient with more
> advanced features, such as supporting custom classloaders.
>
> Anyway, I think APPCDS is a good fit to improve Java serialization.
>
> I am willing to take suggestions!
>
> Mingyu
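
The fast path and fallback described in the quoted message could be 
sketched roughly as follows. This is only an illustration under stated 
assumptions: the tag byte, the 4-byte ID, and the use of the class name 
as a stand-in for a full class description are all invented here and are 
not the real Java serialization stream format. ClassIdCodec and its 
methods are hypothetical names, and the class list is passed in directly 
instead of being read from an archive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClassIdCodec {
    private static final byte BY_ID = 0;   // fast path: class is in the shared list
    private static final byte BY_NAME = 1; // fallback: class is not in the list

    private final List<String> names;
    private final Map<String, Integer> ids = new HashMap<>();

    // Both writer and reader must be built from the same class list.
    ClassIdCodec(List<String> classList) {
        names = new ArrayList<>(classList);
        for (int i = 0; i < names.size(); i++) {
            ids.putIfAbsent(names.get(i), i);
        }
    }

    // Encode a class reference: a tag plus an ID if known, else a tag plus the name.
    byte[] encode(String className) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            Integer id = ids.get(className);
            if (id != null) {
                out.writeByte(BY_ID);
                out.writeInt(id);
            } else {
                out.writeByte(BY_NAME);
                out.writeUTF(className); // stand-in for a full class description
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decode a class reference produced by encode() with the same class list.
    String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte tag = in.readByte();
            return tag == BY_ID ? names.get(in.readInt()) : in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassIdCodec codec = new ClassIdCodec(
                List.of("java/lang/Object", "java/lang/String"));
        byte[] fast = codec.encode("java/lang/String"); // in the list
        byte[] slow = codec.encode("com/example/Foo");  // not in the list
        System.out.println(codec.decode(fast) + " took " + fast.length + " bytes");
        System.out.println(codec.decode(slow) + " took " + slow.length + " bytes");
    }
}
```

In this sketch a listed class always costs 5 bytes (tag plus int), while 
an unlisted one costs at least its name length plus 3 bytes. A real class 
description would be considerably larger than just the name.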
