What's the status of / relation between "JEP 169: Value Objects" / "Value Types for Java" / "Object Layout"

Thu Jan 29 11:02:32 UTC 2015

> I just want to quickly summarize my
> current findings here and gently ask for feedback in case you think
> I've totally misunderstood something. Of course any comments and
> additional information is highly welcome as well.
I don't know if this is useful, but here is my point of view as a 
developer, oriented towards the question: "Which feature solves my 
problem?". It probably contains some or many errors, but it is another 
point of view (only mine).

I will not strictly follow the list of projects/proposals as the 
structure of my mail, because the content of each proposal is changing 
and that is not my target. I am oriented towards the final user, i.e. 
the developer consuming these projects, not the implementer working on 
each of them.

I prefer to split things into three scopes, following my perceived split 
of the job between developer and runtime. The problem is data, so what 
can the JVM/GC do with an object? I find two possibilities in this 
domain: move it or clone it.

If the JVM can clone an object, it can also move it, because the clone 
will not have the same address anyway. That gives the following three 
features:
---
1) JVM can clone and move objects (Project Valhalla):
Constraint: no complex constructor/no complex finalizer, because the 
lifecycle of the object is managed by the JVM (the JVM can clone, so it 
can create and destroy the object as it wants). Only a field-assignment 
constructor, possibly with a simple conversion of data format.
Constraint: immutable, because we don't know which clone is the right 
one when one is modified, and because modifying all clones 
simultaneously is slow/complex/parallel-unfriendly.
Constraint: non-null, because cloning a non-existing object is a 
non-existing problem.

Use-case "Performance": objects cloned to be closer to the execution 
silicon and for better parallelism (registers or cache of CPU/GPU)
- Runtime: expose features of CPU/GPU like SIMD (mostly like a modern 
version of javax.vecmath).
- Developer: create custom low-level structures for CPU/GPU parallel 
computing.
- Java language: small tuples, like complex numbers (immutable as a 
performance choice, like SIMD, to be close to the silicon; cloned at 
each pass by value).
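
A minimal sketch of what such a small tuple looks like in today's Java, 
written so that it respects the three constraints above (field-assignment 
constructor, immutable, never used as null). This is an ordinary final 
class, not Valhalla syntax; the name Complex and its methods are 
assumptions for illustration, and a current JVM will still allocate it on 
the heap unless escape analysis removes the allocation.

```java
// Hypothetical value-like class: final, immutable, compared by state only.
final class Complex {
    final double re;
    final double im;

    // Field-assignment constructor only, no other logic.
    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Operations return a new value instead of mutating this one.
    Complex plus(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }

    Complex times(Complex other) {
        return new Complex(re * other.re - im * other.im,
                           re * other.im + im * other.re);
    }

    // Equality depends on the fields only, so any copy is as good as another.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Complex)) return false;
        Complex c = (Complex) o;
        return Double.compare(re, c.re) == 0 && Double.compare(im, c.im) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Double.hashCode(re) + Double.hashCode(im);
    }
}
```

Because nothing in the class depends on object identity, the runtime 
would be free to copy it into registers or flatten it into arrays.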

Use-case "Language": objects cloned to be closer to registers (on the 
stack, so fewer allocations in the heap; simpler than escape analysis)
- Java language: multiple return values from a method (immutable because 
it is a result; cloned, for example, at the return of each delegate, or 
not even created when stack-only).
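
As an illustration of the multiple-return-values case, here is a sketch 
with today's Java. The classes DivMod and MathUtil are hypothetical names; 
the point is only that the result holder is immutable and has no use for 
identity, so a value-aware JVM could pass it back in registers instead of 
allocating it.

```java
// Hypothetical immutable holder for two results of one computation.
final class DivMod {
    final int quotient;
    final int remainder;

    DivMod(int quotient, int remainder) {
        this.quotient = quotient;
        this.remainder = remainder;
    }
}

final class MathUtil {
    // Returns floor quotient and floor remainder together.
    static DivMod divMod(int a, int b) {
        return new DivMod(Math.floorDiv(a, b), Math.floorMod(a, b));
    }
}
```

A caller reads both results from one return value, for example 
MathUtil.divMod(7, 2).quotient and MathUtil.divMod(7, 2).remainder.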

Use-case "Efficiency": other immutable non-null objects that could 
benefit from reduced indirection/better cache behaviour, given 
specialization of collection classes
- Database: primary key for a Map (like HashMap)/B-Tree (like MapDB)/SQL 
(like JPA). A primary key is immutable and non-null by choice of the 
developer, hence possible gains.
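
A sketch of such a primary key used with HashMap in current Java. The 
names OrderLineKey and OrderLineIndex are assumptions for illustration. 
Today each key is a separate heap object reached through a pointer; with 
value types and specialized collections, the two fields could in principle 
be stored directly in the map's table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite primary key: immutable, never null, state-based equality.
final class OrderLineKey {
    final long orderId;
    final int lineNumber;

    OrderLineKey(long orderId, int lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OrderLineKey)) return false;
        OrderLineKey k = (OrderLineKey) o;
        return orderId == k.orderId && lineNumber == k.lineNumber;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(orderId) + lineNumber;
    }
}

// Hypothetical index keyed by the composite key above.
final class OrderLineIndex {
    private final Map<OrderLineKey, String> products = new HashMap<>();

    void put(long orderId, int lineNumber, String product) {
        products.put(new OrderLineKey(orderId, lineNumber), product);
    }

    // A fresh key instance finds the entry because equality ignores identity.
    String get(long orderId, int lineNumber) {
        return products.get(new OrderLineKey(orderId, lineNumber));
    }
}
```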
---
2) JVM can move but not clone objects

This is the current state of Java objects:
Constraint: the developer needs to define the lifecycle in the object 
itself (constructor/finalizer), with the finalizer triggered by the GC, 
as in a current Java class.
Constraint: small object, because when the GC moves a big object, there 
may be a noticeable latency.
Constraint: directly usable only from Java code (because native code 
needs an indirection level to find the real address of the object, 
which changes after each move).

Improvement by adding custom layout for objects (Project Panama on heap 
/ ObjectLayout):
Specific constraint: objects which are nearly identity-less, i.e. only 
one other object (the owner) knows their identity/holds a pointer to 
them.
Non-constraint: applicable to all object types, contrary to Project 
Valhalla. Applicable to complex constructors, because a complex 
constructor can be inlined in the owner code where it is called. 
Applicable to mutable objects, because there is no cloning and so no 
inconsistency. Applicable to nullable objects only by adding a boolean 
field to the custom layout to store whether the inlined object exists, 
and by updating the code that tests for null to use this boolean.
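
The boolean-flag idea for nullable inlined objects can be sketched by 
hand in today's Java. This is only what a custom layout might do 
automatically; the class Window and its optional "position" sub-object are 
hypothetical names chosen for the example.

```java
// Hypothetical owner whose optional sub-object "Position" is flattened by hand.
final class Window {
    private boolean hasPosition; // replaces the test "position != null"
    private int posX;            // inlined field of the sub-object
    private int posY;            // inlined field of the sub-object

    // Replaces "position = new Position(x, y)"; mutation in place is fine,
    // because there is exactly one copy, held by the owner.
    void setPosition(int x, int y) {
        posX = x;
        posY = y;
        hasPosition = true;
    }

    // Replaces "position = null".
    void clearPosition() {
        hasPosition = false;
    }

    boolean hasPosition() {
        return hasPosition;
    }

    // Replaces "position.x", including the failure when nothing is there.
    int x() {
        if (!hasPosition) throw new IllegalStateException("no position");
        return posX;
    }

    int y() {
        if (!hasPosition) throw new IllegalStateException("no position");
        return posY;
    }
}
```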

Use-case "General efficiency": custom layout (inline a sub-object in the 
object owning it):
- Reduce memory use: fewer objects, so fewer headers and fewer pointers.
- Improve cache performance with better locality (inlined objects sit 
next to the owner's fields, often in the same cache line, so there is no 
reference to follow).
- Applicable to many fields containing a reference; it only requires the 
referenced object to be invisible to all objects except one (the owner).

For example, a private field containing an internal ArrayList (without 
getter/setter) can probably be replaced by the integer containing the 
used size and the reference to the backing array, with inlining of the 
few ArrayList methods really used.
It probably needs to be driven by the developer after real profiling, to 
find the best ratio between efficiency and code expansion. It will 
probably have many more use-cases when AOT is available and precisely 
manageable by the developer (Jigsaw???), because most of the slow work 
of object-code inlining and the following optimizations can be done at 
AOT time, while the gains come at run time.
Probably useful for the hottest code (JIT after this pre-optimization at 
AOT time) and clearly bad for the coldest code (interpreted, so avoid 
code expansion), but very useful for the big quantity of code in 
between, which will gain from AOT if complex optimizations are 
available. This will very probably require developer 
help/instructions/annotations using profiler data obtained from 
functional tests of the application.
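
The ArrayList replacement described above can be sketched by hand as 
follows. EventLog is a hypothetical owner class, and only the two list 
operations it "really uses" (add and get) are inlined. Note that the 
backing array is still a separate object; only the ArrayList header and 
one pointer hop disappear.

```java
import java.util.Arrays;

// Hypothetical owner that used to hold: private final ArrayList<String> entries.
final class EventLog {
    private String[] entries = new String[4]; // former backing array of the list
    private int size;                         // former "size" field of the list

    // Inlined equivalent of ArrayList.add(e), with growth by doubling.
    void add(String entry) {
        if (size == entries.length) {
            entries = Arrays.copyOf(entries, size * 2);
        }
        entries[size++] = entry;
    }

    // Inlined equivalent of ArrayList.get(i), keeping the bounds check.
    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return entries[index];
    }

    int size() {
        return size;
    }
}
```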
---
3) JVM cannot move or clone objects (Project Panama off heap / 
PackedObjects)
Constraint: the developer needs to manage the full lifecycle of the 
object externally and must choose when to create or destroy it. The 
object is off-heap and a handle is on-heap for managing the off-heap 
part.
Constraint: potential fragmentation of free memory when frequently 
creating and removing objects that do not have the same size (paying 
attention to object size vs. page size is probably important).
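
A minimal sketch of both constraints with what exists in Java today: one 
direct ByteBuffer as the on-heap handle to off-heap memory, cut into 
fixed-size slots that the developer allocates and frees explicitly. 
OffHeapSlotPool is a hypothetical class, not a Panama or PackedObjects 
API. Using one slot size avoids the fragmentation mentioned above, and the 
sketch does not detect a double free.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical fixed-size slot pool over off-heap memory.
final class OffHeapSlotPool {
    private final ByteBuffer memory; // on-heap handle, off-heap storage
    private final int slotSize;
    private final ArrayDeque<Integer> freeSlots = new ArrayDeque<>();

    OffHeapSlotPool(int slotSize, int slotCount) {
        this.slotSize = slotSize;
        this.memory = ByteBuffer.allocateDirect(slotSize * slotCount);
        for (int i = 0; i < slotCount; i++) {
            freeSlots.add(i);
        }
    }

    // "Create": the developer decides when a slot starts being used.
    int allocate() {
        Integer slot = freeSlots.poll();
        if (slot == null) throw new IllegalStateException("pool exhausted");
        return slot;
    }

    // "Destroy": the developer decides when the slot can be reused.
    void free(int slot) {
        freeSlots.push(slot);
    }

    // Absolute reads and writes; the data never moves, the GC never scans it.
    void putLong(int slot, int offset, long value) {
        memory.putLong(slot * slotSize + offset, value);
    }

    long getLong(int slot, int offset) {
        return memory.getLong(slot * slotSize + offset);
    }
}
```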

Use-case "GC Latency": big data structures that induce GC latency when 
moved if stored in the heap
- All big chunks of data, like Big Data or textures in games, etc.
- A small number of objects, to stay manageable explicitly by the 
developer (without too much work).

Use-case "Native": communicate with native library
- Modern version of JNI

Only my 2 cents,
Daniel.