RFC (round 1), JEP draft: Low-level Object layout introspection methods
Brian Goetz
brian.goetz at oracle.com
Fri Aug 7 19:53:28 UTC 2020
> I would like to solicit early feedback on "Low-level Object layout
> introspection methods" JEP:
> https://openjdk.java.net/jeps/8249196
I enjoyed reading this JEP draft; clearly a lot of thought has gone into
the technical details. I find much to like, and some to dislike, in
this proposal.
First, the good news. I am sympathetic to the needs that certain
libraries have to assess memory usage of objects, such as for cache
eviction. While at some level this is an impossible problem (how do we
know whether an object reference buried in a graph is aliased or not?),
providing users with better ways of measuring cost than simply counting
cache entries is a noble cause, and the work you've done on
`estimateSize` definitely is a step forward.
I am also sympathetic to the desire to better understand layout, for
purposes of optimizing footprint ("cool, I can stash this field in the
alignment shadow") and cache efficiency ("how do I get these fields so
they are more likely on the same cache line.") While such feedback can
never be relied on 100%, I agree it can be useful to have this
information so that one can make more accurate estimates of cost.
Now, the bad news: I have deep objections to several of the sub-features
in this JEP (unfortunately some of them even overlap with the second
point above.) Specifically, I have deep misgivings about exposing field
offsets, and deeper misgivings about exposing object addresses.
For context, remember that we are in the middle of a deliberate,
decade-long transition away from Unsafe. We knew that we couldn't just
turn off Unsafe immediately, but we also knew we had to wean people from
Unsafe -- and not only from the specific class, but from some of the
concepts. And of course we can't do so without providing good-enough
replacements for at least some of the use cases, so it will necessarily
take time. (Reasonable people can reasonably disagree on the meaning of
"good enough" and which use cases, but that's a separate discussion.)
The down payment on this plan was VarHandles; a key goal here was to
obviate the need for using Unsafe to access data in the Java heap.
Naturally it took some time to flesh out the feature set, shake out the
performance, and adapt the JDK code to use it, but the goal is clear --
there should be no excuse for using Unsafe to access the Java heap at
all. And we're well on the way.
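
To make the VarHandle point concrete, here is a minimal sketch of the
kind of atomic field access that used to be done with an Unsafe field
offset. The class name `VarHandleCounter` and its `count` field are
hypothetical and exist only for illustration; the VarHandle calls are
the standard `java.lang.invoke` API available since Java 9.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VarHandleCounter {
    // Hypothetical field, used only for illustration.
    private volatile int count;

    // The handle names the field symbolically; no offset is ever exposed.
    private static final VarHandle COUNT;
    static {
        try {
            COUNT = MethodHandles.lookup()
                    .findVarHandle(VarHandleCounter.class, "count", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Atomically adds one and returns the previous value. */
    public int getAndIncrement() {
        return (int) COUNT.getAndAdd(this, 1);
    }

    /** Atomically sets the value to updated if it currently equals expected. */
    public boolean compareAndSet(int expected, int updated) {
        return COUNT.compareAndSet(this, expected, updated);
    }

    /** Reads the field with volatile semantics. */
    public int get() {
        return (int) COUNT.getVolatile(this);
    }

    public static void main(String[] args) {
        VarHandleCounter c = new VarHandleCounter();
        c.getAndIncrement();      // count: 0 -> 1
        c.compareAndSet(1, 10);   // count: 1 -> 10
        System.out.println(c.get());
    }
}
```

Nothing in this code knows or assumes where `count` lives inside the
object, so the VM remains free to lay the object out however it likes.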
People still use Unsafe to access off-heap memory, but we're working on
a plan there too -- the Panama foreign memory access API. This is less
mature than VarHandles, but the goal is similar; by the time we're done,
there should be no excuse to use Unsafe for access to memory at all,
ever, because there will be better, safer, supported APIs that do the
same thing with comparable performance. (Similarly, Panama has a notion
of Layout, and, over time, it might become possible to
get a Layout for a Java object, and access it through the safer Panama
APIs.)
The very model that Unsafe assumes for on-heap data (and therefore
encourages users to assume) -- that a field offset is a fixed thing that
can be relied on -- undermines the VM's ability to optimize. As just one
example, the VM could dynamically redo layout based on profiling data
(putting frequently-accessed-together fields so they are in the same
cache line) and rewrite existing instances during GC. In a world where
even one user can have Unsafe, then _no one_ can ever have this
optimization. This seems a bad trade! Unfortunately the proposal to
include field offset and address information in the API grabs the ball
and runs not only to the opposite goal line, but into the next stadium
-- by elevating this model from "something you rely on at your own risk"
to "something we promise Java will be constrained by until the end of
time." No thank you! And it seems especially bad to pour fresh
concrete on "no one can have this ever" when we're five years into
jackhammering away the old dam.
In fact, the next methods I would like to _remove_ from Unsafe are the
get-field-offset and get-object-address methods. You might reply you
are enabling this goal, but that would be mistaking the letter for the
spirit: I want people to stop assuming that fields HAVE an offset, or
that Java objects HAVE an address. And, what else would someone do with
the address and offset, other than provide it to Unsafe (or to the
native equivalent)? This is propping up the wrong model, and runs
counter to the long-term direction we've been heading in for years, and
are already halfway along.
So, while there are some things in this JEP that seem quite cool and
which I enthusiastically support, there are some fundamental assumptions
-- that field offsets and object addresses should be accessible to Java
code -- that I disagree with.
Setting aside philosophy and stewardship for a moment, I think what's
really going on is that this JEP is trying to serve two different
audiences: those that are happy to have estimates and/or use the
information for making approximate decisions, and those who are
interested in low-level hackery for accessing the Java heap, rather than
through `getfield` and `putfield`. I think if you were to split this
into two JEPs, you would find enthusiastic support for the first, and
strong resistance to the second.
To the extent that there is overlap between these groups (e.g.,
fieldOffsetOf is useful to offline tools that may be used to estimate
layouts, as well as online users who would be tempted to believe that
this is really a field offset), some creativity will be required to keep
the first from becoming an attractive nuisance to the second.
So, my feedback is "glass half full" -- I'm happy to see the JEP that
provides estimates and such. My recommendation is to focus on that half
for now.
Cheers,
-Brian