RFC (round 1), JEP draft: Low-level Object layout introspection methods
Erik Österlund
erik.osterlund at oracle.com
Mon Aug 17 15:13:47 UTC 2020
Hi Aleksey,
I think the way this addressOf API (and others) is used is really a key
factor. You have a question that you want answers to, and addressOf can
help you figure out the answer. But
knowing what the question is seems crucial for what form the API should
take to answer that question. And I don't think I understand well enough
how these low-level APIs are
*really* intended to be used. What are the actual high level questions
we want answered? I read some use cases in the JEP description, but I
don't see why either addresses or
offsets have to be exposed to answer the actual high level questions.
This problem seems strikingly similar to that of measuring time. Let's
say you would like to measure how long it took to run your micro
benchmark and you need an API to do
that. The most obvious solution is to expose an API that tells you how
much time has passed since some reference point. This allows you to
measure a start time and an end time,
and compute the difference. Excellent.
Except, now as a provider of this API, you have to go through a world of
trouble to deal with various things like monotonicity of time counters
on different levels in the stack.
Because the implicit expectation is that surely time never goes
backward. Except when it does because one socket is hotter than another
and threads migrate between sockets and
what not, and now we have to hide that from the unknowing user. To
ensure monotonicity of the time stamps, you gotta put in stuff in the
whole tech stack (hardware, hypervisor,
VM, OS, JVM, etc) and deal with many problems.
An alternative API to get the same job done for this use case
would be a timer that you can start and stop, and that then returns
the duration. It would hide the details
about time stamps and encapsulate how to reason about them with a check
that if the presumed duration between internal time stamps is negative,
return zero. Voila - monotonic time
measurements.
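To make the timer idea concrete, here is a minimal sketch in Java. ClampedTimer and its methods are made-up names for illustration, not an existing API, and the clock is injected only so that a backward step can be simulated:

```java
import java.util.function.LongSupplier;

/** Hypothetical timer that hides raw time stamps; not an existing JDK API. */
final class ClampedTimer {
    private final LongSupplier clock; // source of raw, possibly non-monotonic ticks
    private long startTicks;

    ClampedTimer(LongSupplier clock) {
        this.clock = clock;
    }

    void start() {
        startTicks = clock.getAsLong();
    }

    /** Returns the ticks elapsed since start(), never less than zero. */
    long stop() {
        long elapsed = clock.getAsLong() - startTicks;
        return Math.max(0L, elapsed);
    }
}

final class ClampedTimerDemo {
    public static void main(String[] args) {
        // Simulated counter that steps backward, e.g. after a thread migrates between sockets.
        long[] readings = {1_000L, 990L};
        int[] next = {0};
        ClampedTimer timer = new ClampedTimer(() -> readings[next[0]++]);
        timer.start();
        System.out.println(timer.stop()); // prints 0 rather than -10
    }
}
```

The user never sees the two raw readings, so there is nothing for them to misinterpret.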
In a very similar way, if what you wish to do is to measure the distance
in bytes between two objects in order to measure some sense of locality
(I think you see where I am going
with this), then the obvious API to do that is to expose the current
address of an object. Then you can manually compute the distance by
taking the address of one object minus
the address of the other object. Except the user code does not have
explicit control of scheduling of safepoints between the two points of
measurement. Therefore, the result of
computing the difference might lead to some similarly surprising
results, including but not limited to:
* The computed distance between o1 and o2 is zero bytes, but it is not
the same object. Impossible! Except when it isn't.
* The computed distance between o1 and o2 is non-zero bytes, but it is
the same object. Impossible! Except when it isn't.
* The computed distance between o1 and o2 does not reflect the actual
distance between o1 and o2 at any snapshot of time. Impossible! Except
when it isn't.
Again we have exposed something slippery that is subject to a lot of
implementation weirdness, like a time stamp, and left it to the user of the API
to know how to interpret relative differences,
hoping they won't get surprised by potential implementation artifacts
(like relocation, slippery safepoints, pointer tagging, pointer
compression schemes, etc).
Perhaps another way of answering the same question without the
addressOf API is to have an estimatedDistanceInBytes(Object o1, Object
o2) or even an
estimatedDistanceInBytes(Object o1, Field f1, Object o2, Field f2) API.
This API could run in a mode where there are no
safepoints, and ensure that all of the above mentioned "impossible"
situations actually remain impossible, and hence more effectively
answer the high-level question.
It would also importantly never expose any addresses or offsets, while
still allowing various locality heuristics to be computed by performance
people.
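A toy model of the difference, in Java. Everything here is simulated: ToyHeap, place() and the addresses are hypothetical stand-ins, the synchronized methods stand in for "no safepoint in between", and returning an absolute value from estimatedDistanceInBytes is my assumption rather than part of any proposal:

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of a moving heap; addresses and relocation are simulated, not real JVM state. */
final class ToyHeap {
    private final Map<Object, Long> addresses = new IdentityHashMap<>();

    /** Places or relocates an object; stands in for allocation and for GC moving it. */
    synchronized void place(Object o, long address) {
        addresses.put(o, address);
    }

    /** Low-level style: one raw address per call; objects may move between calls. */
    synchronized long addressOf(Object o) {
        return addresses.get(o);
    }

    /** High-level style: both locations are read in one step, so no move can interleave. */
    synchronized long estimatedDistanceInBytes(Object o1, Object o2) {
        return Math.abs(addresses.get(o1) - addresses.get(o2));
    }
}

final class ToyHeapDemo {
    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Object o1 = new Object();
        Object o2 = new Object();
        heap.place(o1, 0x1000L);
        heap.place(o2, 0x2000L);

        long a1 = heap.addressOf(o1);
        // Simulated relocation between the two reads: o1 moves away, o2 takes its old place.
        heap.place(o1, 0x3000L);
        heap.place(o2, 0x1000L);
        long a2 = heap.addressOf(o2);

        System.out.println(a2 - a1);                               // prints 0 for two distinct objects
        System.out.println(heap.estimatedDistanceInBytes(o1, o2)); // prints 8192
    }
}
```

The subtraction done by the caller yields a distance that never existed at any point in time, while the single call reports one that did.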
As for the other question - what is the cache line alignment of this
object - a similar API closer to the high level question could be built,
like: alignmentInBytes(Object o1, int alignment),
where alignment is a power of 2 up to a "reasonable" size that allows
you to answer all the questions you had about the object layout, without
exposing its address. Although we might want
to think a bit more about this one. We don't want to hardcode
assumptions that the alignmentInBytes of an object is where its oop is
pointing at a cache line. There are no oops in the user
model. If you for example consider an alternative JVM implementation
like Jikes RVM, then the object pointer is at an offset into the object
payload (in fact where an array payload would
start). This was done to reduce the instructions needed for RISC
processors to perform array element addressing. This reinforces that a
JVM implementation might choose to have their object
pointers point at different offsets either before or after the payload
of where its memory cell begin. In the previous Shenandoah design
(before the LRB), it would for example point one
word into the payload of the cell. And that would still be before the
payload of the object - a rather arbitrary point. So we would have to
have some kind of consensus about what offset
into the payload to expose the alignment of. Where the fields start?
Once again, the nature of the question becomes very relevant. Because is
the root question really "what is the cache
line alignment of my object", or is it "what is the cache line alignment
of the field foo", which might be better represented with
alignmentInBytes(Object o1, Field field, int alignment).
That would allow a possibly less sensitive and more heuristic
piece of information to leave the JVM. But then why are you asking what
the alignment of the field is really?
Is there yet a more high level question we are trying to answer, such as
"are fields foo and bar on the same cache line in object o?". Now at this
high level, we are getting to a point where
we could expose just a boolean that is heuristic in nature. This would
allow covering up many conceivable implementation choices from the user of
the API.
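For illustration, the arithmetic the JVM could do internally to answer such questions is small. This Java sketch works on plain long locations that would never be handed to the user; LayoutQueries, offsetWithin and sameCacheLine are hypothetical names, and reading "alignment" as "offset within an aligned block" is just one plausible interpretation:

```java
/** Sketch of the arithmetic behind the higher-level queries; all names are hypothetical. */
final class LayoutQueries {
    private LayoutQueries() {}

    /** Offset of a location within an aligned block; alignment must be a power of two. */
    static long offsetWithin(long location, int alignment) {
        if (alignment <= 0 || Integer.bitCount(alignment) != 1) {
            throw new IllegalArgumentException("alignment must be a power of two");
        }
        return location & (alignment - 1L);
    }

    /** True if both locations fall within the same cache line of the given size. */
    static boolean sameCacheLine(long loc1, long loc2, int lineSize) {
        long base1 = loc1 - offsetWithin(loc1, lineSize);
        long base2 = loc2 - offsetWithin(loc2, lineSize);
        return base1 == base2;
    }
}

final class LayoutQueriesDemo {
    public static void main(String[] args) {
        // Made-up field locations, assuming 64-byte cache lines.
        System.out.println(LayoutQueries.offsetWithin(0x1038L, 64));           // prints 56
        System.out.println(LayoutQueries.sameCacheLine(0x1008L, 0x1038L, 64)); // prints true
        System.out.println(LayoutQueries.sameCacheLine(0x1038L, 0x1040L, 64)); // prints false
    }
}
```

Only the boolean (or the small offset) would leave the JVM; the locations themselves stay hidden.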
I, like Brian, am not a big fan of exposing the very concept that fields
have offsets or that objects have addresses for that matter. For me, the
exposed user model is that objects don't
have addresses, they are logical concepts composed of a mapping to
"memory cells" that may or may not be 1:1. As you know better than most
people, there are GC designs that do not have
a to-space invariant. Like Shenandoah a few years ago, and seemingly the
new Alibaba Platinum GC. There are also GC designs where fields are not
offsets, like Schism and Jamaica VM in
the literature, that split objects into multiple fixed sized memory
cells, to combat fragmentation without the need for compaction.
I think that we want to maintain as much flexibility as we can for
future GC algorithms to thrive on the Java platform, and not make the
same mistake that languages like Go did by
exposing the address of the objects, and hence forever closing the door
to moving garbage collection for the Go platform. I hope you understand
where I am coming from as a GC maintainer.
TL;DR: Exposing low level details about object layout that require
the user to know implementation details, such as the very concept that an
object is associated with an address, or
that fields are associated with an offset, ought to be the last possible
resort. And I have a feeling that if we look more closely at the high
level questions you really want answers for
with the proposed low-level API, we might be able to design more
high-level APIs, closer to the original questions, that might both be
more effective at answering such questions without
strange implementation anomalies (which arise from leaving the measuring between
different relative points of data to be done in an uncontrolled fashion by
users, instead of letting the JVM know in
the API what you are really asking), and hide more implementation
details (addresses, offsets) we don't want to expose, at the same time.
I have heard two high-level questions that addressOf is proposed to answer:
* What is the byte distance between o1 and o2?
* What is the alignment of o1?
As mentioned, zooming out even one more step, are we asking these
questions as a means to an end, or because we have another even more
high level question? Perhaps if we get to the root
of what we would like to find out, we might never have to expose either
offsets of fields or addresses of objects. Because why do you need them
if not to answer an even more high level
question about the layout of an object, e.g. "is foo of obj1 and bar of
obj2 on the same cache line" or "what is the byte distance between foo
of obj1 and bar of obj2"? When I read your
JEP description, it seems like those are indeed the kind of high level
questions we want answered really. And for that, I think a much more
high level API would be more appropriate.
What do you think?
Thanks,
/Erik
On 2020-08-17 14:33, Aleksey Shipilev wrote:
> On 8/16/20 12:41 PM, Peter Levart wrote:
>> On 8/11/20 12:22 PM, Aleksey Shipilev wrote:
>>> ...but dislike:
>>> public static long addressOf(Object obj);
>>> public static long fieldOffsetOf(Field field);
>> What exactly is the purpose of "addressOf" method in terms of
>> information API? Is it used to estimate relative placement of several
>> objects in the heap to see how they are scattered around which affects
>> the CPU cache performance when accessing them?
> Yes, it says so in "Motivation" section in JEP. Additionally, checking the object address against
> the cache line size.
>
>> If this is the case, then maybe the method could return a "mangled"
>> address: the address + some secret random value calculated once for the
>> whole VM.
> Now that is an interesting suggestion!
>
> Implemented here:
> https://hg.openjdk.java.net/jdk/sandbox/rev/248807bfa78e
>
> There is little-to-none loss of performance, because the offset can be trivially used in intrinsics.
> JEP text is updated to mention this technique. I believe this makes the address exposure story less
> problematic, although the result is still conceptually a useful proxy for a memory location.
>