Question/Extension proposal: references to off-heap objects and support for multiple heaps

Krystal Mok rednaxelafx at gmail.com
Thu Jul 26 15:25:02 UTC 2012


Hi Leo,

You may be interested in this:
http://www.slideshare.net/RednaxelaFX/jvm-taobao (Case 1, page 13-27)

While I was working on Taobao's JVM team, I took part in an experimental
project called "GCIH", which stands for "GC Invisible Heap". It is still
being developed by that team. You could contact Joseph <changren at taobao.com>
for more details.
The basic concept is pretty similar to your proposed off-heap storage. Even
though it doesn't support direct allocation in this space, it could be
modified to do so.
We made deep modifications to the HotSpot VM to implement these features. As
you stated, such a feature is unlikely to be implementable without modifying
the internals of the VM, at least with the current standard APIs.

GCIH as it is now is by no means "complete". We're using it in very
specific scenarios, with some invariants on the use of objects within the
GCIH space that plain Java doesn't enforce. To be "complete" for general
use, a lot more has to be done, and the (development-) cost / (runtime-)
benefit ratio might reach a range where we would reconsider whether it's
worth it.

Regards,
Kris

On Thu, Jul 26, 2012 at 10:15 PM, Leo Romanoff <romixlev at yahoo.com> wrote:

>
> Hi,
>
> I have the following question about GC, which is probably a bit unorthodox
> ...  It is an extension proposal. I tried to find something in the mailing
> list archives or on the Internet, but couldn't find anything related.
> Therefore I decided to ask on this mailing list, because people here are
> the ultimate experts on the JVM's GC mechanisms.
>
> I'm working on an object cache project a-la Terracotta BigMemory, which
> makes use of the off-heap storage. The usual approach with such off-heap
> solutions is that you first have to serialize objects and then put them as
> byte arrays into off-heap memory or direct buffers. This works, but has
> quite some drawbacks, e.g. a significant overhead due to serialization and
> deserialization, inability to work with off-heap representations as with
> usual objects, etc.
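
The usual serialize-then-copy approach described above can be sketched
roughly as follows. This is a minimal illustration using only standard
JDK classes; the class name OffHeapSerializedStore and its put/get
methods are hypothetical and not taken from any of the products
mentioned. Note that every access pays for a full serialization or
deserialization, which is the overhead the message refers to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class OffHeapSerializedStore {

    // Serializes the value on the heap, then copies the bytes into a direct
    // (off-heap) buffer. Only the byte representation lives off-heap.
    static ByteBuffer put(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] raw = bytes.toByteArray();
        ByteBuffer direct = ByteBuffer.allocateDirect(raw.length);
        direct.put(raw);
        direct.flip(); // make the written bytes readable
        return direct;
    }

    // Copies the bytes back onto the heap and deserializes a fresh object.
    static Object get(ByteBuffer direct) {
        byte[] raw = new byte[direct.remaining()];
        direct.duplicate().get(raw); // duplicate() keeps the caller's position intact
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The object returned by get is a new on-heap copy, not a view of the
off-heap bytes, so changes to it are not reflected in the buffer.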
>
> Thinking about these issues, I started wondering whether it would be
> (theoretically) possible to have objects allocated in off-heap memory.
> I did some experiments. Right now, it is possible, using low-level
> sun.misc.Unsafe tricks, to create an object with proper headers in
> off-heap memory and refer to it from on-heap objects or the stack. You can work
> with it as with a normal object without any additional overhead, using
> normal operations, e.g. array access, method invocations, access to fields,
> etc. But this of course does not work reliably, because as soon as you have
> a full GC, the garbage collector detects a reference from a reachable
> on-heap object to an address outside of the heap and you start getting JVM
> crashes of all kinds.
>
> Based on these observations and experiments, what would be nice to  have
> is:
> 1) off-heap objects, which can be referenced from on-heap or on-stack
> objects (and, if possible, support for creating such objects at a given
> place/address off-heap, i.e. something like explicit placement). It could
> also be OK to put some limitations on such off-heap objects, e.g. limit the
> set of classes whose instances could be placed off-heap and referred to from
> on-heap objects; limits on what can be referred to from off-heap objects;
> off-heap object alignment rules, etc.
> 2) off-heap objects are pinned/non-movable from GC's point of view - under
> no circumstances should GC try to move them around.
> 3) (optional) off-heap objects, which are allowed to refer to on-heap
> objects. If this would be possible, GC should of course scan reachable
> off-heap objects to find references to on-heap objects and mark them as
> reachable.
>
> But I'm wondering what is required to achieve at least (1) and (2). Is it
> feasible to do it with not too many changes to HotSpot/GC? At first glance,
> I have the naive impression that one could try to relax the condition that
> all references from on-heap objects must refer to an address inside the
> heap. Instead, a reference would have to refer to an address inside the heap
> or inside one of the off-heap memory regions allocated by the current
> application.
> One can still check that all the object headers are OK and according to the
> JVM rules. And once such a reference to an off-heap object is found, there
> is no need to trace/scan the referred off-heap object, because it is known
> that such objects cannot refer to on-heap objects. If (3) must be supported
> as well, off-heap objects need to be scanned too, which may become tricky.
> But let's not concentrate on (3) for now.
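
The relaxed reference check proposed above could be modelled along these
lines. This is only a toy sketch in plain Java, not HotSpot code: the
names ReferenceValidator, Region, isValid and needsTracing are
hypothetical, addresses are simplified to non-negative long values, and
details such as compressed references or header validation are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the relaxed check, for illustration only.
final class ReferenceValidator {

    // Half-open address range [start, end).
    static final class Region {
        final long start;
        final long end;

        Region(long start, long end) {
            if (start < 0 || end <= start) {
                throw new IllegalArgumentException("invalid region");
            }
            this.start = start;
            this.end = end;
        }

        boolean contains(long address) {
            return address >= start && address < end;
        }
    }

    private final Region heap;
    private final List<Region> offHeapRegions = new ArrayList<>();

    ReferenceValidator(Region heap) {
        this.heap = heap;
    }

    // Registers an off-heap region allocated by the application.
    void registerOffHeapRegion(Region region) {
        offHeapRegions.add(region);
    }

    // Relaxed rule: a reference is acceptable if it points into the heap
    // or into any registered off-heap region.
    boolean isValid(long address) {
        if (heap.contains(address)) {
            return true;
        }
        for (Region region : offHeapRegions) {
            if (region.contains(address)) {
                return true;
            }
        }
        return false;
    }

    // Under assumptions (1) and (2), only on-heap targets are traced;
    // off-heap objects are assumed to hold no references into the heap.
    boolean needsTracing(long address) {
        return heap.contains(address);
    }
}
```

A linear scan over the regions is used here for brevity; whether such a
check could be made cheap enough inside a real collector is exactly the
open question raised in the message.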
>
> Questions:
> - Was something like this already discussed/considered by JVM developers or
> researchers? If so, could you provide links/references to such discussions
> and related issues?
>
> - Is such an extension as described here technically feasible? Would it
> really require just minor changes in the HotSpot JVM/GC as I explained, or
> am I missing something obvious that would make it very difficult or
> impossible to implement? I understand that there is also a "political"
> dimension to such an extension, which may result in rejecting it for many
> other reasons. But I'd like to understand the technical feasibility.
>
> Generalization of this idea:
>
> Overall, this proposal is just a special case of a more general approach,
> which would be to allow multiple (dynamically created/managed) heaps inside
> one JVM. Each heap may have its own policy for garbage collection, object
> allocation (e.g. any class or only a specific class, explicit placement
> support vs automatic address assignment) and constraints regarding which
> other heaps can be referenced from a given heap (e.g. only the same heap,
> only specific heaps, etc). Obviously, such an approach would require quite
> some changes to garbage collection implementation (e.g. checking
> cross-references between heaps, probably special read/write barriers, etc).
> It may also require some extensions at the bytecode/language/standard
> library level, because it should be possible to allocate objects on a given
> heap either on a per-instance or per-class level (this reminds me of the C++
> class-specific new operators, which can take optional parameters, which in
> this case would be a specific heap), move objects/object graphs between
> heaps and so on.
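
To make the idea of per-heap policies more concrete, here is a purely
hypothetical sketch of how such policies and the cross-heap reference
constraint might be described. Nothing like this exists in the JVM; the
names HeapPolicySketch, HeapPolicy and mayReference are invented for
illustration, and the rule that a heap may always reference itself is an
assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical description of per-heap policies, for illustration only.
final class HeapPolicySketch {

    static final class HeapPolicy {
        final String name;
        final boolean garbageCollected;   // false models a non-collectable heap
        final boolean explicitPlacement;  // true if callers may choose addresses
        private final Set<String> referencableHeaps;

        HeapPolicy(String name, boolean garbageCollected,
                   boolean explicitPlacement, String... referencableHeaps) {
            this.name = name;
            this.garbageCollected = garbageCollected;
            this.explicitPlacement = explicitPlacement;
            this.referencableHeaps = new HashSet<>(Arrays.asList(referencableHeaps));
        }

        // Assumption of this sketch: references within the same heap are
        // always allowed; other heaps must be listed explicitly.
        boolean mayReference(HeapPolicy target) {
            return target.name.equals(name) || referencableHeaps.contains(target.name);
        }
    }
}
```

For example, a collected "main" heap declared with "offheap" in its list
may point into a non-collected "offheap" heap, while the "offheap" heap,
declared with an empty list, may not point back. That corresponds to
requirements (1) and (2) without (3).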
>
> If multiple heaps with their own policies would be supported, it would open
> a lot of interesting possibilities:
> - non-collectable heaps - useful for JNI, interaction with external
> processes, explicit control over memory allocation
> - heaps at specific memory regions, which could be very interesting for
> embedded systems
> - light-weight processes (a-la Erlang) with their own heaps, where such
> heaps can be garbage collected independently
> - very fast object caches without big overhead
> and many, many more possible applications of such a feature.
>
> Of course, there are also potential drawbacks:
> - explicit allocation considered harmful
> - more complex garbage collection implementation
> - potentially slower garbage collection due to increased complexity
>
> What do you think about this suggestion? Is it possible to implement it
> technically in an efficient way by extending current implementation? Is it
> possible at all to implement it technically in an efficient way? What would
> be the biggest issues in getting it to work? What would be the implications
> for security mechanisms, the Java memory model, etc.? What could be the
> biggest obstacle?
>
> Thanks in advance for any feedback & comments,
>    Leo
> --
> View this message in context:
> http://old.nabble.com/Question-Extension-proposal%3A-references-to-off-heap-objects-and-support-for-multiple-heaps-tp34215852p34215852.html
> Sent from the OpenJDK Hotspot Garbage Collection mailing list archive at
> Nabble.com.
>
>