Question/Extension proposal: references to off-heap objects and support for multiple heaps

changren changren at taobao.com
Mon Jul 30 03:41:56 UTC 2012


Hi Leo,
Sorry for the late reply.

*Do you have a link to this CR?*

Kris, I can't find the link. Could you help?

*How was it perceived by JVM developers? Any hope to get into the mainline?*

Since G1 has the inherent merit of 'isolating', we will try out prototypes based on G1 once it becomes stable. We don't think the current ParNew+CMS based
implementation is a general enhancement, and I am afraid no one would support it.

*Another question: do you have any plans to contribute/provide as open-source
your CMS-based GCIH implementation? Or at least publish any sort of a
detailed technical report about it, explaining what and how was done to
support this feature? Or is it an internal work at Taobao, which has to be
kept secret and used only internally inside the company?*

We plan to open source all our JVM enhancements (including GCIH) at
jvm.taobao.org in the near future. Taobao (a subsidiary of Alibaba Corp.) is a
leading open-source company in China :).

*Can you quantify how much of it is due to sharing and how much of it is due
to ability to work with objects in the off-heap memory as with normal
objects? I ask it because, if you only want to share off-heap memory between
JVMs or with other processes there are other ways. For example, you can have
byte buffers using memory mapped files. And if your files are e.g. in the
shared memory (shm) on Linux, you get sharing between processes without
changes in the JVM. Therefore it is interesting to see, what exactly and to
which extent contributes to this 10x performance gain and how much JVM
changes contribute.*

The improvement was not measured by comparing GCIH with other off-heap
solutions. The baseline was a straightforward solution that shares data
through a centralized, RPC-based data center, so you can imagine how
much we gained by eliminating the overhead of networking,
serialization/deserialization, etc. I am not familiar with other
off-heap solutions, but if I am right, any off-heap solution without JVM
changes inevitably pays for serialization/deserialization
when it comes to sharing Java objects.
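
To make the serialization point concrete, here is a minimal sketch of the memory-mapped approach Leo describes, using only standard java.nio calls. The class name MappedDictionarySketch, the methods publish and load, and the byte layout are hypothetical and made up for illustration; this is not how GCIH works. On Linux the file could live under /dev/shm, which is also an assumption; the sketch uses a temporary file so that it runs anywhere.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

public class MappedDictionarySketch {

    // Writer side: the dictionary has to be encoded into bytes before
    // another process can see it (hypothetical layout: entry count, then
    // for each entry the key length, the key bytes, and the value).
    static void publish(Path file, Map<String, Integer> dict) throws IOException {
        int size = Integer.BYTES;
        for (String key : dict.keySet()) {
            size += Integer.BYTES + key.getBytes(StandardCharsets.UTF_8).length + Integer.BYTES;
        }
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.putInt(dict.size());
            for (Map.Entry<String, Integer> e : dict.entrySet()) {
                byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
                buf.putInt(key.length);
                buf.put(key);
                buf.putInt(e.getValue());
            }
            buf.force(); // flush the mapped region to the backing file
        }
    }

    // Reader side: every process that wants real Java objects must decode
    // the bytes again and ends up with its own private copy on its own heap.
    static Map<String, Integer> load(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            int count = buf.getInt();
            Map<String, Integer> dict = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                byte[] key = new byte[buf.getInt()];
                buf.get(key);
                dict.put(new String(key, StandardCharsets.UTF_8), buf.getInt());
            }
            return dict;
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for a file in shared memory.
        Path file = Files.createTempFile("dict", ".bin");
        Map<String, Integer> dict = new LinkedHashMap<>();
        dict.put("alpha", 1);
        dict.put("beta", 2);
        publish(file, dict);
        System.out.println(load(file).equals(dict));
    }
}
```

The bytes are shared, but the HashMap and String objects are not: each reader rebuilds them, which is the encode/decode cost mentioned above.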

Regards,
Joseph



On 2012-7-27 15:47, Leo Romanoff wrote:
> Hi Joseph,
>
> Thanks for this additional information.
>
>
> changren wrote:
>> BTW, Kris and I once talked with Ramki about the idea and the GCIH
>> implementation, and since CMS is not on the radar he suggested making some
>> modifications to G1 to support static heap subsets, and one CR was filed:
>> 6660122 G1: support for large, mostly-static heap subsets
>>
> Do you have a link to this CR? How was it perceived by JVM developers? Any
> hope to get into the mainline?
>
> Another question: do you have any plans to contribute/provide as open-source
> your CMS-based GCIH implementation? Or at least publish any sort of a
> detailed technical report about it, explaining what and how was done to
> support this feature? Or is it an internal work at Taobao, which has to be
> kept secret and used only internally inside the company?
>
>
> changren wrote:
>> GCIH is now in production use: the Hadoop team at Taobao uses GCIH as
>> an efficient way of sharing static Java objects (a data dictionary) among JVM
>> processes (Hadoop Map processes) on the same physical machine, which helps to
>> achieve 10x performance gains.
>>
> This is quite an improvement!
>
> Can you quantify how much of it is due to sharing and how much of it is due
> to ability to work with objects in the off-heap memory as with normal
> objects? I ask it because, if you only want to share off-heap memory between
> JVMs or with other processes there are other ways. For example, you can have
> byte buffers using memory mapped files. And if your files are e.g. in the
> shared memory (shm) on Linux, you get sharing between processes without
> changes in the JVM. Therefore it is interesting to see, what exactly and to
> which extent contributes to this 10x performance gain and how much JVM
> changes contribute.
>
> -Leo
>
>
>
> changren wrote:
>> Thank you Kris for the explanation.
>> BTW, Kris and I once talked with Ramki about the idea and the GCIH
>> implementation, and since CMS is not on the radar he suggested making some
>> modifications to G1 to support static heap subsets, and one CR was filed:
>>
>> 6660122 G1: support for large, mostly-static heap subsets
>>
>> GCIH is now in production use: the Hadoop team at Taobao uses GCIH as
>> an efficient way of sharing static Java objects (a data dictionary) among JVM
>> processes (Hadoop Map processes) on the same physical machine, which helps to
>> achieve 10x performance gains.
>> Regards,
>> Joseph
>>
>> On 2012-7-27 0:16, Krystal Mok wrote:
>> Hi Leo,
>>
>> Thanks for being interested :-) I think it's time for Joseph to chime in,
>> if he will.
>> I don't work for Taobao anymore; it's better if someone from the inside
>> shares the details.
>>
>> I could briefly cover the parts of the VM we touched. We tried a lot of
>> variants, and not all of them are meant to be **safe**.
>>
>> GCIH as it is only works with the ParNew+CMS configuration. We modified all
>> tracing actions that are involved in ParNew and CMS so that during GC they
>> wouldn't trace into objects within GCIH. We also modified the pointer
>> adjusting logic so that it would fix up object pointers (oops) originating
>> from GCIH that point to moved objects.
>>
>> Note that we actually only allow such pointers to be metadata pointers
>> pointing into the PermGen; after the PermGen elimination project is done,
>> such pointers wouldn't even exist anymore (but I'm not sure if the PermGen
>> elimination project allows metadata to move; it'd be nice for GCIH if
>> metadata doesn't move).
>> Otherwise, the object graph in GCIH must be self-contained, i.e. all
>> object pointers originating from GCIH should also point into GCIH. In
>> addition, objects in GCIH don't move. So, no pointer fix-ups other than
>> the metadata pointers are needed.
>>
>> To enforce the invariant above, write-barriers also need to be modified.
>> Note that this could impact the throughput of normal Java programs, so
>> it'd be preferable if it could be turned off -- but we're talking about
>> trading safety for performance here, so umm...there really isn't a choice.
>>
>> Regards,
>> Kris
>>
>> On Thu, Jul 26, 2012 at 11:43 PM, Leo Romanoff
>> <romixlev at yahoo.com> wrote:
>>
>> Hi Kris,
>>
>> Thanks a lot for this link about GCIH and other JVM extensions done at
>> Taobao. Very interesting!
>> The GCIH use-cases are almost identical to what I had in mind.
>>
>>
>> Krystal Mok wrote:
>>> We made deep modifications to the HotSpot VM to implement the features.
>>> As you stated, it's unlikely that such a feature could be implemented
>>> without modifying the internals of the VM, at least with the current
>>> standard APIs.
>>>
>> Very interesting. Is any general (or even better - detailed) information
>> about those deep modifications available anywhere? It would be interesting
>> to better understand which parts of HotSpot are affected or impacted by
>> such an extension and to which extent.
>>
>> Thanks again,
>>    Leo
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Question-Extension-proposal%3A-references-to-off-heap-objects-and-support-for-multiple-heaps-tp34215852p34216295.html
>> Sent from the OpenJDK Hotspot Garbage Collection mailing list archive at
>> Nabble.com.
>>

