Re: RFC: Experiment in accessing/managing persistent memory from Java
Hi Andrew/Jonathan,

Thanks a lot for sharing this work. Copying hotspot-compiler-dev to get their feedback as well.

A couple of thoughts/observations below:

* Supporting ByteBuffer on persistent memory using the existing FileChannel and MappedByteBuffer mechanism sounds like a very good idea.

* Extending FileChannel.map to take an additional parameter indicating that the ByteBuffer is backed by persistent memory is a small API change.

* Adding a MappedByteBuffer.force(int from, int to) method for smaller ranges would be very useful in addition to force() on the entire ByteBuffer.

* The underlying force0_mapsync() could be implemented in terms of new unsafe APIs, which in turn could be intrinsified. The advantage of this is that the unsafe APIs could then be used for other future persistent memory APIs in the JRE. Specifically, the following two unsafe APIs would be useful:
  a) public native void flush(long address, long size);
  b) public native void storeFence();
  storeFence() exists today but doesn't generate any instruction on x86. Wondering if we could add a boolean parameter to force sfence generation.

* DEFAULT_CACHE_LINE_SIZE is 128 in src/hotspot/cpu/x86/globalDefinitions_x86.hpp whereas the actual cache line on the hardware is 64 bytes. This could be the cause of some of the performance difference you saw between the compiler intrinsic and pure C native code.

Best Regards,
Sandhya

RFC: Experiment in accessing/managing persistent memory from Java
Andrew Dinn adinn at redhat.com
Mon May 21 09:47:46 UTC 2018

I have been helping one of my Red Hat colleagues, Jonathan Halliday, to investigate providing a Java equivalent to Intel's libpmem suite of C libraries [1]. This approach avoids the significant cost of using the Intel libraries from Java via JNI (or, worse, as a virtual driver for a persistent memory device).

Jonathan has modified the JVM/JDK to allow a MappedByteBuffer to be mapped over persistent memory, providing function equivalent to libpmem itself. On top of this he implemented a Java journaled log class providing functionality equivalent to one of the Intel client libs, libpmemlog, built over libpmem.

The modified MappedByteBuffer can be configured to use either i) a registered native method or ii) a JIT intrinsic to perform the critical task of cache line writeback, i.e. the persistence step (the intrinsic is my contribution).

Jonathan's tests compare the use of JNI, the registered native method and the intrinsic against an equivalent C program that writes a large swathe of records to a journaled log file stored in persistent memory. Performance is worse than C when relying on JNI and significantly better with JVM/JDK support. Indeed, as one might reasonably expect, use of the JIT intrinsic almost completely eliminates writeback costs.

The journaled log code, jdk dev tree patch, build instructions, test code plus C equivalent, and test results are all available from Jonathan's git repo [2]. For those who do not want to look at the actual code, the README file [3] provides background on the use of persistent memory, an overview of the design, and summary details of the test process and results.

[1] https://pmem.io/pmdk/
[2] https://github.com/jhalliday/pmem
[3] http://github.com/jhalliday/pmem/README.md

n.b. Jonathan has experimented with using this same prototype to replace the journaled log used in the Red Hat Narayana transaction manager. It provides a significant improvement on the current disk-file-based log, both for throughput and latency (the code is not yet available, as getting it to work involved some horrible hacking of the build to migrate up to jdk11).

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
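To make the mechanism concrete, here is a minimal sketch of the existing FileChannel.map / MappedByteBuffer.force path that the proposal builds on. It runs against an ordinary temp file standing in for a DAX-mounted pmem file; the ranged force(from, to) and the pmem map parameter discussed in this thread are proposals, not in the JDK at the time of the thread, so only the whole-buffer force() appears here.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedForceDemo {
    public static void main(String[] args) throws IOException {
        // Ordinary temp file standing in for a DAX-mounted persistent memory file.
        Path file = Files.createTempFile("pmem-demo", ".dat");
        byte[] record = "journal-entry-1".getBytes(StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE extends the file to the mapped size (4096 bytes).
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.put(record);
            // Today only the whole-buffer force() exists; the thread proposes a
            // ranged force(from, to) so only the bytes just written get flushed.
            buf.force();
        }
        // Read the record back through ordinary file I/O to show it reached the file.
        byte[] back = new byte[record.length];
        System.arraycopy(Files.readAllBytes(file), 0, back, 0, record.length);
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

On a real pmem device mounted with DAX, the same force() call is where the proposed cache-line writeback (rather than an msync to a block device) would happen.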
Hi Andrew, Jonathan,

Sandhya gave an overview to a few of us Oracle folks. I agree with what Sandhya says regarding the API, a small surface, and on pursuing an unsafe intrinsic. I like it and would encourage the writing of a draft JEP, especially to give this work visibility.

I expect this will be beneficial for experimentation with the Panama foreign API, where we can use a Pointer to reference into a byte buffer and scribble on it. Further, I hope this work may also benefit the persistent collections effort (PCJ).

It intersects with https://bugs.openjdk.java.net/browse/JDK-8153111 ((bf) Allocating ByteBuffer on heterogeneous memory), which is attempting to be more generic.

We might also need to increase the velocity on https://bugs.openjdk.java.net/browse/JDK-8180628 (retrofit direct buffer support for sizes beyond gigabyte scales), and I would be very interested in your views on this: how you might currently be working around such size limitations, and what buffer enhancements would work for you.

Thanks,
Paul.
Hi Paul,

Looks like we're all on the same page regarding the basic approach of using a small API and making the critical bits intrinsic. We perhaps have some way to go on exactly what that API looks like in terms of classes and methods, but iterating on it through discussion of a JEP seems like the best way forward. The important thing from my perspective is that so far nobody has come forward with a use case that is not covered by the proposed primitives. So it's a small API, but not too small.

As far as tweaks go, we have considered making the low-level primitive method / intrinsic just a flushCacheline(base_address), since the arithmetic and loop for writing flush(from, to) in terms of that low-level op is something the JIT can already optimize fine. Though that does mean exposing the cache line size to the Java layer, whereas currently it's only visible in the C code.

My own background and focus is transaction systems, so I'm more about speed and fault tolerance than capacity, but I can see long vs. int indexing being of interest to our Infinispan data grid team, and likewise for e.g. Oracle Coherence or databases like Cassandra. OTOH it's not uncommon to prefer moderately sized files and shard over them, which sidesteps the issue.

Utility code to assist with fine-grained memory management within the buffer/file may be more useful than support for really large buffers, since they tend to be used with some form of internal block/heap structure anyhow, rather than to hold very large objects. Providing that may be the role of a 3rd-party pure Java library like PCJ though, rather than something we want in the JDK itself at this early stage. The researcher in me is kinda interested in how much of the memory allocation and GC code can be re-purposed here though...

What's the intended timeline on long buffer indexing at present? My feeling is a pmem API JEP is probably targeting around JDK 13, but we're flexible on that.
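The flushCacheline(base_address) idea can be sketched in plain Java: the range arithmetic lives in Java code the JIT can already optimize, and only the per-line writeback would be the intrinsic. flushCacheline is hypothetical (the proposed primitive, not an existing API); here it is stubbed to record the line bases it would flush.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushRangeDemo {
    // x86 line size; note hotspot's DEFAULT_CACHE_LINE_SIZE is 128 as Sandhya points out.
    static final long CACHE_LINE = 64;

    // Stub for the hypothetical low-level primitive / intrinsic: flush the one
    // cache line starting at lineBase. Recorded here so the loop is observable.
    static final List<Long> flushed = new ArrayList<>();
    static void flushCacheline(long lineBase) { flushed.add(lineBase); }

    // flush(from, to) written in Java on top of flushCacheline, as discussed:
    static void flush(long from, long to) {
        long line = from & ~(CACHE_LINE - 1);   // align down to the start of from's line
        for (; line < to; line += CACHE_LINE) { // step one line at a time up to 'to'
            flushCacheline(line);
        }
    }

    public static void main(String[] args) {
        flush(100, 300); // covers the lines starting at 64, 128, 192 and 256
        System.out.println(flushed);
    }
}
```

This is the trade-off Jonathan describes: the loop is trivially JIT-friendly, but CACHE_LINE must now be known on the Java side rather than only in the C code.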
We may also want to look at related enhancements like unmapping buffers. I think those pieces are sufficiently decoupled that they won't be dependencies for the pmem API though, unlike other factors such as the availability of test hardware.

Regards,

Jonathan.
Hi Jonathan,
On Jun 8, 2018, at 3:59 AM, Jonathan Halliday <jonathan.halliday@redhat.com> wrote:
Hi Paul
Looks like we're all on the same page regarding the basic approach of using a small API and making the critical bits intrinsic. We perhaps have some way to go on exactly what that API looks like in terms of the classes and methods, but iterating on it by discussion of a JEP seems like the best way forward. The important thing from my perspective is that so far nobody has come forward with a use case that is not covered by the proposed primitives. So it's a small API, but not too small.
Yes, a smallish API we can iterate on.
As far as tweaks go, we have considered making the low-level primitive method / intrinsic just a flushCacheline(base_address), since the arithmetic and loop for writing flush(from, to) in terms of that low-level op is something the JIT can already optimize fine. Though that does mean exposing the cache line size to the Java layer, whereas currently it's only visible in the C code.
That's ok. Keeping the intrinsics simple while relying on Java code + the JIT for the rest would generally be my preferred pattern.
My own background and focus is transaction systems, so I'm more about speed and fault tolerance than capacity, but I can see long vs. int indexing being of interest to our Infinispan data grid team, and likewise for e.g. Oracle Coherence or databases like Cassandra. OTOH it's not uncommon to prefer moderately sized files and shard over them, which sidesteps the issue.
Ok, which is conveniently how developers currently work around the issue of mapping large files :-)
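The sharding workaround amounts to simple offset arithmetic: a logical long offset is split into a file (shard) index plus an int offset that fits a single mapped buffer. The names here (SHARD_SIZE, shardIndex, shardOffset) are illustrative, not from any library.

```java
public class ShardedOffsetDemo {
    // 1 GiB per backing file: comfortably within a ByteBuffer's int index range.
    static final long SHARD_SIZE = 1L << 30;

    // Which backing file a logical offset lands in.
    static int shardIndex(long offset) { return (int) (offset / SHARD_SIZE); }

    // The int position within that file's mapped buffer.
    static int shardOffset(long offset) { return (int) (offset % SHARD_SIZE); }

    public static void main(String[] args) {
        long logical = 3L * SHARD_SIZE + 12345; // an offset past the 2 GiB int limit
        System.out.println(shardIndex(logical) + ":" + shardOffset(logical));
    }
}
```

Each shard is then a separately mapped MappedByteBuffer, which is why this sidesteps JDK-8180628 rather than depending on it.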
Utility code to assist with fine-grained memory management within the buffer/file may be more useful than support for really large buffers, since they tend to be used with some form of internal block/heap structure anyhow, rather than to hold very large objects. Providing that may be the role of a 3rd party pure Java library like PCJ though, rather than something we want in the JDK itself at this early stage. The researcher in me is kinda interested in how much of the memory allocation and GC code can be re-purposed here though...
What's the intended timeline on long buffer indexing at present?
Unsure, but it's probably something we want to solve soonish.
My feeling is a pmem API JEP is probably targeting around JDK 13, but we're flexible on that.
Note that the JEP process can be started before then and JEPs are not targeted to a release until ready, if its ready sooner great! otherwise later. Keeping such a JEP focused on the mapping/flushing of BBs for NVM would be my recommendation rather than expanding its scope.
We may also want to look at related enhancements like unmapping buffers. I think those pieces are sufficient decoupled that they won't be dependencies for the pmem API though, unlike other factors such as the availability of test hardware.
That's tricky! We have been through many discussions over the years on how to achieve this without much resolution. Andrew Haley came up with an interesting solution which IIRC requires the deallocating/unmapping thread to effectively reach a safepoint and wait for all other threads to pass through a checkpoint. Project Panama is looking at the explicit scoping of resources, perhaps also resources that are thread-confined or owned. My sense is Project Panama will eventually push strongly in this area, and that's where we should focus our efforts.

Thanks,
Paul.
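A minimal sketch of the explicit-scoping idea, assuming a hypothetical AutoCloseable wrapper (not any real Panama or JDK API): close() unmaps deterministically, and a later access fails with an exception rather than a crash, which is the safety property the safepoint/checkpoint schemes are trying to guarantee.

```java
public class ScopedRegionDemo {
    // Hypothetical scoped wrapper around a mapped region: unmapping happens
    // deterministically in close(), not whenever the buffer is eventually GC'd.
    static class MappedRegion implements AutoCloseable {
        private boolean mapped = true;

        long read(long offset) {
            if (!mapped) throw new IllegalStateException("region already unmapped");
            return 0; // stand-in for a real load from the mapped memory
        }

        @Override public void close() {
            mapped = false; // a real implementation would munmap here
        }
    }

    public static void main(String[] args) {
        MappedRegion escaped;
        try (MappedRegion r = new MappedRegion()) {
            r.read(0);      // fine while the region is in scope
            escaped = r;    // deliberately leak the reference past the scope
        }
        try {
            escaped.read(0); // use-after-close is rejected, not a SIGSEGV
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The hard part, which this sketch glosses over, is making that mapped-flag check free on the fast path for all racing threads, which is exactly what the safepoint-based approach addresses.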
On 06/12/2018 06:12 PM, Paul Sandoz wrote:
We may also want to look at related enhancements like unmapping buffers. I think those pieces are sufficiently decoupled that they won't be dependencies for the pmem API though, unlike other factors such as the availability of test hardware.

That's tricky! We have been through many discussions over the years on how to achieve this without much resolution. Andrew Haley came up with an interesting solution which IIRC requires the deallocating/unmapping thread to effectively reach a safepoint and wait for all other threads to pass through a checkpoint. Project Panama is looking at the explicit scoping of resources, perhaps also resources that are thread-confined or owned. My sense is Project Panama will eventually push strongly in this area, and that's where we should focus our efforts.
Yeah, perhaps so. I've been waiting to come up for air to have enough time to handle the ByteBuffer.unmap() bug. I can see the advantage of handling it at a static language level, but the solutions aren't necessarily exclusive.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
Hi Paul,

Sorry for the delay in responding to this -- holiday and then an urgent bug fix intervened . . .

On 08/06/18 01:42, Paul Sandoz wrote:
Sandhya gave an overview to a few of us Oracle folks. I agree with what Sandhya says regarding the API, a small surface, and on pursuing an unsafe intrinsic. I like it and would encourage the writing of a draft JEP, especially to give this visibility.
Great! Thanks for your feedback (also to Sandhya). I'll start drafting a JEP straight away. I'll also work on revising the current intrinsic implementation so it is presented via Unsafe (which should be fairly simple to achieve).
It intersects with https://bugs.openjdk.java.net/browse/JDK-8153111 ((bf) Allocating ByteBuffer on heterogeneous memory), which is attempting to be more generic.
Ok, thanks. I'll have a think about how we might try to integrate these two approaches and see what I can work into the draft JEP.
We might also need to increase the velocity on https://bugs.openjdk.java.net/browse/JDK-8180628 (retrofit direct buffer support for size beyond gigabyte scales), and i would be very interested your views on this, how you might be currently working around such size limitations, and what buffer enhancements would work for you.
I think Jonathan answered that better than I can in his response. However, if this accelerates delivery of a fix for JDK-8180628 then all to the good.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Hi,
On Jun 21, 2018, at 9:32 AM, Andrew Dinn <adinn@redhat.com> wrote:
Hi Paul,
Sorry for the delay in responding to this -- holiday and then an urgent bug fix intervened . . .
On 08/06/18 01:42, Paul Sandoz wrote:
Sandhya gave an overview to a few of us Oracle folks. I agree with what Sandhya says regarding the API, a small surface, and on pursuing an unsafe intrinsic. I like it and would encourage the writing of a draft JEP, especially to give this visibility.
Great! Thanks for your feedback (also to Sandhya). I'll start drafting a JEP straight away. I'll also work on revising the current intrinsic implementation so it is presented via Unsafe (which should be fairly simple to achieve).
Great!
It intersects with https://bugs.openjdk.java.net/browse/JDK-8153111 ((bf) Allocating ByteBuffer on heterogeneous memory), which is attempting to be more generic.
Ok, thanks. I'll have a think about how we might try to integrate these two approaches and see what I can work into the draft JEP.
My impression is that your approach may be a sufficiently good step forward that we don't need to introduce a new abstraction for buffer allocation. Vivek, any views on this?
We might also need to increase the velocity on https://bugs.openjdk.java.net/browse/JDK-8180628 (retrofit direct buffer support for size beyond gigabyte scales), and i would be very interested your views on this, how you might be currently working around such size limitations, and what buffer enhancements would work for you.
I think Jonathan answered that better than I can in his response. However, if this accelerates delivery of a fix for JDK-8180628 then all to the good.
Agreed! Paul.
participants (5)

- Andrew Dinn
- Andrew Haley
- Jonathan Halliday
- Paul Sandoz
- Viswanathan, Sandhya