RFR(xs): (aix but affects shared code too) 8186665: buffer overflow in Java_java_nio_MappedByteBuffer_isLoaded0

Thu Oct 19 13:29:03 UTC 2017

Thanks Christoph!

On Thu, Oct 19, 2017 at 3:22 PM, Langer, Christoph <christoph.langer at sap.com
> wrote:

> Hi Thomas,
>
>
>
> ok from my end.
>
>
>
> Best regards
>
> Christoph
>
>
>
> *From:* Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> *Sent:* Donnerstag, 19. Oktober 2017 15:22
> *To:* Peter Levart <peter.levart at gmail.com>; Langer, Christoph <
> christoph.langer at sap.com>
> *Cc:* Alan Bateman <Alan.Bateman at oracle.com>; nio-dev at openjdk.java.net;
> ppc-aix-port-dev at openjdk.java.net; Java Core Libs <
> core-libs-dev at openjdk.java.net>
> *Subject:* Re: RFR(xs): (aix but affects shared code too) 8186665: buffer
> overflow in Java_java_nio_MappedByteBuffer_isLoaded0
>
>
>
> Hi Peter, Christoph,
>
>
>
> if you have no objections, I'd like to push this change. As I explained in
> my earlier mail, I'd prefer not to change MappedByteBuffer::load(), and if
> you are fine with the change in its current form (
> http://cr.openjdk.java.net/~stuefe/webrevs/8186665-buffer-
> overflow-in-mincore/webrev.02/webrev/), I'd like to push it.
>
>
>
> Thanks, Thomas
>
>
>
>
>
> On Wed, Oct 18, 2017 at 12:24 PM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
> Hi Peter, Christoph,
>
>
>
> Thank you both for reviewing.
>
>
>
> New webrev: http://cr.openjdk.java.net/~stuefe/webrevs/
> 8186665-buffer-overflow-in-mincore/webrev.02/webrev/
>
>
>
> @Peter:
>
>
>
> >Shouldn't the following line:
> >
> >  47     return len2 + pagesize - 1 / pagesize;
> >
> >..be written as:
> >
> >           return (len2 + pagesize - 1) / pagesize;
>
>
>
> You are right. Did not cause the mincore output buffer to be unnecessarily
> large. Thanks for catching this.
>
>
>
> As for your other concern:
>
>
>
>
>
> On Wed, Oct 18, 2017 at 10:32 AM, Peter Levart <peter.levart at gmail.com>
> wrote:
>
> --
> In Java_java_nio_MappedByteBuffer_isLoaded0, we call mincore(2) to
> retrieve the paging status of an address range.
>
> The size of the output buffer for mincore(2) depends on the number of
> pages in *system page size* in the given memory range (this is spelled out
> more or less explicitly on AIX and Linux, nothing is said on BSD/OSX, but I
> assume the same). The number of pages in the memory range is calculated by
> MappedByteBuffer.isLoaded() and handed down to Java_java_nio_MappedByteBuffer_isLoaded0()
> together with the memory range to test.
>
> MappedByteBuffer.isLoaded() calculates this number of pages based on
> jjdk.internal.misc.Unsafe.pagesize(), which ultimately comes down to
> os::vm_page_size().
>
> For AIX, os::vm_page_size() may return a page size larger than the system
> page size of 4K. The reason for this is that on AIX, memory can be backed
> by different page sizes, usually either 4K or 64K - e.g. posix thread
> stacks may have 4K pages, java heap (system V shared memory) with 64K
> pages, but mmap memory is always 4K page backed...
>
>
> If this is true and Unsafe.pagesize() returns a value > 4K, then perhaps
> also the MappedByteBuffer.load() method is wrong for AIX?
>
>     public final MappedByteBuffer load() {
>         checkMapped();
>         if ((address == 0) || (capacity() == 0))
>             return this;
>         long offset = mappingOffset();
>         long length = mappingLength(offset);
>         load0(mappingAddress(offset), length);
>
>         // Read a byte from each page to bring it into memory. A checksum
>         // is computed as we go along to prevent the compiler from
> otherwise
>         // considering the loop as dead code.
>         Unsafe unsafe = Unsafe.getUnsafe();
>         int ps = Bits.pageSize(); // << LOOK HERE
>         int count = Bits.pageCount(length);
>         long a = mappingAddress(offset);
>         byte x = 0;
>         for (int i=0; i<count; i++) {
>             x ^= unsafe.getByte(a);
>             a += ps; // << AND HERE
>         }
>         if (unused != 0)
>             unused = x;
>
>         return this;
>     }
>
> ...this loop reads a byte from the start of each block in lumps of
> Bits.pageSize(). Should it always read in lumps of 4K for AIX? Do we rather
> need a special Unsafe.mmappedPageSize() method instead of just a hack in
> isLoaded0 ?
>
>
>
> Yes, I considered this too. In effect, on AIX, we only touch every 16th
> page, thereby reducing MappedByteBuffer::load() to something like
> load_every_16th_page... However, this bug is very old (even our 1.4 VM
> already does this when the touching was still implemented in
> MappedByteBuffer.c) and did not cause any problems AFAIK.
>
>
>
> The story behind this is long and quite boring :) basically, 64k pages are
> used for the java heap and give a large performance bonus over 4K paged
> java heap. But we cannot switch all memory regions to 64K pages, so we live
> with multiple page sizes and above us we have a ton of code which assumes
> one consistent page size for everything. So we lie about the page size to
> everyone - claiming system page size to be 64k - and except for very rare
> cases like this one get away with this.
>
>
>
> I would like to keep lying consistently. There is not a hard reason for
> it, just that I am afraid that starting to publish a different page size to
> parts of the VM will confuse things and may introduce errors further down
> the line.
>
>
>
> I think a proper solution would be to keep page size on a per-ByteBuffer
> base, because ByteBuffers may be allocated in different memory regions -
> they are now allocated with mmap() in system page size, but that may change
> in the future. That is assuming that one byte buffer cannot span areas of
> multiple page sizes, which would complicate matters further.
>
>
>
> Btw, I also wondered whether other platforms could have a clash between
> the real memory page size and MappedByteBuffer's notion of that size - e.g.
> whether it is possible to have MappedByteBuffers with huge pages on Linux.
> But all cases I could think of are cases where the page size the JDK would
> assume is smaller than the actual page size, which would not be a problem
> for both mincore and load/touch. In the above example (huge pages on
> Linux), pages would be pinned anyway, so load() and isLoaded() would be
> noops.
>
>
>
>
>
> @Christoph:
>
>
>
> > apart from the point that Peter found, I’d also think it would look
> better if the typedef section (line 51-56) would be placed before the AIX
> only function (line 41-49).
>
>
>
> Sure. I moved the section up, below the includes.
>
>
>
> Kind Regards, Thomas
>
>
>
>
>