RFR: 8252500: ZGC on aarch64: Unable to allocate heap for certain Linux kernel configurations
Stefan Karlsson
stefank at openjdk.java.net
Mon Sep 7 09:16:13 UTC 2020
On Mon, 7 Sep 2020 07:27:05 GMT, Christoph Göttschkes <cgo at openjdk.org> wrote:
> The patch introduces a new function to probe for the highest valid bit in the virtual address space for userspace
> programs on Linux.
>
> I guarded the whole implementation to only probe on Linux, other platforms will remain unaffected. Possibly, it would
> be nicer to move the probing code into an OS+ARCH specific source file. But since this is only a single function, I
> thought it would be better to put it right next to the caller and guard it with an #ifdef LINUX.
>
> The probing mechanism uses a combination of msync + mmap, to first check if the address is valid using msync (if
> msync succeeds, the address was valid). If msync fails, mmap is used to check if msync failed because the memory
> wasn't mapped, or if it failed because the address is invalid. Due to some undefined behavior (documented in the
> msync man page), I also use a single mmap at the end, if the msync approach failed before. I tested msync with
> different combinations of mappings, and also with sbrk, and it always succeeded, or failed with ENOMEM. I never got
> back any other error code.
>
> The specified minimum value has been chosen "randomly". The JVM terminates (unable to allocate heap), if this
> minimum value is smaller than the requested Java Heap size, so it might be better to make the minimum dependent on
> the MaxHeapSize and not a compile time constant? I didn't want to make the minimum too big, since for aarch64 on
> Linux, the documented minimum would be 38 (see [1]).
>
> I avoided MAP_FIXED_NOREPLACE, because according to the man page, it has been added in Linux 4.17. There are still
> plenty of stable kernel versions around which will not have that feature, which means we need to implement a
> workaround for it. Some of my test devices also have a kernel version lower than that.
>
> I executed the HotSpot tier1 JTreg tests on two different aarch64 devices. One with 4KB pages and 3 page levels and
> the other with 4KB pages and 4 page levels. Tests passed on both devices.
>
> [1] https://www.kernel.org/doc/Documentation/arm64/memory.txt
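
For readers following along, the msync + mmap probing described in the quoted text could look roughly like the sketch
below. This is not the code from the patch: the function name and its two parameters are made up for illustration, a
64-bit process is assumed, and the final fallback mmap mentioned in the description is left out.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not the patch's code): scans bits from `highest` down
// to, but not including, `lowest` and returns the first bit i for which the
// kernel accepts the address 1 << i. Returns 0 if no bit is accepted.
static size_t probe_highest_valid_bit(size_t highest, size_t lowest) {
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  for (size_t i = highest; i > lowest; --i) {
    const uintptr_t base_addr = static_cast<uintptr_t>(1) << i;
    void* const hint = reinterpret_cast<void*>(base_addr);

    // msync succeeds only on a mapped page, so success implies a valid address.
    if (msync(hint, page_size, MS_ASYNC) == 0) {
      return i;
    }
    if (errno != ENOMEM) {
      // Unexpected error code: skip this bit rather than guess.
      continue;
    }

    // ENOMEM means "not mapped" or "invalid". Ask mmap for the page without
    // MAP_FIXED; the kernel may place the mapping elsewhere if the hint is
    // unusable, so the returned address has to be compared with the hint.
    void* const result = mmap(hint, page_size, PROT_NONE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (result != MAP_FAILED) {
      munmap(result, page_size);
    }
    if (result == hint) {
      return i;
    }
  }
  return 0;
}
```

A call such as probe_highest_valid_bit(47, 36) would scan bits 47 down to 37; both bounds are illustrative values,
not the constants used in the patch.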
src/hotspot/cpu/aarch64/gc/z/zGlobals_aarch64.cpp line 156:
> 154: max_address_bit = i;
> 155: break;
> 156: }
Is there an off-by-one error here? Take i == 47 as an example. This means that you test
base_address == '10000000 00000000 00000000 00000000 00000000 00000000' (in bits). That is, you test that the address
range with the 48th bit set (128T-256T) is usable. However, when 47 is then returned to the caller, the caller
interprets it as if the 64T-128T range is usable.
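
The arithmetic behind this question can be checked at compile time. A minimal sketch, where the helper names are made
up for illustration, bits are counted from zero, and T stands for 2^40 bytes:

```cpp
#include <cstdint>

// 1T = 2^40 bytes.
constexpr uint64_t T = uint64_t{1} << 40;

// Hypothetical helpers: the half-open range [range_start, range_end) holds
// every address whose most significant set bit is `bit` (counted from zero).
constexpr uint64_t range_start(unsigned bit) { return uint64_t{1} << bit; }
constexpr uint64_t range_end(unsigned bit) { return uint64_t{1} << (bit + 1); }

// Probing base_address = 1 << 47 touches the 128T-256T range ...
static_assert(range_start(47) == 128 * T, "bit 47 starts at 128T");
static_assert(range_end(47) == 256 * T, "bit 47 ends at 256T");

// ... while the 64T-128T range corresponds to bit 46.
static_assert(range_start(46) == 64 * T, "bit 46 starts at 64T");
static_assert(range_end(46) == 128 * T, "bit 46 ends at 128T");
```

So the probed range and the range the caller assumes differ by exactly one bit position.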
src/hotspot/cpu/aarch64/gc/z/zGlobals_aarch64.cpp line 177:
> 175: munmap(result_addr, page_size);
> 176: }
> 177: }
If you swap the order of these if-statements, you could add a break after line 172.
The loop exit would then look like the one you have after msync, and you could drop the
'&& max_address_bit == 0' check from the for statement.
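
To illustrate the control-flow change only, here is a small sketch. It does not reproduce the code under review: the
probe callback is a hypothetical stand-in for the mmap check plus its munmap cleanup, and the function names are made
up.

```cpp
#include <cstddef>

// Hypothetical stand-in for "map the page, unmap it again, report whether the
// kernel honoured the requested address".
using Probe = bool (*)(size_t bit);

// Loop shape with the termination flag in the for condition.
size_t scan_with_flag(size_t highest, size_t lowest, Probe probe) {
  size_t max_address_bit = 0;
  for (size_t i = highest; i > lowest && max_address_bit == 0; --i) {
    if (probe(i)) {
      max_address_bit = i;
    }
  }
  return max_address_bit;
}

// Loop shape with an explicit break. Any cleanup has to happen before the
// break, which is why the cleanup would come first in the loop body.
size_t scan_with_break(size_t highest, size_t lowest, Probe probe) {
  size_t max_address_bit = 0;
  for (size_t i = highest; i > lowest; --i) {
    if (probe(i)) {
      max_address_bit = i;
      break;
    }
  }
  return max_address_bit;
}
```

Both functions return the same result for any probe; the second one keeps the loop header down to the plain range
check.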
-------------
PR: https://git.openjdk.java.net/jdk/pull/40