Use DAX in ZGC
Yasumasa Suenaga
suenaga at oss.nttdata.com
Fri Feb 14 14:23:04 UTC 2020
On 2020/02/14 23:08, Per Liden wrote:
> Hi,
>
> On 2/14/20 2:31 PM, Yasumasa Suenaga wrote:
>> Hi Per,
>>
>> On 2020/02/14 20:52, Per Liden wrote:
>>> Hi Yasumasa,
>>>
>>> On 2/14/20 10:07 AM, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> I tried to allocate the heap on DAX on Linux with -XX:AllocateHeapAt, but it failed.
>>>> It seems to be allowed only when the filesystem is hugetlbfs or tmpfs.
>>>>
>>>> According to the kernel documentation [1], DAX is supported in ext2, ext4, and xfs.
>>>> We also need to mount the filesystem with "-o dax".
>>>>
>>>> I want to use ZGC on DAX, so I would like to introduce a new option, -XX:ZAllowHeapOnFileSystem, that allows any filesystem to be used as backing storage.
>>>> What do you think of this change?
>>>
>>>
>>> + experimental(bool, ZAllowHeapOnFileSystem, false, \
>>> + "Allow to use filesystem as Java heap backing storage " \
>>> + "specified by -XX:AllocateHeapAt") \
>>> + \
>>>
>>> Instead of adding a new option, it would be preferable to automatically detect that it's a DAX-mounted filesystem. But I haven't had a chance to look into the best way of doing that.
>>
>> I thought so, but I guess it is difficult.
>> PMDK also does not check it automatically.
>>
>> https://github.com/pmem/pmdk/blob/master/src/libpmem2/pmem2_utils_linux.c#L18
>> In addition, we don't seem to be able to get the mount option ("-o dax") via a syscall.
>> I strace'ed `mount -o dax ...` and saw that "-o dax" was passed in the 5th argument (const void *data). It is handled by each filesystem, so I could not retrieve it.
>>
>> As another solution, we could use /proc/mounts, but it might be complex.
>
> I was maybe hoping you could get this information through some ioctl() command on the file descriptor?
I tried the FS_IOC_FSGETXATTR ioctl (FS_XFLAG_DAX might be set in fsx_xflags), but I couldn't get it.
(I use ext4 with "-o dax")
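For reference, the /proc/mounts approach mentioned above could look roughly like the sketch below. It is only an illustration and not part of the webrev: is_dax_mount() is a hypothetical helper, it matches only the exact option string "dax", it expects the exact mount point (not a subdirectory of it), and it does not unescape sequences such as "\040" that /proc/mounts uses for spaces in paths.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not HotSpot code): returns true if the mount table
// read from `mounts` lists `mount_point` with a "dax" mount option.
// Each line of /proc/mounts has the form:
//   device mount_point fstype options dump pass
bool is_dax_mount(std::istream& mounts, const std::string& mount_point) {
  bool dax = false;
  std::string line;
  while (std::getline(mounts, line)) {
    std::istringstream fields(line);
    std::string device, dir, fstype, options;
    if (!(fields >> device >> dir >> fstype >> options)) {
      continue;  // Skip malformed lines
    }
    if (dir != mount_point) {
      continue;  // Not the mount point we are looking for
    }
    // A later entry for the same mount point overrides an earlier one
    dax = false;
    std::istringstream opts(options);
    std::string opt;
    while (std::getline(opts, opt, ',')) {
      if (opt == "dax") {  // Assumption: option is reported exactly as "dax"
        dax = true;
      }
    }
  }
  return dax;
}
```

In a real check the stream would be a std::ifstream opened on /proc/mounts, and the mount point would first have to be resolved from the AllocateHeapAt path, which is the part that makes this approach complex.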
>>> const size_t expected_block_size = is_tmpfs() ? os::vm_page_size() : os::large_page_size();
>>> - if (expected_block_size != _block_size) {
>>> + if (!ZAllowHeapOnFileSystem && (expected_block_size != _block_size)) {
>>> log_error(gc)("%s filesystem has unexpected block size " SIZE_FORMAT " (expected " SIZE_FORMAT ")",
>>> is_tmpfs() ? ZFILESYSTEM_TMPFS : ZFILESYSTEM_HUGETLBFS, _block_size, expected_block_size);
>>> return;
>>> }
>>>
>>> This part looks potentially dangerous, since we might then be working with an incorrect _block_size.
>>
>> I guess the block size of almost all filesystems is 4KB, even with DAX.
>> (XFS allows variable block sizes...)
>
> With your current patch, a user could use -XX:AllocateHeapAt to point to any kind of file system, which (at least in theory) could have any block size. For things to work down the road we must ensure that ZGranuleSize is a multiple of _block_size.
Ok.
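The constraint above can be written as a simple check. This is a minimal sketch, not the actual ZGC code: granule_is_multiple_of_block() is a hypothetical name, and the sizes in the usage note are assumptions for illustration.

```cpp
#include <cstddef>

// Hypothetical check (not HotSpot code): returns true only if both sizes are
// non-zero and granule_size is a whole multiple of block_size.
bool granule_is_multiple_of_block(std::size_t granule_size, std::size_t block_size) {
  if (granule_size == 0 || block_size == 0) {
    return false;  // A zero size can never be a valid configuration
  }
  return granule_size % block_size == 0;
}
```

For example, assuming a 2M granule, a 4KB block size passes the check, while a block size larger than the granule or one that does not divide it evenly is rejected.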
>> https://nvdimm.wiki.kernel.org/2mib_fs_dax
>> So I think we can limit _block_size to the OS page size (4KB).
>>
>>
>>> int ZPhysicalMemoryBacking::create_file_fd(const char* name) const {
>>> + if (ZAllowHeapOnFileSystem && (AllocateHeapAt == NULL)) {
>>> + log_error(gc)("-XX:AllocateHeapAt is needed when ZAllowHeapOnFileSystem is specified");
>>> + return -1;
>>> + }
>>> +
>>> const char* const filesystem = ZLargePages::is_explicit()
>>> ? ZFILESYSTEM_HUGETLBFS
>>> : ZFILESYSTEM_TMPFS;
>>>
>>> This part looks unnecessary, no?
>>
>> I added ZAllowHeapOnFileSystem to use with AllocateHeapAt.
>> So I want to warn if AllocateHeapAt == NULL.
>
> Yes, but that seems unnecessary, and I suggest it's removed.
Ok.
BTW, is it worth filing this in JBS?
Cheers,
Yasumasa
> cheers,
> /Per
>
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> cheers,
>>> Per
>>>
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/dax-z/
>>>>
>>>> If it can be accepted, I will file it in JBS and propose a CSR.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] https://www.kernel.org/doc/Documentation/filesystems/dax.txt
>>
More information about the hotspot-gc-dev
mailing list