RFR 7143743: (zipfs) Potential memory leak with zip provider

Jaikiran Pai jai.forums2013 at gmail.com
Sat Jan 11 09:51:25 UTC 2020


Can I please get a review and a sponsor for a patch which fixes
JDK-7143743[1]? The patch is hosted as a webrev at [2].

As noted in the JBS issue, a ZipFileSystem can continue to hold on to large
amounts of memory even after it has been closed. The root of the issue is
the "inodes" map maintained by the jdk.nio.zipfs.ZipFileSystem class. This
map holds "IndexNode" instances, which are added to, updated in, and removed
from the map as files are added, updated, and removed through the
filesystem. One such IndexNode subtype is jdk.nio.zipfs.ZipFileSystem$Entry,
and an instance of that type holds the underlying data of the file in a
byte[] member named "bytes".

Once the ZipFileSystem instance is closed, this "inodes" map isn't cleared,
so potentially large amounts of data can remain reachable through the
jdk.nio.zipfs.ZipFileSystem$Entry.bytes member(s).

The commit fixes the issue by simply clearing the "inodes" map in the
jdk.nio.zipfs.ZipFileSystem.close() method. I have checked the usages of
the "inodes" map, and from what I can see, each usage is guarded by an
"ensureOpen" check. That means that once the ZipFileSystem instance is
closed, the contents of the "inodes" map are no longer relevant, so
clearing it shouldn't cause any issues.
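To illustrate why the map contents are unneeded after close: any attempt to
use a zip filesystem after close() runs into those "ensureOpen" checks and
fails with a ClosedFileSystemException. The following small standalone
program (not part of the patch; class and file names are made up for the
demonstration) shows this:

```java
import java.net.URI;
import java.nio.file.ClosedFileSystemException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class ClosedZipFSCheck {
    public static void main(String[] args) throws Exception {
        // a throwaway zip file, used only for this demonstration
        Path zip = Files.createTempFile("closed-check", ".zip");
        Files.delete(zip); // "create" mode expects the target to not exist yet
        FileSystem fs = FileSystems.newFileSystem(
                URI.create("jar:" + zip.toUri()), Map.of("create", "true"));
        Path entry = fs.getPath("/hello.txt");
        Files.writeString(entry, "hello");
        fs.close();
        try {
            // any access after close() hits the ensureOpen check
            Files.readString(entry);
            System.out.println("unexpected: read succeeded after close");
        } catch (ClosedFileSystemException e) {
            System.out.println("ClosedFileSystemException after close, as expected");
        } finally {
            Files.deleteIfExists(zip);
        }
    }
}
```

Since no entry can be read or written once the filesystem is closed, the
IndexNode instances (and their byte[] data) kept in "inodes" can only act
as a leak at that point.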

Given the nature of this issue, I haven't included a jtreg test with this
change. However, I have run the existing ZipFSTester.java test case to make
sure no obvious regressions are introduced; that test passed with this
change.

A program (a slightly modified version of the one in the JBS issue) is
included at the end of this mail to reproduce the issue (and verify the
fix). To run it, pass the path to a directory containing large files as the
first argument and the path to a (non-existent) zip file to be created as
the second argument.

In my manual testing, I used a directory with 3 files totalling around
831MB:

$ pwd

/home/me/testdir

$ ls -lh

total 1702360
-rw-r--r--@ 1 jaikiran  184M Oct 29 09:52 a
-rw-r--r--@ 1 jaikiran  258M Oct  9 19:52 b
-rw-r--r--@ 1 jaikiran  368M Dec 26 08:30 c

$ du -sh

831M .

Running the program resulted in:

$ java ZipFSLeak.java /home/me/testdir/ before-fix.zip

Zipping /home/me/testdir/ to before-fix.zip
Copied a total of 849617826 bytes from /home/me/testdir/
Running some GC ...
Memory in use (after GC): 853071256

(notice that the memory in use at the end of the program, well after the
ZipFileSystem has been closed, stays extremely high, almost equal to the
total size of the files that were added to the ZipFileSystem).

Now after the fix, running the same program against the same directory
results in:

$ java ZipFSLeak.java /home/me/testdir/ after-fix.zip

Zipping /home/me/testdir/ to after-fix.zip
Copied a total of 849617826 bytes from /home/me/testdir/
Running some GC ...
Memory in use (after GC): 4769904

(notice the drastic improvement in the memory usage)

Following is the program used to reproduce the issue:


import java.io.IOException;
import java.net.URI;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;


public class ZipFSLeak {

    public static void main(String[] args) throws IOException {
        // first arg is a dir containing (preferably) large files
        final Path srcDir = FileSystems.getDefault().getPath(args[0]);
        // second arg is the path to the target zip which will be created
        // by this program
        final Path targetZip = FileSystems.getDefault().getPath(args[1]);
        System.out.println("Zipping " + srcDir + " to " + targetZip);

        final FileSystem zip = createZipFileSystem(targetZip, true);
        long totalCopiedSize = 0;
        try (final DirectoryStream<Path> dirStream = Files.newDirectoryStream(srcDir);
                var zipFS = zip) {

            final Path zipRoot = zipFS.getPath("/");
            for (Path path : dirStream) {
                Files.copy(path, zipRoot.resolve(path.getFileName().toString()));
                totalCopiedSize += path.toFile().length();
            }
        }

        System.out.println("Copied a total of " + totalCopiedSize
                + " bytes from " + srcDir);
        System.out.println("Running some GC ...");
        // run some GC
        for (int i = 0; i < 10; i++) {
            Runtime.getRuntime().gc();
        }
        System.out.println("Memory in use (after GC): "
                + (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()));
    }

    private static FileSystem createZipFileSystem(Path path, boolean create)
            throws IOException {
        final URI uri = URI.create("jar:file:" + path.toUri().getPath());

        final Map<String, String> env = new HashMap<>();
        if (create) {
            env.put("create", "true");
        }
        return FileSystems.newFileSystem(uri, env);
    }
}


[1] https://bugs.openjdk.java.net/browse/JDK-7143743

[2] https://cr.openjdk.java.net/~jpai/webrev/7143743/1/webrev/

P.S: This affects all prominent Java versions since Java 8 (the earliest
one I tested) and not just the latest upstream.

-Jaikiran
