Suspected regression: fix for 6735255 causes delay in GC of ZipFile InputStreams, increase in heap demand

Neil Richards neil.richards at ngmr.net
Wed Mar 30 19:53:30 UTC 2011


Hi Sherman,
Thanks for your review and comments on this.

On Tue, 2011-03-29 at 12:05 -0700, Xueming Shen wrote:
> Hi Neil,
> 
> It appears to be a "regression" in scenario you described (user 
> application never close the input stream after use and the ZipFile
> instance being retained during the lifetime of the process). The
> proposed approach seems to solve this particular problem.
> 
> Here are my comments regarding the patch.
> 
> (1) ZipFileInflaterInputStream.close() now synchronizes on 
>       Zipfile.this, is it really necessary?

Synchronization is used so that an update to 'isClosed' on one thread is
seen by another.

Adding finalization (ie. a call from finalize() to close()) effectively
introduces a new thread into the mix, namely the finalization thread.
Whilst you're correct in saying that, in general, InputStream gives no
thread-safe guarantees, in this particular case I believe it is valid to
make sure updates to this field are seen across threads, to prevent the
logic of close() being run twice.

I chose to synchronize on ZipFile.this merely for consistency - ie.
because that's what ZipFileInputStream (the other InputStream impl in
ZipFile) does. 

I suppose I felt the risks of creating some kind of obscure deadlock
scenario was minimised by copying the existing pattern. 

I can be easily swayed if you think there's a better object to
synchronize on, that doesn't run an increased risk of deadlock.

> (2) Is there any particular reason that we don't want to "reuse" the 
>     Inflater(), if the close comes from the finalize() in your testing 
>     evn? One thing we noticed before is that the moment the
>     InputStream gets finalized, the Inflater embedded might have been
>     finalized already, this is the reason why we have a "inf.ended()" 
>     check in releaseInflater().

The ZipFileInflaterInputStream object's finalize() method is called by
the finalization thread, after it has been added to the finalization
queue by GC. 

If *it* has been added to the finalization queue (because it's eligible
for GC), then so will the Inflater object it is referencing.

As finalization gives no guarantees as to the order in which objects are
finalized, regardless of whether finalize() has been called on the
Inflater object, it is nevertheless the case that its method *will* be
called at some point, so the Inflater object *will* be ended.

If it has been put back into the inflater cache (because ZFIIS's
finalize happened to be called first), then this ended inflater will lie
in wait for another ZFIIS to use it, which would result in a
particularly unexpected exception being thrown.

In general, I believe it to be quite frowned on to "resurrect" objects
which are going through finalization (including those queued up for
finalization) and place them back into the live set of object, as weird
behaviour often ensues if one does.

(Especially as, iirc, finalize() is only called once per object,
regardless of whether it is "dug back up" in this fashion). 

> (3) The attached test case does "f.close()" in its finally block, I 
> guess it should be f.delete()?

When I wrote the testcase, I relied upon File.deleteOnExit() to take
care of cleaning up the temporary file(s) created.

However, I recall now that I don't think it guarantees deletion on
abnormal termination.

I agree that calling f.delete() instead of f.close() would be a good
improvement to the testcase.

Please find below an updated changeset, with this change made and the
bug number filled in.

Hope this helps to clarify things,
Thanks again,
Neil

-- 
Unless stated above:
IBM email: neil_richards at uk.ibm.com
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


# HG changeset patch
# User Neil Richards <neil.richards at ngmr.net>, <neil_richards at uk.ibm.com>
# Date 1300289208 0
# Branch zip-heap
# Node ID 8ab5aa0669a176d95502d9cf68027ae9679ccab2
# Parent  554adcfb615e63e62af530b1c10fcf7813a75b26
7031076: Retained ZipFile InputStreams increase heap demand
Summary: Allow unreferenced ZipFile InputStreams to be finalized, GC'd
Contributed-by: <neil.richards at ngmr.net>

diff --git a/src/share/classes/java/util/zip/ZipFile.java b/src/share/classes/java/util/zip/ZipFile.java
--- a/src/share/classes/java/util/zip/ZipFile.java
+++ b/src/share/classes/java/util/zip/ZipFile.java
@@ -30,11 +30,14 @@
 import java.io.IOException;
 import java.io.EOFException;
 import java.io.File;
+import java.lang.ref.ReferenceQueue;
+import java.lang.ref.WeakReference;
 import java.nio.charset.Charset;
 import java.util.Vector;
 import java.util.Enumeration;
 import java.util.Set;
 import java.util.HashSet;
+import java.util.Iterator;
 import java.util.NoSuchElementException;
 import java.security.AccessController;
 import sun.security.action.GetPropertyAction;
@@ -315,7 +318,16 @@
     private static native void freeEntry(long jzfile, long jzentry);
 
     // the outstanding inputstreams that need to be closed.
-    private Set<InputStream> streams = new HashSet<>();
+    private final Set<WeakReference<InputStream>> streams = new HashSet<>();
+    private final ReferenceQueue<? super InputStream> staleStreamQueue =
+        new ReferenceQueue<>();
+
+    private void clearStaleStreams() {
+        Object staleStream;
+        while (null != (staleStream = staleStreamQueue.poll())) {
+            streams.remove(staleStream);
+        }
+    }
 
     /**
      * Returns an input stream for reading the contents of the specified
@@ -339,6 +351,7 @@
         ZipFileInputStream in = null;
         synchronized (this) {
             ensureOpen();
+            clearStaleStreams();
             if (!zc.isUTF8() && (entry.flag & EFS) != 0) {
                 jzentry = getEntry(jzfile, zc.getBytesUTF8(entry.name), false);
             } else {
@@ -351,51 +364,19 @@
 
             switch (getEntryMethod(jzentry)) {
             case STORED:
-                streams.add(in);
+                streams.add(
+                        new WeakReference<InputStream>(in, staleStreamQueue));
                 return in;
             case DEFLATED:
-                final ZipFileInputStream zfin = in;
                 // MORE: Compute good size for inflater stream:
                 long size = getEntrySize(jzentry) + 2; // Inflater likes a bit of slack
                 if (size > 65536) size = 8192;
                 if (size <= 0) size = 4096;
-                InputStream is = new InflaterInputStream(zfin, getInflater(), (int)size) {
-                    private boolean isClosed = false;
-
-                    public void close() throws IOException {
-                        if (!isClosed) {
-                            super.close();
-                            releaseInflater(inf);
-                            isClosed = true;
-                        }
-                    }
-                    // Override fill() method to provide an extra "dummy" byte
-                    // at the end of the input stream. This is required when
-                    // using the "nowrap" Inflater option.
-                    protected void fill() throws IOException {
-                        if (eof) {
-                            throw new EOFException(
-                                "Unexpected end of ZLIB input stream");
-                        }
-                        len = this.in.read(buf, 0, buf.length);
-                        if (len == -1) {
-                            buf[0] = 0;
-                            len = 1;
-                            eof = true;
-                        }
-                        inf.setInput(buf, 0, len);
-                    }
-                    private boolean eof;
-
-                    public int available() throws IOException {
-                        if (isClosed)
-                            return 0;
-                        long avail = zfin.size() - inf.getBytesWritten();
-                        return avail > (long) Integer.MAX_VALUE ?
-                            Integer.MAX_VALUE : (int) avail;
-                    }
-                };
-                streams.add(is);
+                InputStream is = 
+                    new ZipFileInflaterInputStream(in, getInflater(), 
+                            (int)size);
+                streams.add(
+                        new WeakReference<InputStream>(is, staleStreamQueue));
                 return is;
             default:
                 throw new ZipException("invalid compression method");
@@ -403,6 +384,58 @@
         }
     }
 
+    private class ZipFileInflaterInputStream extends InflaterInputStream {
+        private boolean isClosed = false;
+        private boolean eof = false;
+        private final ZipFileInputStream zfin;
+        private boolean inFinalizer = false;
+
+        ZipFileInflaterInputStream(ZipFileInputStream zfin, Inflater inf,
+                int size) {
+            super(zfin, inf, size);
+            this.zfin = zfin;
+        }
+
+        public void close() throws IOException {
+            synchronized (ZipFile.this) {
+                if (!isClosed) {
+                    super.close();
+                    if (false == inFinalizer)
+                        releaseInflater(inf);
+                    isClosed = true;
+                }
+            }
+        }
+        // Override fill() method to provide an extra "dummy" byte
+        // at the end of the input stream. This is required when
+        // using the "nowrap" Inflater option.
+        protected void fill() throws IOException {
+            if (eof) {
+                throw new EOFException("Unexpected end of ZLIB input stream");
+            }
+            len = in.read(buf, 0, buf.length);
+            if (len == -1) {
+                buf[0] = 0;
+                len = 1;
+                eof = true;
+            }
+            inf.setInput(buf, 0, len);
+        }
+
+        public int available() throws IOException {
+            if (isClosed)
+                return 0;
+            long avail = zfin.size() - inf.getBytesWritten();
+            return (avail > (long) Integer.MAX_VALUE ? 
+                    Integer.MAX_VALUE : (int) avail);
+        }
+
+        protected void finalize() throws IOException {
+            inFinalizer = true;
+            close();
+        }
+    }
+
     /*
      * Gets an inflater from the list of available inflaters or allocates
      * a new one.
@@ -543,11 +576,14 @@
         synchronized (this) {
             closeRequested = true;
 
-            if (streams.size() !=0) {
-                Set<InputStream> copy = streams;
-                streams = new HashSet<>();
-                for (InputStream is: copy)
+            Iterator<WeakReference<InputStream>> streamsIterator = 
+                streams.iterator();
+            while (streamsIterator.hasNext()) {
+                InputStream is = streamsIterator.next().get();
+                if (null != is) {
                     is.close();
+                }
+                streamsIterator.remove();
             }
 
             if (jzfile != 0) {
@@ -684,9 +720,12 @@
                     freeEntry(ZipFile.this.jzfile, jzentry);
                     jzentry = 0;
                 }
-                streams.remove(this);
             }
         }
+
+        protected void finalize() {
+            close();
+        }
     }
 
 
diff --git a/test/java/util/zip/ZipFile/ClearStaleZipFileInputStreams.java b/test/java/util/zip/ZipFile/ClearStaleZipFileInputStreams.java
new file mode 100644
--- /dev/null
+++ b/test/java/util/zip/ZipFile/ClearStaleZipFileInputStreams.java
@@ -0,0 +1,148 @@
+/*
+ * Copyright (c) 2011 Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ * 
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.
+ * 
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ * 
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ * 
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+/* 
+ * Portions Copyright (c) 2011 IBM Corporation 
+ */
+
+/*
+ * @test
+ * @bug 7031076
+ * @summary Allow stale InputStreams from ZipFiles to be GC'd
+ * @author Neil Richards <neil.richards at ngmr.net>, <neil_richards at uk.ibm.com>
+ */
+import java.lang.ref.ReferenceQueue;
+import java.lang.ref.WeakReference;
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.InputStream;
+import java.util.Enumeration;
+import java.util.HashSet;
+import java.util.Random;
+import java.util.Set;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipFile;
+import java.util.zip.ZipOutputStream;
+
+public class ClearStaleZipFileInputStreams {
+    private static final int ZIP_ENTRY_NUM = 5;
+
+    private static final byte[][] data;
+
+    static {
+        data = new byte[ZIP_ENTRY_NUM][];
+        Random r = new Random();
+        for (int i = 0; i < ZIP_ENTRY_NUM; i++) {
+            data[i] = new byte[1000];
+            r.nextBytes(data[i]);
+        }
+    }
+
+    private static File createTestFile(int compression) throws Exception {
+        File tempZipFile = 
+            File.createTempFile("test-data" + compression, ".zip");
+        tempZipFile.deleteOnExit();
+
+        ZipOutputStream zos = 
+            new ZipOutputStream(new FileOutputStream(tempZipFile));
+        zos.setLevel(compression);
+
+        try {
+            for (int i = 0; i < ZIP_ENTRY_NUM; i++) {
+                String text = "Entry" + i;
+                ZipEntry entry = new ZipEntry(text);
+                zos.putNextEntry(entry);
+                try {
+                    zos.write(data[i], 0, data[i].length);
+                } finally {
+                    zos.closeEntry();
+                }
+            }
+        } finally {
+            zos.close();
+        }
+
+        return tempZipFile;
+    }
+
+    private static void startGcInducingThread(final int sleepMillis) {
+        final Thread gcInducingThread = new Thread() {
+            public void run() {
+                while (true) {
+                    System.gc();
+                    try {
+                        Thread.sleep(sleepMillis);
+                    } catch (InterruptedException e) { }
+                }
+            }
+        };
+
+        gcInducingThread.setDaemon(true);
+        gcInducingThread.start();
+    }
+
+    public static void main(String[] args) throws Exception {
+        startGcInducingThread(500);
+        runTest(ZipOutputStream.DEFLATED);
+        runTest(ZipOutputStream.STORED);
+    }
+
+    private static void runTest(int compression) throws Exception {
+        ReferenceQueue<InputStream> rq = new ReferenceQueue<>();
+        
+        System.out.println("Testing with a zip file with compression level = "
+                + compression);
+        File f = createTestFile(compression);
+        try {
+            ZipFile zf = new ZipFile(f);
+            try {
+                Set<Object> refSet = createTransientInputStreams(zf, rq);
+
+                System.out.println("Waiting for 'stale' input streams from ZipFile to be GC'd ...");
+                System.out.println("(The test will hang on failure)");
+                while (false == refSet.isEmpty()) {
+                    refSet.remove(rq.remove());
+                }
+                System.out.println("Test PASSED.");
+                System.out.println();
+            } finally {
+                zf.close();
+            }
+        } finally {
+            f.delete();
+        }
+    }
+
+    private static Set<Object> createTransientInputStreams(ZipFile zf,
+            ReferenceQueue<InputStream> rq) throws Exception {
+        Enumeration<? extends ZipEntry> zfe = zf.entries();
+        Set<Object> refSet = new HashSet<>();
+
+        while (zfe.hasMoreElements()) {
+            InputStream is = zf.getInputStream(zfe.nextElement());
+            refSet.add(new WeakReference<InputStream>(is, rq));
+        }
+
+        return refSet;
+    }
+}





More information about the core-libs-dev mailing list