Suspected regression: fix for 6735255 causes delay in GC of ZipFile InputStreams, increase in heap demand

Xueming Shen xueming.shen at oracle.com
Thu Apr 14 21:48:24 UTC 2011


Steve, Neil,

As we discussed in previous emails that Neil's last patch (to put the 
inflater into the "streams" as
the value of the map) does solve the FinalizeInflater failure issue 
(provides a "orderly" final cleanup
between the "stream" and the "inflater". However to leave 
releaseInfalter() to be solely driven
by GC might delay the inflater to be cached/re-used in a timely manner, 
even in scenario that
those streams get closed explicitly by the invoker. Ideally it is 
desirable that the inflater can be
re-used/put back into the cache when ZFIIS.close() gets invoked explicitly.

It appears it might be too hard, if not impossible, to arrange the 
releaseInflater() to be invoked from
both clearStaleStreams()/staleStreamQueue and ZFIIS.close(), given the 
non-guarantee, by the VM/GC,
of the timing/order among the clear up of weak reference objects, 
enqueue them and the invocation of
finalize(). So I would suggest a relatively simple solution as showed in 
webrev at

http://cr.openjdk.java.net/~sherman/7031076/webrev/src/share/classes/java/util/zip/ZipFile.java.sdiff.html

(1) simply use the WeakHashMap to keep the <InputStream, Inflater> pair, 
we don't try
      to release/put the used inflater back to cache, if the ZFIIS 
stream does not get closed explicitly
      (its underlying ZFIS stream and the inflater will then get 
finalized by themself).

(2) release the inflater back to cache in ZFIIS.close(), only if the 
inflater is sill "alive" in the
      "streams", which guarantees that the inflater is not being 
finalized and safe to put back
      to cache (so long as we have the weak reference -> inflater pair 
in "streams", it does not
      matter if the reference to ZFIIS object is cleared or not). In 
most cases, I would expect
      when the ZFIIS.close() is invoked, both ZFIIS object and its 
inflater are still reachable, so
      the inflater gets re-used as expected. In "weird" scenario like 
what we have in
      FinalizeInflater, the ZFIIS object and its outer wrapper object 
are being finalized when
      ZFIIS.close() gets invoked (from wrapper.finalize()), there is 
possibility that the inflater might
      not be put back to cache, if the weak reverence->inflater pair has 
been expunged out of the
      "streams" already (because the weak reference to ZFIIS might have 
been cleared, so its
      entry in "streams" can be expunged any time, by any thread. At 
this moment, the inflater
      object might be declared as "un-reachable" and no longer safe to 
put back into cache), but
      I think we can live with this "missing".

(3) I realized that we actually are not removing the closed ZFIIS 
objects (only the ZFIS objects
      are removed when closed) from "streams" even though those input 
streams are closed
      gracefully by the invoker (as preferred, suggested, desired), 
which means all those ZFIIS
      streams will stay in "streams" (strong referenced in current 
implementation), no matter
      whether or not they get closed explicitly, until  the ZipFile 
object itself gets closed/finalized.
      This might contribute to the problem Neil observed at first place 
(the ZipFile InputStreams
      accumulat in heap), actually I do have another incident report 
#7033523 which I have
      closed as the dup of #7031076.

Opinion? anything I overlooked/missed?

-Sherman



On 04/13/2011 08:25 AM, Steve Poole wrote:
> On 13/04/11 16:19, Steve Poole wrote:
>> On 12/04/11 20:33, Xueming Shen wrote:
>>> Hi Neil,
>>
>>
>> Hi Sherman ,  Neil is out on vacation so  I will do my best to stand 
>> in for him.
>>>
>>> (1) I believe it would be better to keep the synchronization lock 
>>> for get/releaseInfalter()
>>>      "local" instead of using the "global" ZipFile.this, which I 
>>> agree is "simple". But it also
>>>      means each/every time when you release the used inflater back 
>>> to cache, ZipFile.this
>>>      has to be held and any possible/potential read operation on 
>>> ZipFile from other thead/
>>>      inputstream has to wait ("get" seems fine, the current impl 
>>> holds the ZipFile.this
>>>      anyway before reach the getInflater()).
>> Ok - I agree - seems sensible.  (Though I reserve the right to 
>> contradict myself later)
>>>
>>> (2) The "resource" Infalter mainly holds is the native memory block 
>>> it alloc-ed at native
>>>      for zlib, which is not in the Java heap, so I doubt making it 
>>> "softly" really helps for GC.
>>>      Sure, cleanup of those "objects" themself is a plus, but I'm 
>>> not sure if it is really worth
>>>      using SoftReference in this case (it appears you are now 
>>> invoking clearStaleStreams()
>>>      from the finalize()).
>> I'd like to keep this in -  not all implementations of zlib are equal 
>> in where they allocate memory.
>>>
>>> (3) The releaseInfalter() is now totally driven from 
>>> clearStaleStreams()/staleStreamQueue,
>>>      which is under full control of GC. If GC does not kick in for 
>>> hours/days, infalters can never
>>>      be put back into its cache after use, even the applications 
>>> correctly/promptly close those
>>>      streams after use. And when the GC kicks in, we might see 
>>> "bunch" (hundreds, thousands)
>>>      of inflaters get pushed into cache. The approach does solve the 
>>> "timing" issue that got
>>>      us here, but it also now has this "native memory cache" 
>>> mechanism totally under the
>>>      control of/driven by the GC, the java heap management 
>>> mechanism, which might not be
>>>      a preferred scenario. Just wonder if we can have a better 
>>> choice, say with this GC-driven
>>>      as the backup cleanup and meanwhile still have the 
>>> ZFIIS.close() to do something to safely
>>>      put the used inflater back to cache promptly. To put the 
>>> infalter as the value of the "streams"
>>>      map appears to be a good idea/start, now the "infalter" will 
>>> not be targeted for finalized
>>>      until the entry gets cleaned from the map, in which might in 
>>> fact provide us a sort of
>>>      "orderly" (between the "stream" and its "inflater") clearup 
>>> that the GC/finalizer can't
>>>      guarantee. We still have couple days before we hit the 
>>> "deadline" (to get this one in), so it
>>>      might worth taking some time on this direction?
>>>
>
> Dang! - pressed send when I meant save.   I understand your comments - 
> on the face of it I agree with what you're suggesting - let me think 
> it through some more though.
>> What is this "deadline"  you are talking about?
>>
>>> -Sherman
>>>
>>>
>>>
>>>
>>>
>>> On 04/11/2011 05:15 AM, Neil Richards wrote:
>>>> On Sun, 2011-04-10 at 18:28 +0100, Neil Richards wrote:
>>>>> With releaseInflater() being driven from the cleanup of stale 
>>>>> 'streams'
>>>>> entries, it no longer needs to be called from ZFIIS.close(), so,
>>>>> instead, only Inflater.reset() is called from there (providing the
>>>>> inflater object has not explicitly been ended) so that it releases 
>>>>> the
>>>>> buffer it has been holding.
>>>> Actually, I have a slight change of heart on this aspect.
>>>>
>>>> Because close() may be driven from a finalize() method in user code 
>>>> (as
>>>> is done in the InflaterFinalizer test case), I believe it is possible
>>>> (albeit unlikely) for an inflater object to have been reclaimed from
>>>> 'streams' (by a call to clearStaleStreams()), put into the inflater
>>>> cache, retrieved from there (by another thread creating an input 
>>>> stream)
>>>> and having started to be used by that other stream at the point at 
>>>> which
>>>> the code in close() is run.
>>>>
>>>> (This is because weak references will be cleared, and *may* be 
>>>> queued on
>>>> the ReferenceQueue, prior to finalization).
>>>>
>>>> Because of this, it's not actually entirely safe now to call 
>>>> inf.reset()
>>>> from ZFIIS.close().
>>>>
>>>> Instead, I have added additional calls to clearStaleStreams() from the
>>>> finalize() methods of both InputStream implementations in the latest
>>>> (hopefully in the meaning of "last") changeset included below.
>>>>
>>>> As always, please get back to me with any comments, questions or
>>>> suggestions on this,
>>>> Thanks,
>>>> Neil
>>>>
>>>
>>
>




More information about the core-libs-dev mailing list