Suspected regression: fix for 6735255 causes delay in GC of ZipFile InputStreams, increase in heap demand
Xueming Shen
xueming.shen at oracle.com
Thu Apr 14 21:48:24 UTC 2011
Steve, Neil,
As we discussed in previous emails that Neil's last patch (to put the
inflater into the "streams" as
the value of the map) does solve the FinalizeInflater failure issue
(provides a "orderly" final cleanup
between the "stream" and the "inflater". However to leave
releaseInfalter() to be solely driven
by GC might delay the inflater to be cached/re-used in a timely manner,
even in scenario that
those streams get closed explicitly by the invoker. Ideally it is
desirable that the inflater can be
re-used/put back into the cache when ZFIIS.close() gets invoked explicitly.
It appears it might be too hard, if not impossible, to arrange the
releaseInflater() to be invoked from
both clearStaleStreams()/staleStreamQueue and ZFIIS.close(), given the
non-guarantee, by the VM/GC,
of the timing/order among the clear up of weak reference objects,
enqueue them and the invocation of
finalize(). So I would suggest a relatively simple solution as showed in
webrev at
http://cr.openjdk.java.net/~sherman/7031076/webrev/src/share/classes/java/util/zip/ZipFile.java.sdiff.html
(1) simply use the WeakHashMap to keep the <InputStream, Inflater> pair,
we don't try
to release/put the used inflater back to cache, if the ZFIIS
stream does not get closed explicitly
(its underlying ZFIS stream and the inflater will then get
finalized by themself).
(2) release the inflater back to cache in ZFIIS.close(), only if the
inflater is sill "alive" in the
"streams", which guarantees that the inflater is not being
finalized and safe to put back
to cache (so long as we have the weak reference -> inflater pair
in "streams", it does not
matter if the reference to ZFIIS object is cleared or not). In
most cases, I would expect
when the ZFIIS.close() is invoked, both ZFIIS object and its
inflater are still reachable, so
the inflater gets re-used as expected. In "weird" scenario like
what we have in
FinalizeInflater, the ZFIIS object and its outer wrapper object
are being finalized when
ZFIIS.close() gets invoked (from wrapper.finalize()), there is
possibility that the inflater might
not be put back to cache, if the weak reverence->inflater pair has
been expunged out of the
"streams" already (because the weak reference to ZFIIS might have
been cleared, so its
entry in "streams" can be expunged any time, by any thread. At
this moment, the inflater
object might be declared as "un-reachable" and no longer safe to
put back into cache), but
I think we can live with this "missing".
(3) I realized that we actually are not removing the closed ZFIIS
objects (only the ZFIS objects
are removed when closed) from "streams" even though those input
streams are closed
gracefully by the invoker (as preferred, suggested, desired),
which means all those ZFIIS
streams will stay in "streams" (strong referenced in current
implementation), no matter
whether or not they get closed explicitly, until the ZipFile
object itself gets closed/finalized.
This might contribute to the problem Neil observed at first place
(the ZipFile InputStreams
accumulat in heap), actually I do have another incident report
#7033523 which I have
closed as the dup of #7031076.
Opinion? anything I overlooked/missed?
-Sherman
On 04/13/2011 08:25 AM, Steve Poole wrote:
> On 13/04/11 16:19, Steve Poole wrote:
>> On 12/04/11 20:33, Xueming Shen wrote:
>>> Hi Neil,
>>
>>
>> Hi Sherman , Neil is out on vacation so I will do my best to stand
>> in for him.
>>>
>>> (1) I believe it would be better to keep the synchronization lock
>>> for get/releaseInfalter()
>>> "local" instead of using the "global" ZipFile.this, which I
>>> agree is "simple". But it also
>>> means each/every time when you release the used inflater back
>>> to cache, ZipFile.this
>>> has to be held and any possible/potential read operation on
>>> ZipFile from other thead/
>>> inputstream has to wait ("get" seems fine, the current impl
>>> holds the ZipFile.this
>>> anyway before reach the getInflater()).
>> Ok - I agree - seems sensible. (Though I reserve the right to
>> contradict myself later)
>>>
>>> (2) The "resource" Infalter mainly holds is the native memory block
>>> it alloc-ed at native
>>> for zlib, which is not in the Java heap, so I doubt making it
>>> "softly" really helps for GC.
>>> Sure, cleanup of those "objects" themself is a plus, but I'm
>>> not sure if it is really worth
>>> using SoftReference in this case (it appears you are now
>>> invoking clearStaleStreams()
>>> from the finalize()).
>> I'd like to keep this in - not all implementations of zlib are equal
>> in where they allocate memory.
>>>
>>> (3) The releaseInfalter() is now totally driven from
>>> clearStaleStreams()/staleStreamQueue,
>>> which is under full control of GC. If GC does not kick in for
>>> hours/days, infalters can never
>>> be put back into its cache after use, even the applications
>>> correctly/promptly close those
>>> streams after use. And when the GC kicks in, we might see
>>> "bunch" (hundreds, thousands)
>>> of inflaters get pushed into cache. The approach does solve the
>>> "timing" issue that got
>>> us here, but it also now has this "native memory cache"
>>> mechanism totally under the
>>> control of/driven by the GC, the java heap management
>>> mechanism, which might not be
>>> a preferred scenario. Just wonder if we can have a better
>>> choice, say with this GC-driven
>>> as the backup cleanup and meanwhile still have the
>>> ZFIIS.close() to do something to safely
>>> put the used inflater back to cache promptly. To put the
>>> infalter as the value of the "streams"
>>> map appears to be a good idea/start, now the "infalter" will
>>> not be targeted for finalized
>>> until the entry gets cleaned from the map, in which might in
>>> fact provide us a sort of
>>> "orderly" (between the "stream" and its "inflater") clearup
>>> that the GC/finalizer can't
>>> guarantee. We still have couple days before we hit the
>>> "deadline" (to get this one in), so it
>>> might worth taking some time on this direction?
>>>
>
> Dang! - pressed send when I meant save. I understand your comments -
> on the face of it I agree with what you're suggesting - let me think
> it through some more though.
>> What is this "deadline" you are talking about?
>>
>>> -Sherman
>>>
>>>
>>>
>>>
>>>
>>> On 04/11/2011 05:15 AM, Neil Richards wrote:
>>>> On Sun, 2011-04-10 at 18:28 +0100, Neil Richards wrote:
>>>>> With releaseInflater() being driven from the cleanup of stale
>>>>> 'streams'
>>>>> entries, it no longer needs to be called from ZFIIS.close(), so,
>>>>> instead, only Inflater.reset() is called from there (providing the
>>>>> inflater object has not explicitly been ended) so that it releases
>>>>> the
>>>>> buffer it has been holding.
>>>> Actually, I have a slight change of heart on this aspect.
>>>>
>>>> Because close() may be driven from a finalize() method in user code
>>>> (as
>>>> is done in the InflaterFinalizer test case), I believe it is possible
>>>> (albeit unlikely) for an inflater object to have been reclaimed from
>>>> 'streams' (by a call to clearStaleStreams()), put into the inflater
>>>> cache, retrieved from there (by another thread creating an input
>>>> stream)
>>>> and having started to be used by that other stream at the point at
>>>> which
>>>> the code in close() is run.
>>>>
>>>> (This is because weak references will be cleared, and *may* be
>>>> queued on
>>>> the ReferenceQueue, prior to finalization).
>>>>
>>>> Because of this, it's not actually entirely safe now to call
>>>> inf.reset()
>>>> from ZFIIS.close().
>>>>
>>>> Instead, I have added additional calls to clearStaleStreams() from the
>>>> finalize() methods of both InputStream implementations in the latest
>>>> (hopefully in the meaning of "last") changeset included below.
>>>>
>>>> As always, please get back to me with any comments, questions or
>>>> suggestions on this,
>>>> Thanks,
>>>> Neil
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list