Trimming ArrayList and HashMap During GC Copy
Nathan Reynolds
numeralnathan at gmail.com
Thu Jan 26 20:22:56 UTC 2023
> This could severely affect performance. One might create an ArrayList and
> use ArrayList.ensureCapacity before adding a large number of elements, to
> reduce incremental resizing of the internal array. Then you propose the GC
> can come along and stomp on your capacity setting.
Yes, I am aware of this. This is why I am proposing 3 modes: off, idle,
and always. The default mode will be off until enough performance testing
shows that idle is good for most programs. I realize that some programs
may need to turn it off.
In idle mode, the question is: in what case can GC trim the ArrayList while
the Java program is calling ArrayList.ensureCapacity()? The Java program is
by definition idle, so it isn't calling ArrayList.ensureCapacity().
In always mode, this is the exact thing I am concerned about. Please see
my "Notes" section in a previous email today that discusses this. I doubt
that always mode will be viable without additional work.
> One can use ArrayList.trimToSize to eliminate excess capacity.
Either the program code has to call trimToSize() itself, or we would have
to add synchronization to ArrayList so that something else could call it
safely. That synchronization would incur a huge performance cost.
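For illustration, here is a minimal sketch of what "the program code calls trimToSize()" looks like today. The class and method names are made up for this example. It is only safe because the same thread that fills the list also trims it.

```java
import java.util.ArrayList;

public class TrimToSizeExample {
    // Hypothetical helper: reserve generous capacity, add a few elements,
    // then trim on the same thread that owns the list.
    static ArrayList<Integer> buildAndTrim(int reserved, int actual) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(reserved);   // reserve room up front to avoid incremental resizing
        for (int i = 0; i < actual; i++) {
            list.add(i);
        }
        // No other thread touches the list here, so no synchronization is needed.
        list.trimToSize();               // drop the unused capacity
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = buildAndTrim(1_000_000, 10);
        // The contents are unchanged; only the hidden capacity differs.
        System.out.println(list.size());
    }
}
```

Nothing outside the owning code can make this call safely on an unsynchronized ArrayList, which is the gap the GC-based approach is meant to fill.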
The beauty of trimming in GC relocation is that the synchronization is
taken care of by the GC relocation algorithm. For Serial, Parallel, and
G1, the relocation happens during program pause and the pause is the
synchronization. For ZGC, the relocation algorithm is executed by GC and
Java threads. The relocation algorithm provides the synchronization. See
my previous email today on how this relocation algorithm needs to be
tweaked.
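To make the idea concrete, here is a toy model in plain Java. It is not HotSpot code and not the actual relocation algorithm. ToyList and relocate() are hypothetical names, and the model assumes the stop-the-world case where nothing else runs while the copy happens.

```java
import java.util.Arrays;

public class RelocationTrimSketch {
    // Hypothetical stand-in for ArrayList: a backing array plus a logical size.
    static final class ToyList {
        Object[] elementData;
        int size;

        ToyList(int capacity) {
            elementData = new Object[capacity];
        }

        void add(Object o) {
            if (size == elementData.length) {
                // Grow when full, the way a list normally would.
                elementData = Arrays.copyOf(elementData, Math.max(1, size * 2));
            }
            elementData[size++] = o;
        }
    }

    // Models the collector copying the backing array to its new location.
    // Assumption: the program is paused, so the pause itself is the synchronization.
    static void relocate(ToyList list, boolean trim) {
        int newLength = trim ? list.size : list.elementData.length;
        // The copy has to happen anyway; trimming only copies fewer slots.
        list.elementData = Arrays.copyOf(list.elementData, newLength);
    }

    public static void main(String[] args) {
        ToyList list = new ToyList(100);
        list.add("a");
        list.add("b");
        list.add("c");
        relocate(list, true);
        System.out.println(list.elementData.length); // 3
    }
}
```

After the trimmed copy the list still works. The next add() simply pays for one grow, which is the performance cost discussed elsewhere in this thread.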
> HashMap performance depends on having excess capacity to reduce
> collisions. The HashMap constructor has an argument that is a hint of the
> expected number of entries. If your HashMaps are overly sparse you can
> either use a smaller initial hint or copy into a new "right-sized" map.
HashMap uses a load factor to determine when the internal hashtable needs
to be increased in size to reduce collisions. The trimming follows this
load factor and won't trim the hashtable smaller than what the load factor
dictates.
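As a rough sketch of that floor (my own illustration, not code from HashMap or from the GC), the smallest table length the load factor allows can be computed as below. It assumes power-of-two table lengths, as HashMap uses, and the method names are hypothetical.

```java
public class LoadFactorFloor {
    // Smallest power-of-two table length that holds `entries` without
    // exceeding the load factor. Capped at 1 << 30 to avoid int overflow.
    static int minTableLength(int entries, float loadFactor) {
        int length = 1;
        while (length < (1 << 30) && length * loadFactor < entries) {
            length <<= 1;
        }
        return length;
    }

    // Trimming never grows the table and never goes below the floor.
    static int trimmedLength(int currentLength, int entries, float loadFactor) {
        return Math.min(currentLength, minTableLength(entries, loadFactor));
    }

    public static void main(String[] args) {
        System.out.println(minTableLength(100, 0.75f));      // 256
        System.out.println(minTableLength(12, 0.75f));       // 16
        System.out.println(trimmedLength(1024, 100, 0.75f)); // 256
    }
}
```

So a 1024-slot table holding 100 entries at the default 0.75 load factor could be trimmed to 256 slots, but not to 128, because 128 * 0.75 = 96 is fewer than 100.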
> I don't know how the GC could possibly reduce a HashMap, since the
> positions of entries depend on information that the GC doesn't have nor
> can compute.
Please see my previous email today on how this works.
On Tue, Jan 24, 2023 at 4:31 AM Kim Barrett <kim.barrett at oracle.com> wrote:
> > On Jan 23, 2023, at 12:44 PM, Nathan Reynolds <numeralnathan at gmail.com> wrote:
> >
> > > 1. Such a change would have user observable differences in behaviour,
> > > which could introduce bugs in user code, due to the optimization.
> >
> > How is this user observable? The Object[] is buried inside an ArrayList
> > or HashMap. This idea is not touching other Object[]'s outside a
> > collection.
>
> This could severely affect performance. One might create an ArrayList and
> use ArrayList.ensureCapacity before adding a large number of elements, to
> reduce incremental resizing of the internal array. Then you propose the GC
> can come along and stomp on your capacity setting.
>
> One can use ArrayList.trimToSize to eliminate excess capacity.
>
> HashMap performance depends on having excess capacity to reduce collisions.
> The HashMap constructor has an argument that is a hint of the expected
> number of entries. If your HashMaps are overly sparse you can either use a
> smaller initial hint or copy into a new "right-sized" map.
>
> I don't know how the GC could possibly reduce a HashMap, since the
> positions of entries depend on information that the GC doesn't have nor
> can compute.
>
> > I suppose a performance impact from having to grow is somewhat
> > observable. This was noted in my original email. However, growing is not
> > functionally observable.
>
> Not functionally observable != not important for users.
>
>