RFR: 8215159: Improve initial setup of system Properties

Martin Buchholz martinrb at google.com
Tue Dec 11 16:28:58 UTC 2018


HashMap sizing was a terrible design mistake, but too much software depends
on it, e.g.
https://google.github.io/guava/releases/23.0/api/docs/com/google/common/collect/Maps.html#newHashMapWithExpectedSize-int-

On Tue, Dec 11, 2018 at 1:26 AM Claes Redestad <claes.redestad at oracle.com>
wrote:

> Hi Peter,
>
> On 2018-12-11 10:02, Peter Levart wrote:
> > Hi Claes,
> >
> > Haven't checked all changes yet, although it looks like a
> > straightforward swap of Properties for HashMap for intermediary result,
> > but I noticed the following in SystemProps:
> >
> >   265             var cmdProps = new HashMap<String,
> > String>((vmProps.length / 2) + Raw.FIXED_LENGTH);
> >
> > The HashMap(int) constructor is different from Properties(int) in that
> > for the former, the argument represents the lower bound on the initial
> > size of the table (which is just a rounding of this parameter up to the
> > nearest power of 2). The threshold where the table is resized is
> > calculated as (initialCapacity rounded up to nearest power of 2) *
> > loadFactor. The default load factor is 0.75 which means that the table
> > will be resized in worst case after 3/4 * initialCapacity of elements
> > are inserted into it. In order to guarantee that the table is not
> > resized you have to pass (size * 4 + 2) / 3 to the HashMap constructor,
> > where size is the number of elements added...
> >
> > I hope I'm not misleading you, I just think this is how HashMap has been
> > from the beginning. Peeking at HashMap code (line 693) it seems that it
> > is doing that:
> >
> >              float ft = (float)newCap * loadFactor;
> >              newThr = (newCap < MAXIMUM_CAPACITY && ft <
> > (float)MAXIMUM_CAPACITY ?
> >                        (int)ft : Integer.MAX_VALUE);
> >
> > newCap above is holding the initialCapacity constructor parameter
> > rounded up to the nearest power of 2. newThr is the threshold at which
> > the resize occurs.
> >
> > The Properties(int) constuctor behaves differently as it passes the
> > parameter directly to the underlying ConcurrentHashMap, which says:
> >
> >       * @param initialCapacity The implementation performs internal
> >       * sizing to accommodate this many elements.
>
> Fun story, but I noticed this unfortunate difference when I was working
> on this very patch and brought it up in the team. I think most agrees
> the CHM behavior is the more intuitive and would have loved to
> consolidate to that behavior, but the message I've gotten is that it's
> probably too late to fix: Whichever way you go you get lots of little
> subtle changes to sizes that may lead to incompabilities in
> applications/tests that inadvertedly depend on the iteration order or
> HashMap etc..
>
> That said: Raw typically contains quite a few empty/null values that'll
> never be put in the map. Enough so that for the applications I've looked
> at the initialCapacity is already a fair bit higher than it needs to be
> to avoid resizing. Thus it made little sense to take the maximum
> possible size and adjust it up even further by factoring in the default
> load factor.
>
> (Unless you have a *lot* of properties coming in via command line, but I
> think we should optimize for the common cases...)
>
> Thanks!
>
> /Claes
>
> >
> > Regards, Peter
> >
> > On 12/10/18 10:17 PM, Claes Redestad wrote:
> >> Hi,
> >>
> >> by inverting the order in which the internal property maps are created,
> >> we avoid some classloading and get a slightly more efficient code
> >> execution profile in System.initPhase1.
> >>
> >> Webrev: http://cr.openjdk.java.net/~redestad/8215159/jdk.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8215159
> >>
> >> This reduces bytecode executed in initPhase1 from ~48k to ~36k. Not
> >> much by any measure, but minimizing System.initPhase1 is important since
> >> certain parts of the VM (JIT etc) are blocked from initializing until
> >> it's done.
> >>
> >> Thanks!
> >>
> >> /Claes
> >
>


More information about the core-libs-dev mailing list