RFR: 8293972: runtime/NMT/NMTInitializationTest.java#default_long-off failed with "Suspiciously long bucket chains in lookup table." [v2]

Fri Jun 23 22:03:03 UTC 2023

On Thu, 22 Jun 2023 16:40:30 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Looking back at this, I wonder whether we should reduce the table size. 64k for this lookup table is pretty big. The MallocSizeTable itself, that arguably holds a lot more allocations, is just have as wide.

Tested with a smaller table size **4093**, about half of 8191:

Using original table size and Marsenne prime:

macOS data set:
  avg1 - max chain: 5.129, empties: 1897.891, ranges: 4.139, std_devs: 0.813
  avg2 - max chain: 4.228, empties: 1712.020, ranges: 3.238, std_devs: 0.686
  avg3 - max chain: 4.743, empties: 2030.426, ranges: 3.752, std_devs: 0.762

Linux data set:
  avg1 - max chain: 6.624, empties: 2067.396, ranges: 5.634, std_devs: 0.921
  avg2 - max chain: 6.109, empties: 2263.485, ranges: 5.119, std_devs: 0.899
  avg3 - max chain: 6.436, empties: 2213.792, ranges: 5.446, std_devs: 0.897

Windows data set:
  avg1 - max chain: 5.653, empties: 1970.208, ranges: 4.663, std_devs: 0.883
  avg2 - max chain: 4.802, empties: 2143.931, ranges: 3.812, std_devs: 0.822
  avg3 - max chain: 5.525, empties: 2136.158, ranges: 4.535, std_devs: 0.862

Note1: avg1 - the original code, table size of 7919
Note2: avg2 - gerard's proposed hash function, table size of 8191
Note3: avg3 - Thomas'es proposed hash function, table size of 8191
Note4: elapsed time: 46.5 sec
Note5: the smaller the numbers the better

Using smaller table size of 4093 for all cases:

macOS data set:
  avg1 - max chain: 6.871, empties: 318.000, ranges: 5.871, std_devs: 1.207
  avg2 - max chain: 5.881, empties: 219.634, ranges: 4.851, std_devs: 1.002
  avg3 - max chain: 6.891, empties: 308.396, ranges: 5.891, std_devs: 1.194

Linux data set:
  avg1 - max chain: 8.980, empties: 310.990, ranges: 7.990, std_devs: 1.420
  avg2 - max chain: 8.545, empties: 326.772, ranges: 7.554, std_devs: 1.401
  avg3 - max chain: 8.970, empties: 311.000, ranges: 7.980, std_devs: 1.420

Windows data set:
  avg1 - max chain: 7.703, empties: 277.386, ranges: 6.713, std_devs: 1.339
  avg2 - max chain: 6.901, empties: 314.089, ranges: 5.911, std_devs: 1.353
  avg3 - max chain: 7.703, empties: 277.366, ranges: 6.713, std_devs: 1.340

Note1: table_size == 4093 (about half of the Marsenne prime of 8191)
Note2: avg1 - the original code
Note3: avg2 - gerard's proposed hash function
Note4: avg3 - Thomas'es proposed hash function
Note5: elapsed time: 25.4 sec
Note6: the smaller the numbers the better

I think we could go smaller as you asked. My proposed hash function still is the best 2 out of 3 for stddev and all max chain and ranges.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14607#issuecomment-1605028705