RFR: 8329968: os::random should be random

Tue May 28 18:13:01 UTC 2024

On Tue, 9 Apr 2024 16:55:07 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Hi,
> 
> This change ensures that the random seed of os::random is initialized randomly. Before this patch, every sequence of os::random calls gave the same results. With this patch, a quasi-random seed based on javaTimeNanos is used.
> 
> Where this matters the most:
> 
> - Random distribution of ihashes. Atm, ihashes are always the same for the same sequence of System.identityHashCode.
> 
> - We use os::random to fuzz gtests. This is the part most important to me. I want gtest tests that use os::random for stress testing data structures to be as efficient as possible in tickling out pathological behavior.
> 
> Note that this issue turned out to be much more controversial than I expected. Please see the pro and con arguments in the JBS comment section. I would like a consensus on this issue before pushing the fix (pinging @dholmes). If we cannot agree that this needs fixing, I'd be content with at least the gtests being random. So, a smaller version of this change would be to use a separate random seed for gtests.
> 
> ---
> 
> Tests:
> - tested that the regression test fails with Stock JVM, passes with patched JVM
> - GHAs
> (will do more tests to see if the new randomness shakes loose bugs. However, before investing the work I'd like consensus that this PR can go forward)

Hi,

If we can't guarantee reproducibility because of the non-determinism introduced by multiple threads, then what point is there in keeping the seed the same for each run? It makes sense to me that the function should behave in the way that is least surprising, and that's coming from someone who generally jumps-to-definition a lot (and deeply) to find out what happens in the code. It's certainly surprising to me that we don't seed the RNG with some form of non-constant entropy (even if it's a poor one).

Today we have very strong tools, such as `rr` (deterministic+time travelling debugging), which we can use for reproducing runs. Partial reproducibility based on having the same initial seed seems fragile compared to that.

Finally, if you're trying to reproduce an issue on a JDK and you want to make as much state reproducible as possible, then wouldn't you have to go around to all of our RNGs and figure out what their initial seeds were? Keeping a fixed seed for `os::random` alone doesn't seem likely to be enough.

If `os::random` was added today, never having existed before, then I would think it would be added with some form of non-constant seeding, no?
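
As a rough illustration of the difference between constant and non-constant seeding, here is a minimal standalone sketch. It is not the HotSpot code and not the patch under review: `SketchRng` and `seed_from_clock` are hypothetical names, the generator is a plain Park-Miller style LCG chosen for brevity, and `std::chrono::steady_clock` is only an assumed stand-in for a nanosecond timer such as javaTimeNanos.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in generator (Park-Miller "minimal standard" LCG).
// Illustration only; not taken from the JDK sources.
struct SketchRng {
  uint32_t state;

  explicit SketchRng(uint32_t seed) : state(normalize(seed)) {}

  // The LCG state must lie in [1, 2^31 - 2]; map any input into that range.
  static uint32_t normalize(uint32_t seed) {
    uint32_t s = seed % 2147483647u;
    return s == 0 ? 1u : s;
  }

  // Advance the state: state = state * 16807 mod (2^31 - 1).
  uint32_t next() {
    state = static_cast<uint32_t>(
        (static_cast<uint64_t>(state) * 16807u) % 2147483647u);
    return state;
  }
};

// Assumed stand-in for a nanosecond clock: fold a monotonic timestamp into
// 32 bits. This is a poor entropy source, but it differs from run to run.
uint32_t seed_from_clock() {
  auto now = std::chrono::steady_clock::now().time_since_epoch();
  uint64_t nanos = static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(now).count());
  return static_cast<uint32_t>(nanos ^ (nanos >> 32));
}
```

Constructing `SketchRng` with a constant such as `SketchRng(1234567)` (an arbitrary example value) yields the same sequence on every run, whereas `SketchRng(seed_from_clock())` will usually yield a different one. Logging the chosen seed at startup would still let a failing run be replayed with the constant-seed form.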

All the best,
Johan

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18702#issuecomment-2135843502