New benchmark modes

Tue Nov 29 15:28:57 UTC 2016

Please see the attached patch to the benchmark generator, which implements all the features discussed:

* Parametrized target samples per ms (int, defaults to 20, which is the current behaviour). The special value 0 always samples.
* Parametrized target ops per ms (double, defaults to NaN, which is the current behaviour).
* Parametrized response-time measurement, which measures from the intended start time (defaults to false, the current behaviour).

I tried to make it easy for the compiler to eliminate dead code when configuration choices allow it, hence all the branches on constants.

I am not certain how the delay code is best done. Spinning offers the best accuracy but hammers System.nanoTime, while parking is quite noisy because of scheduling errors. In other projects I've used a mixed approach to try and find a compromise.
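
To make the mixed approach concrete, here is a minimal sketch of one way it could look: park while the deadline is far away, then spin for the final stretch. This is not the code in the patch; the class name HybridDelay and the 100 us threshold are assumptions chosen for illustration only.

```java
import java.util.concurrent.locks.LockSupport;

final class HybridDelay {
    // Hypothetical threshold: park only while further than this from the deadline.
    static final long SPIN_THRESHOLD_NS = 100_000L;

    /** Blocks until System.nanoTime() reaches deadlineNs and returns the time observed. */
    static long waitUntil(long deadlineNs) {
        long now;
        // Subtraction-based comparison stays correct if nanoTime wraps around.
        while ((now = System.nanoTime()) - deadlineNs < 0) {
            long remaining = deadlineNs - now;
            if (remaining > SPIN_THRESHOLD_NS) {
                // Coarse phase: park, leaving a margin for scheduler wake-up error.
                LockSupport.parkNanos(remaining - SPIN_THRESHOLD_NS);
            }
            // Fine phase: loop again and spin on nanoTime until the deadline passes.
        }
        return now;
    }
}
```

The threshold trades CPU burn against wake-up jitter and would need tuning per platform; parkNanos may also return early, which the loop tolerates.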
Thanks,
Nitsan

    On Friday, November 18, 2016 6:44 PM, Aleksey Shipilev <shade at redhat.com> wrote:

 Hi,

On 11/16/2016 01:56 PM, Nitsan Wakart wrote:
>> See how "targetSamples" is calculated in the generated stub.
> Yes:
>  int targetSamples = (int) (control.getDuration(TimeUnit.MILLISECONDS)
> * 20); // at max, 20 timestamps per millisecond
> 
> The generated measurement code (commented to see if I understand the
> intention):
>  int rnd = (int)System.nanoTime();
>  int rndMask = this.startRndMask; // startRndMask is initially 0
>  long time = 0;
>  int currentStride = 0;
>  do {
>    rnd = (rnd * 1664525 + 1013904223); // magic random number, I assume
> Shipilev did the maths, but wish he'd leave a comment

See:
https://en.wikipedia.org/wiki/Linear_congruential_generator#Parameters_in_common_use

>    boolean sample = (rnd & rndMask) == 0; // this is initially true
> until we hit "too many samples" below
>    if (sample) {
>        time = System.nanoTime();
>    }
>    for (int b = 0; b < batchSize; b++) {
>        if (control.volatileSpoiler) return;
>        l_jmhsample_02_benchmarkmodes0_0.measureSamples();
>    }
>    if (sample) {
>        buffer.add((System.nanoTime() - time) / opsPerInv);
>        if (currentStride++ > targetSamples) { // too many samples
>            buffer.half(); // half the sample counts, since we'll be
> sampling half as much from now on
>            currentStride = 0;
>            rndMask = (rndMask << 1) + 1; // block another bit
>        }
>    }
>    operations++;
>  } while(!control.isDone);
>  this.startRndMask = Math.max(startRndMask, rndMask);
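
As a side illustration of the masking in the loop quoted above, the following standalone sketch uses the same LCG constants and the same `(rnd & rndMask) == 0` test to count how many iterations get sampled for a given mask. The class and method names are hypothetical and are not part of the generated stub.

```java
final class LcgSampling {
    /** One LCG step with the constants from the quoted stub; int overflow gives mod 2^32. */
    static int next(int rnd) {
        return rnd * 1664525 + 1013904223;
    }

    /** Counts iterations for which the stub's sampling test would be true. */
    static int countSamples(int seed, int rndMask, int iterations) {
        int rnd = seed;
        int samples = 0;
        for (int i = 0; i < iterations; i++) {
            rnd = next(rnd);
            if ((rnd & rndMask) == 0) {
                samples++; // this iteration would be timed
            }
        }
        return samples;
    }
}
```

With a mask of k low bits set, one iteration in 2^k is sampled, so each `(rndMask << 1) + 1` step halves the sampling rate. Because the low k bits of a power-of-two-modulus LCG cycle with period 2^k, the selected iterations are evenly spaced rather than randomly scattered.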

I think we need to build on this code, and this code only. Other
benchmark modes have no business being rate-limited:
average-time/throughput are not supposed to be rate-limited, and
single-shot is meaningless with rate limiting.

If we want to correct for coordinated omission, that means we have to
measure every invocation to maintain schedule, right? If so, we may
ditch half of SampleTime mechanics, and replace it with another half for
measurement.
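
To make the idea concrete, here is a rough sketch of measuring every invocation against a fixed schedule, with response time taken from the intended start rather than the actual start. It is not JMH code: PacedSampler and its parameters are made up for illustration, and it assumes a finite, positive target rate.

```java
final class PacedSampler {
    /**
     * Runs count operations at targetOpsPerMs and returns, for each one, the
     * time from its intended start to its completion, in nanoseconds.
     * Assumes targetOpsPerMs is finite and positive.
     */
    static long[] run(Runnable op, int count, double targetOpsPerMs) {
        long intervalNs = (long) (1_000_000.0 / targetOpsPerMs);
        long[] responseNs = new long[count];
        long intendedStart = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // Wait for the scheduled start; no wait happens if we are already late.
            while (System.nanoTime() - intendedStart < 0) {
                // spin
            }
            op.run();
            // Measured from the intended start, so time spent behind schedule counts.
            responseNs[i] = System.nanoTime() - intendedStart;
            // The schedule advances by a fixed step regardless of how long op took.
            intendedStart += intervalNs;
        }
        return responseNs;
    }
}
```

If one invocation stalls, the following ones start late but are still charged from their scheduled start, which is the correction for coordinated omission.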

> This hits on the question of how to deliver parameters to benchmark
> modes that has not been tackled to date. Which is also, I'm guessing,
> what you mean by this:
>> The only troubling thing here is that you need to pass in the rate
> parameter somehow...
> Is it reasonable to add an annotation parameter here? The command line
> format can follow the same convention as profiler parameterization
> (e.g.:"-bm sample:targetSamplesPerMs=20")

Yes. Let's not concern ourselves with this for the time being, and accept
the rate from a system property. We can figure out how to add this to
the API later.
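
A minimal sketch of what reading the rate from a system property might look like. The property name "jmh.sample.targetOpsPerMs" and the RateConfig class are invented for this example; NaN stands for "no rate limit".

```java
final class RateConfig {
    // Hypothetical property name, not an existing JMH option.
    static final String PROP = "jmh.sample.targetOpsPerMs";

    /** Returns the configured rate, or NaN (no rate limiting) when the property is unset. */
    static double targetOpsPerMs() {
        String raw = System.getProperty(PROP);
        if (raw == null) {
            return Double.NaN;
        }
        return Double.parseDouble(raw.trim());
    }

    /** True only for a usable rate; NaN and non-positive values disable limiting. */
    static boolean isRateLimited(double opsPerMs) {
        return !Double.isNaN(opsPerMs) && opsPerMs > 0;
    }
}
```

Under this assumption the rate would be set on the JVM command line, for example with -Djmh.sample.targetOpsPerMs=2.5.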

Thanks,
-Aleksey